Excel - Combining two CSV files into one using a common column
I am working with two datasets in CSV form (the MovieLens latest-small dataset). The fields of both are given below.
rating.csv
user_id movie_id rating
movie.csv
movie_id movie_name
What I want is to combine them into a single .csv with the following fields:
user_id movie_id movie_name rating
So the common column is movie_id, which maps to the corresponding movie_name.
Could this be done using Excel? If not, how can it be done?
I need the dataset for a recommender engine, so any simple solution is welcome; only the end result matters. Since I have experience in Java, a Java solution would be easy for me to understand and implement.
If there is a way to do this using Excel, that would be best. I have tried searching online and found the VLOOKUP method, but I couldn't get it to work.
I also tried online merging tools, but they only attached the sheets one after the other without mapping the column. I have no problem using online tools either.
Check out this tool - https://github.com/datafoxco/gocsv - it's based off of csvkit but has a ton of additional features. One of our engineers custom built it, and we open sourced it, to solve a lot of the data issues we deal with every day :)
It can do the equivalent of a VLOOKUP on a CSV of any size in merely seconds using the join command:
gocsv join --columns 'movie_id','movie_id' --left rating.csv movie.csv > combineddata.csv
Then, if you still want to reorder the columns, you can do that too:
gocsv select --columns 'user_id','movie_id','movie_name','rating' combineddata.csv > combineddata-final.csv
I split the commands out above to explain their use. The documentation has examples, and I'd recommend pipelining and doing it all in one command, like this:
cat rating.csv \
  | gocsv join --left --columns 'movie_id','movie_id' movie.csv \
  | gocsv select --columns 'user_id','movie_id','movie_name','rating' > combineddata.csv
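If installing a separate tool isn't an option, the same left join can be sketched with nothing but Python's standard csv module. This is only a minimal sketch, not part of gocsv: it assumes both files are comma-separated, have a header row, and use exactly the column names from the question (user_id, movie_id, rating, movie_name). The function name left_join_ratings is made up for illustration.

```python
import csv


def left_join_ratings(ratings_path, movies_path, out_path):
    """Write user_id, movie_id, movie_name, rating rows to out_path.

    Assumes both inputs have header rows using the column names from
    the question; adjust the keys if your files are named differently.
    """
    # Build the lookup table: movie_id -> movie_name.
    with open(movies_path, newline="", encoding="utf-8") as f:
        names = {row["movie_id"]: row["movie_name"] for row in csv.DictReader(f)}

    fields = ["user_id", "movie_id", "movie_name", "rating"]
    with open(ratings_path, newline="", encoding="utf-8") as src, \
            open(out_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=fields)
        writer.writeheader()
        for row in csv.DictReader(src):
            writer.writerow({
                "user_id": row["user_id"],
                "movie_id": row["movie_id"],
                # Left join: keep the rating even if the movie is unknown.
                "movie_name": names.get(row["movie_id"], ""),
                "rating": row["rating"],
            })
```

Calling left_join_ratings("rating.csv", "movie.csv", "combineddata.csv") would then produce the combined file with the columns already in the desired order. The movie file is loaded into memory as a dictionary, which is fine for a dataset the size of latest-small.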