ruby - How do I remove duplicate rows in my CSV? -

- August 15, 2013

I have a CSV that has data like this:

a.a.b. direct   http://www.aabdirect.com    348 willis ave  mineola ny  11501   (800) 382-1002  no email
abeam consulting inc    http://abeam.com    245 park ave    new york    ny  10167   (212) 372-8783  no email
abeam consulting inc    http://abeam.com    245 park ave    new york    ny  10167   (212) 372-8783  no email
alvarez & marsal    http://www.alvarezandmarsal.com 600 madison ave new york    ny  10022   (212) 759-4433  no email
alvarez & marsal    http://www.alvarezandmarsal.com 600 lexington ave ste 6 new york    ny  10022   (212) 759-4433  no email

Sometimes all the columns in both rows match (like abeam consulting inc), but that's not always the case. Sometimes only the websites match, or the phone number, or the name.

The key thing is the website. If two rows have the same website, I want to keep only one of them.

How do I de-dupe this list in a non-N+1 way?

Preferably with a native Ruby method like .uniq or something of the sort.

Just read the strings (which I've simplified to avoid the need for horizontal scrolling) into an array:

arr = [
  "a.a.b. direct   http://www.aabdirect.com    (800) 382-1002",
  "abeam consulting inc    http://abeam.com    (212) 372-8783",
  "abeam consulting inc    http://abeam.com    (212) 372-8783",
  "alvarez & marsal    http://www.alvarezandmarsal.com (212) 759-4433",
  "alvarez & marsal    http://www.alvarezandmarsal.com 10022   (212) 759-4433"
]

and, as you suggest, use Array#uniq with a block:

# The block returns the website substring, which uniq uses as the comparison key.
arr.uniq { |line| line[/\shttp:\S+/] }
  #=> ["a.a.b. direct   http://www.aabdirect.com    (800) 382-1002",
  #    "abeam consulting inc    http://abeam.com    (212) 372-8783",
  #    "alvarez & marsal    http://www.alvarezandmarsal.com (212) 759-4433"]

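See Array#uniq. The regex /\shttp:\S+/ reads: match a whitespace character, followed by the string "http:", followed by one or more non-whitespace characters (greedily). Because uniq keeps the first element seen for each key, the first row for each website is the one that survives.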
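If the rows really are delimited, the same idea can be applied to the website field directly instead of a regex match on the whole line. This is only a sketch under assumptions that are not confirmed by the question: the fields are taken to be tab-separated with no quoting, the website is taken to be the second field, and the names rows, website_column and deduped are made up for illustration.

```ruby
# Hypothetical sample rows; fields are ASSUMED to be tab-separated
# in the order: name, website, phone.
rows = [
  "a.a.b. direct\thttp://www.aabdirect.com\t(800) 382-1002",
  "abeam consulting inc\thttp://abeam.com\t(212) 372-8783",
  "abeam consulting inc\thttp://abeam.com\t(212) 372-8783",
  "alvarez & marsal\thttp://www.alvarezandmarsal.com\t(212) 759-4433",
  "alvarez & marsal\thttp://www.alvarezandmarsal.com\t10022\t(212) 759-4433"
]

# Assumption: the website is the second field (index 1).
website_column = 1

# Array#uniq with a block compares rows by the block's return value
# and keeps the first row encountered for each distinct value.
deduped = rows.uniq { |row| row.split("\t")[website_column] }

puts deduped
```

Keying on the field rather than on a regex match avoids depending on the URL starting with "http:", but it does depend on the column position being the same in every row.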
