I am using Ruby 1.8 and FasterCSV.
The CSV file I am reading has several duplicated columns.
| acct_id | amount | acct_num | color | acct_id | acct_type | acct_num |
| 345 | 12.34 | 123 | red | 345 | 'savings' | 123 |
| 678 | 11.34 | 432 | green | 678 | 'savings' | 432 |
...etc.
I would like to condense it to:
| acct_id | amount | acct_num | color | acct_type |
| 345 | 12.34 | 123 | red | 'savings' |
| 678 | 11.34 | 432 | green | 'savings' |
Is there a general way to do this?
My current solution looks like this:
headers = CSV.read_line(file)
CSV.read_line(file) # skip the garbage line between headers and data (result discarded so headers is not overwritten)
FasterCSV.filter(file, :headers => headers) do |row|
  row.delete(6) # delete second acct_num field
  row.delete(4) # delete second acct_id field
  # additional processing on the data
  row['color'] = color_to_number(row['color'])
  row['acct_type'] = acct_type_to_number(row['acct_type'])
end
Answer 0 (score: 1)
Assuming you want to get rid of the hardcoded deletions,
row.delete(6) #delete second acct_num field
row.delete(4) #delete second acct_id field
they can be replaced with
row = row.to_hash
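This clobbers the duplicate fields, because a hash can hold only one value per header name. Be aware that to_hash also discards field order. The rest of the posted code will continue to work.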
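If column order matters, or if the rows have to stay row objects, the same idea can be done without a hash. The sketch below is not from the original post: it uses plain arrays, and the helper names first_occurrence_indexes and pick are made up for illustration. It finds the position of the first occurrence of each header name and keeps only those positions in every row. One more thing to check: as far as I can tell, reassigning row inside the FasterCSV.filter block only rebinds the block's local variable, so the row that filter writes out may still contain the duplicates.

```ruby
# Hypothetical helper: positions of the first occurrence of each header name.
def first_occurrence_indexes(headers)
  seen = {}
  keep = []
  headers.each_with_index do |name, i|
    next if seen[name]   # a later duplicate of a name already seen
    seen[name] = true
    keep << i
  end
  keep
end

# Hypothetical helper: pick only the kept positions out of an array of fields.
def pick(fields, keep)
  keep.map { |i| fields[i] }
end

headers = %w[acct_id amount acct_num color acct_id acct_type acct_num]
keep = first_occurrence_indexes(headers)   # => [0, 1, 2, 3, 5]

deduped_headers = pick(headers, keep)
# => ["acct_id", "amount", "acct_num", "color", "acct_type"]

deduped_row = pick(%w[345 12.34 123 red 345 savings 123], keep)
# => ["345", "12.34", "123", "red", "savings"]
```

The index list only has to be computed once from the header line. After that, each data row can be trimmed with pick, which keeps the first copy of every duplicated column and preserves the original column order.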