Merge two tables (CSV) if (table1 column A == table2 column A)

Time: 2014-02-14 00:43:44

Tags: python excel csv formatting

I have two CSVs, which I can open in Numbers or Excel, structured as:
| word | num1 |
and
| word | num2 |

If the two words are the same (say both are 'hi' and 'hi'), I'd like the result to be:
| word | num1 | num2 |

Here are some pictures:

table1 table2

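So for row 1, since both words are the same ('TRUE'), I'd like it to become something like | TRUE | 5.371748 | 4.48957 |

Either through some small script, or through some feature/function that I'm overlooking. Thanks!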

3 Answers:

Answer 0: (score: 4)

For CSV, I always reach for the data analysis library pandas: http://pandas.pydata.org/

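import pandas as pd

# names=... assumes the files have no header row
df1 = pd.read_csv('file1.csv', names=['word', 'num1'])
df2 = pd.read_csv('file2.csv', names=['word', 'num2'])
# Inner join by default: keeps only the words present in both files
df3 = pd.merge(df1, df2, on='word')
# index=False leaves out the row-number column
df3.to_csv('merged_data.csv', index=False)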
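To make the join behaviour concrete, here is a minimal sketch that runs the same merge on in-memory data. The sample words and numbers are invented for illustration, and `io.StringIO` stands in for the two files. It also shows `how='outer'`, which is an assumption about what you might want if unmatched words should be kept.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for file1.csv and file2.csv (no header row).
csv1 = io.StringIO("hi,5.371748\napple,1.5\npear,2.0\n")
csv2 = io.StringIO("hi,4.48957\napple,2.5\nplum,3.0\n")

df1 = pd.read_csv(csv1, names=['word', 'num1'])
df2 = pd.read_csv(csv2, names=['word', 'num2'])

# Default how='inner': only words found in both tables survive.
inner = pd.merge(df1, df2, on='word')

# how='outer': every word is kept; the missing number becomes NaN.
outer = pd.merge(df1, df2, on='word', how='outer')

print(inner)
print(outer)
```

With the inner join, 'pear' and 'plum' are dropped because each appears in only one table; with the outer join they stay, with an empty cell on the side that has no match.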

Answer 1: (score: 1)

Use a dict:

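import csv

# Python 3: open csv files in text mode with newline=''
with open('file1.csv', newline='') as file_a, open('file2.csv', newline='') as file_b:
    data_a = csv.reader(file_a)
    data_b = dict(csv.reader(file_b))  # <-- dict mapping word -> num2
    with open('out.csv', 'w', newline='') as file_out:
        csv_out = csv.writer(file_out)
        for word, num_a in data_a:
            # Words missing from file2 get an empty num2 column
            csv_out.writerow([word, num_a, data_b.get(word, '')])  # <-- edit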

(Untested)
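Since the snippet is untested, here is a small self-contained sketch of the same dict-based lookup that can be run without any files. The function name `merge_by_word`, its `keep_unmatched` flag, and the sample rows are hypothetical, added only for illustration; it assumes every row has exactly two columns.

```python
import csv
import io


def merge_by_word(rows_a, rows_b, keep_unmatched=True):
    """Join two iterables of [word, number] rows on the word column."""
    # word -> num2; if a word repeats in rows_b, the last value wins.
    lookup = dict(rows_b)
    merged = []
    for word, num_a in rows_a:
        if word in lookup:
            merged.append([word, num_a, lookup[word]])
        elif keep_unmatched:
            # No partner in the second table: leave num2 empty.
            merged.append([word, num_a, ''])
    return merged


# Hypothetical in-memory stand-ins for file1.csv and file2.csv.
file_a = io.StringIO("TRUE,5.371748\nhi,1.5\nonly_a,7\n")
file_b = io.StringIO("hi,2.5\nTRUE,4.48957\n")

result = merge_by_word(csv.reader(file_a), csv.reader(file_b), keep_unmatched=False)
for row in result:
    print(row)
```

Because the lookup goes through a dict, the rows of the two tables do not need to be in the same order. Passing `keep_unmatched=False` matches the question more literally, since only words present in both tables are written.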

Answer 2: (score: 0)

I think what you're looking for is zip, which lets you iterate over the two CSVs in lockstep:

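import csv

# Python 3: open csv files in text mode with newline=''
with open('file1.csv', newline='') as f1, open('file2.csv', newline='') as f2:
    r1, r2 = csv.reader(f1), csv.reader(f2)
    with open('out.csv', 'w', newline='') as fout:
        w = csv.writer(fout)
        # zip pairs row i of file1 with row i of file2
        for row1, row2 in zip(r1, r2):
            if row1[0] == row2[0]:
                w.writerow([row1[0], row1[1], row2[1]])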

I'm not sure what you want to happen if they're not equal. Maybe insert both rows, like this?

            else:
                w.writerow([row1[0], row1[1], ''])
                w.writerow([row2[0], '', row2[1]])
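
One caveat with the lockstep approach: zip compares rows by position only and stops at the end of the shorter file. The sketch below puts the if/else together in one runnable function and swaps in `itertools.zip_longest` so that trailing rows of the longer table are not silently dropped. That swap, the function name `merge_lockstep`, and the sample data are my own assumptions, not part of the answer above.

```python
import csv
import io
from itertools import zip_longest


def merge_lockstep(rows_a, rows_b):
    """Pair rows by position; merge when the words match, else keep both rows."""
    merged = []
    # zip_longest pads the shorter input with None instead of stopping early.
    for row_a, row_b in zip_longest(rows_a, rows_b):
        if row_a is not None and row_b is not None and row_a[0] == row_b[0]:
            merged.append([row_a[0], row_a[1], row_b[1]])
        else:
            if row_a is not None:
                merged.append([row_a[0], row_a[1], ''])
            if row_b is not None:
                merged.append([row_b[0], '', row_b[1]])
    return merged


# Hypothetical in-memory stand-ins for file1.csv and file2.csv.
reader_a = csv.reader(io.StringIO("TRUE,5.371748\nhi,1.5\nextra,9\n"))
reader_b = csv.reader(io.StringIO("TRUE,4.48957\nho,2.5\n"))

rows = merge_lockstep(reader_a, reader_b)
for row in rows:
    print(row)
```

This only gives sensible results when both files list their words in the same order; if the order can differ, a keyed lookup (dict or pandas merge) is the safer choice.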