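I am trying to create a program that compares two CSV files and writes the result to a new CSV file. The cells in the CSV files contain both text values and integer values. If a cell has changed and its value is text, the program should append True to that value in the new CSV file. If a cell has changed and its value is an integer, it should append the text "Result is positive: changed value" or "Result is negative: changed value".
Here is the code: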
import csv
with open('book1.csv', 'r') as t1:
    old_csv = t1.readlines()
with open('book2.csv', 'r') as t2:
    new_csv = t2.readlines()
with open('update.csv', 'w') as out_file:
    line_in_new = 0
    line_in_old = 0
    while line_in_new < len(new_csv) and line_in_old < len(old_csv):
        if old_csv[line_in_old] != new_csv[line_in_new]:
            out_file.write(new_csv[line_in_new])
        else:
            line_in_old += 1
        line_in_new += 1
Please advise.
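For reference, the cell-by-cell behaviour described above could be sketched roughly as follows. This is only a sketch under some assumptions: both files have the same rows in the same order and the same number of columns, "positive" is taken to mean the integer went up and "negative" that it went down (or otherwise changed), and the helper names `is_int`, `compare_cell` and `compare_rows` are made up for illustration. The demo uses small in-memory CSV data instead of the real files.

```python
import csv
import io

def is_int(text):
    """Return True if the cell text parses as an integer."""
    try:
        int(text.strip())
        return True
    except ValueError:
        return False

def compare_cell(old, new):
    """Build the output cell for one old/new pair of cell values."""
    if old == new:
        return new  # unchanged cells are copied as they are
    if is_int(old) and is_int(new):
        # Assumption: "positive" = value increased, "negative" = value decreased.
        label = "positive" if int(new) > int(old) else "negative"
        return f"{new} Result is {label}: changed value"
    # A changed text cell gets True appended to the new value.
    return f"{new} True"

def compare_rows(old_rows, new_rows):
    """Compare rows pairwise (assumes identical row order and column count)."""
    return [
        [compare_cell(o, n) for o, n in zip(old_row, new_row)]
        for old_row, new_row in zip(old_rows, new_rows)
    ]

# Small in-memory demo; the column names are taken from the question.
old_rows = list(csv.reader(io.StringIO("XID,TCO,Banks\n1,100,ABC\n2,200,XYZ\n")))
new_rows = list(csv.reader(io.StringIO("XID,TCO,Banks\n1,150,ABC\n2,200,QRS\n")))
result = compare_rows(old_rows, new_rows)
for row in result:
    print(row)
```

To write the result to update.csv, the rows could be passed to `csv.writer(out_file).writerows(result)` with the file opened using `newline=''`.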
EDITED
Hi, I also tried a different approach, but I get KeyError: "['XID'] not in index"
Please see my other code for the same task:
import pandas as pd
file1 = 'Book1.csv'
file2 = 'Book2.csv'
file3 = 'update.csv'
cols_to_show = ['XID', 'TCO', 'Payment Plan','Livable Area','Brochure', 'Banks']
old = pd.read_csv(file1)
new = pd.read_csv(file2)
def report_diff(x):
    return x[0] if x[1] == x[0] else '{0} --> {1}'.format(*x)
old['version'] = 'old'
new['version'] = 'new'
full_set = pd.concat([old, new], ignore_index=True)
changes = full_set.drop_duplicates(subset=cols_to_show, keep='last')
dupe_names = changes.set_index('XID').index.get_duplicates()
dupes = changes[changes['XID'].isin(dupe_names)]
change_new = dupes[(dupes['version'] == 'new')]
change_old = dupes[(dupes['version'] == 'old')]
change_new = change_new.drop(['version'], axis=1)
change_old = change_old.drop(['version'], axis=1)
change_new.set_index('XID', inplace=True)
change_old.set_index('XID', inplace=True)
diff_panel = pd.Panel(dict(df1=change_old, df2=change_new))
diff_output = diff_panel.apply(report_diff, axis=0)
changes['duplicate'] = changes['XID'].isin(dupe_names)
removed_names = changes[(changes['duplicate'] == False) & (changes['version'] == 'old')]
removed_names.set_index('XID', inplace=True)
new_name_set = full_set.drop_duplicates(subset=cols_to_show)
new_name_set['duplicate'] = new_name_set['XID'].isin(dupe_names)
added_names = new_name_set[(new_name_set['duplicate'] == False) & (new_name_set['version'] == 'new')]
added_names.set_index('XID', inplace=True)
print(added_names)
df = pd.concat([diff_output, removed_names, added_names], keys=('changed', 'removed', 'added'))
print(df)
df[cols_to_show].to_csv(file3)
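One possible cause of the KeyError, as far as I can tell: after `set_index('XID')`, XID is no longer a regular column, so selecting it by name through a list such as `cols_to_show` fails. The minimal sketch below reproduces this on a tiny made-up frame (the data and the variable names `indexed`, `restored` and `selected` are only for illustration) and shows that `reset_index()` turns the index back into a column.

```python
import pandas as pd

# Tiny made-up frame using two of the column names from the question.
df = pd.DataFrame({"XID": [1, 2], "TCO": [100, 200]})

# After set_index, 'XID' lives in the index and is not a column any more.
indexed = df.set_index("XID")

try:
    indexed[["XID", "TCO"]]  # same kind of selection as df[cols_to_show]
    raised = False
except KeyError as err:
    raised = True
    print("KeyError:", err)

# reset_index moves 'XID' back into the columns, so the selection works.
restored = indexed.reset_index()
selected = restored[["XID", "TCO"]]
print(selected)
```

If this is what happens here, calling `reset_index()` on the concatenated frame before selecting `cols_to_show`, or leaving 'XID' out of the selected columns since it is already the index, might avoid the error.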