I'm learning Python, so apologies for the silly question.
I have two files:
list.csv
john
mary
joanna
lucas
kate
db.csv
john^chief^portland
mary^secretary^ny
joanna^supervisor^washington
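What I want is to compare the two files and output the first column sorted alphabetically, with None added in the second column for any name that is not in db, like this: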
output.csv
joanna^supervisor^washington
john^chief^portland
kate^None
lucas^None
mary^secretary^ny
I started from this code, which I found on SO, and have been struggling with it:
masterlist = list(reader22)
for hosts_row in reader21:
    row = 1
    found = False
    for master_row in masterlist:
        results_row = hosts_row
        if hosts_row[0] == master_row[0]:
            results_row.append('FOUNDTHISLINE in master list (row '
                               + str(row) + ')')
            found = True
            break
        row = row + 1
    if not found:
        results_row.append('THISLINENOTFOUND in master list')
    writer23.writerow(results_row)
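Please help me understand the best way to do this.
Answer 0: (score: 2)
This is a perfect use case for the Pandas library. I know you're just learning, but it is worth a look for data manipulation (please ignore the numbering :))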
In [36]: import pandas as pd
In [37]: list_df = pd.read_csv('list.csv', header=None)
In [38]: db_df = pd.read_csv('db.csv', sep='^', header=None)
In [51]: db_df
Out[51]:
        0           1           2
0    john       chief    portland
1    mary   secretary          ny
2  joanna  supervisor  washington
In [48]: list_df
Out[48]:
        0
0    john
1    mary
2  joanna
3   lucas
4    kate
In [52]: df = list_df.merge(db_df, how='left')
In [53]: df
Out[53]:
        0           1           2
0    john       chief    portland
1    mary   secretary          ny
2  joanna  supervisor  washington
3   lucas         NaN         NaN
4    kate         NaN         NaN
In [54]: df.sort_values(0)
Out[54]:
        0           1           2
2  joanna  supervisor  washington
0    john       chief    portland
4    kate         NaN         NaN
3   lucas         NaN         NaN
1    mary   secretary          ny
From there you can call the df.to_csv function to write the result back out and get the output you're looking for:
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html
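For reference, the session above could be put together as a small script along these lines. This is only a sketch: the column names name, role and city and the helper build_output are made up for illustration, and the two files are replaced by in-memory strings so it runs on its own. It assumes you want the literal text None in the second column, and it leaves the third column empty for missing names.

```python
import io

import pandas as pd

# In-memory stand-ins for list.csv and db.csv (same contents as in the question).
LIST_CSV = "john\nmary\njoanna\nlucas\nkate\n"
DB_CSV = "john^chief^portland\nmary^secretary^ny\njoanna^supervisor^washington\n"


def build_output(list_text, db_text):
    """Left-merge the name list with the db rows and return '^'-separated text."""
    # Column names are hypothetical; the original files have no header row.
    list_df = pd.read_csv(io.StringIO(list_text), header=None, names=["name"])
    db_df = pd.read_csv(io.StringIO(db_text), sep="^", header=None,
                        names=["name", "role", "city"])

    # how="left" keeps every name from the list, even when db has no match.
    merged = list_df.merge(db_df, on="name", how="left").sort_values("name")

    # Put the literal string 'None' in the second column for unmatched names.
    merged["role"] = merged["role"].fillna("None")

    buf = io.StringIO()
    merged.to_csv(buf, sep="^", header=False, index=False)
    return buf.getvalue()


if __name__ == "__main__":
    print(build_output(LIST_CSV, DB_CSV), end="")
```

With real files you would pass the paths straight to pd.read_csv and to_csv instead of the StringIO objects.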
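Answer 1: (score: 2)
Doing what you want is quite simple and efficient using only the csv module and Python's own built-in data structures (such as lists and dictionaries):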
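import csv
# Read the names from list.csv and sort them alphabetically.
with open('list.csv', newline='') as csvfile:
    masterlist = sorted(row[0] for row in csv.reader(csvfile))
# Map each name in db.csv to the remaining fields of its row.
with open('db.csv', newline='') as csvfile:
    db = {row[0]: row[1:] for row in csv.reader(csvfile, delimiter='^')}
# Write the db fields for known names, or 'None' plus an empty field otherwise.
with open('output.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, delimiter='^')
    for name in masterlist:
        writer.writerow([name] + db[name] if name in db else [name, 'None', ''])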
This creates output.csv with the following contents:
joanna^supervisor^washington
john^chief^portland
kate^None^
lucas^None^
mary^secretary^ny
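If you need exactly the output shown in the question (kate^None with no trailing delimiter), one possible variation is to fall back to a one-element list via dict.get. The sketch below is an assumption about what you want rather than a definitive version; the function name merge_names is made up, and it works on in-memory text via io.StringIO so it can be tried without any files.

```python
import csv
import io


def merge_names(list_text, db_text, delimiter="^"):
    """Return the merged rows as text, one name per line, sorted alphabetically."""
    # Sorted names from the list; blank lines are skipped.
    names = sorted(row[0] for row in csv.reader(io.StringIO(list_text)) if row)

    # name -> remaining fields of the matching db row.
    db = {
        row[0]: row[1:]
        for row in csv.reader(io.StringIO(db_text), delimiter=delimiter)
        if row
    }

    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    for name in names:
        # Names missing from db get a single 'None' field and nothing else.
        writer.writerow([name] + db.get(name, ["None"]))
    return out.getvalue()


if __name__ == "__main__":
    list_text = "john\nmary\njoanna\nlucas\nkate\n"
    db_text = ("john^chief^portland\n"
               "mary^secretary^ny\n"
               "joanna^supervisor^washington\n")
    print(merge_names(list_text, db_text), end="")
```

The dictionary lookup is what keeps this efficient: each name is checked once against db instead of scanning every db row for every name, as the nested loops in the question do.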