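I have two CSV files, each with two columns, id and name. I want to compare the two files by their name column; where the values match, I want to create a new CSV file using the id values from both files.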
1.csv:
id, name
1, sofia
2, Maria
3, sofia
4, Laura
2.csv:
id, name
1, sofia
2, Laura
My code:
import csv
with open('1.csv') as companies, open('2.csv') as tags:
    companies = companies.readlines()
    tags = tags.readlines()

with open('CompanieTags.csv', 'w') as outFile:
    for line in companies:
        if line[1] != tags[1]:
            line2 = companies[1]
            outFile.write(line[0] and linea2)
Another attempt, using DictReader:
import csv
with open('1.csv') as companies, open('2.csv') as tags:
    reader = csv.DictReader(companies)
    check = csv.DictReader(tags)

with open('CompanieTags.csv', 'w') as outFile:
    for x in check:
        SaveTag = x['name']
        for y in reader:
            if SaveTag in y['name'] :
                outFile.write(y['id'], x['id'])
Expected result:
id, name
1, 1
3, 1
4, 2
Answer 0 (score: 1)
Here you go.
(I skipped loading the data from the files into lists of tuples; I think you can do that.)
import csv
from itertools import cycle

lst1 = [(1, 'Jack'), (4, 'Ben'), (5, 'Sofi')]
lst2 = [(12, 'Jack'), (4, 'Jack'), (15, 'Jack')]

# names that appear in both lists
names1 = {x[1] for x in lst1}
names2 = {x[1] for x in lst2}
common = names1.intersection(names2)

# ids whose name is common to both lists
common_in_1 = [x[0] for x in lst1 if x[1] in common]
common_in_2 = [x[0] for x in lst2 if x[1] in common]

# pair the ids, cycling the shorter list; build a list so the result can be
# printed and then written (a bare zip would be exhausted by the print)
if len(common_in_1) > len(common_in_2):
    result = list(zip(common_in_1, cycle(common_in_2)))
else:
    result = list(zip(cycle(common_in_1), common_in_2))
print(result)

# write to output file
with open('out.csv', mode='w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(result)
Output
[(1, 12), (1, 4), (1, 15)]
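The loading step skipped above could look roughly like the sketch below. The helper name load_pairs is made up for illustration, and it assumes the files look like the ones in the question: a header row, then "id, name" rows with a space after the comma. The demo reads the contents of 1.csv from an in-memory string so it runs without any files on disk.

```python
import csv
import io

def load_pairs(f):
    """Read (id, name) tuples from an open CSV file, skipping the header row.

    Hypothetical helper, not part of the answer above.
    """
    # skipinitialspace=True drops the blank after each comma ("1, sofia")
    reader = csv.reader(f, skipinitialspace=True)
    next(reader, None)  # skip the "id, name" header, if present
    # ignore blank lines; convert the id column to int
    return [(int(row[0]), row[1]) for row in reader if row]

# Demo with the contents of 1.csv from the question, held in memory.
# With a real file: with open('1.csv', newline='') as f: lst1 = load_pairs(f)
sample = io.StringIO("id, name\n1, sofia\n2, Maria\n3, sofia\n4, Laura\n")
lst1 = load_pairs(sample)
print(lst1)
```

The same call on 2.csv would give lst2 in the shape the answer expects.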
Answer 1 (score: 0)
Another version of the answer:
- does not use itertools
- loads the CSV files
- uses the CSV files from the post
import csv

NAME = 1
ID = 0


def load_csv(file_name):
    res = []
    with open(file_name) as f:
        reader = csv.reader(f)
        for idx, row in enumerate(reader):
            if idx > 0:
                res.append(row)
    return res


lst1 = load_csv('1.csv')
lst2 = load_csv('2.csv')

result = []
for x in lst1:
    for y in lst2:
        if x[NAME] == y[NAME]:
            result.append((x[ID], y[ID]))
print(result)
Output
[('1', '1'), ('3', '1'), ('4', '2')]
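If the files get large, the nested loop can be swapped for a dictionary lookup, and the pairs can be written out as CSV, which is what the question asked for. This is only a sketch under some assumptions: match_ids is a made-up name, the output header id1,id2 is my choice (the question's expected output does not settle the column names), and the demo uses in-memory strings in place of 1.csv and 2.csv.

```python
import csv
import io
from collections import defaultdict

def match_ids(file1, file2):
    """Return (id1, id2) pairs for rows whose names match, in file1 order.

    Hypothetical helper; both arguments are open CSV files with id and name columns.
    """
    # index the second file: name -> list of ids carrying that name
    ids_by_name = defaultdict(list)
    for row in csv.DictReader(file2, skipinitialspace=True):
        ids_by_name[row['name']].append(row['id'])

    # look up every row of the first file by name
    pairs = []
    for row in csv.DictReader(file1, skipinitialspace=True):
        for other_id in ids_by_name.get(row['name'], []):
            pairs.append((row['id'], other_id))
    return pairs

# Demo with the data from the question, held in memory instead of on disk
csv1 = io.StringIO("id, name\n1, sofia\n2, Maria\n3, sofia\n4, Laura\n")
csv2 = io.StringIO("id, name\n1, sofia\n2, Laura\n")
pairs = match_ids(csv1, csv2)

# Write the pairs as CSV; with a real file use open('out.csv', 'w', newline='')
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(['id1', 'id2'])  # assumed header names
writer.writerows(pairs)
print(out.getvalue())
```

Each file is read once, so the work grows with the number of rows plus the number of matches rather than with the product of the two file sizes.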