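I am using Python 3.6 to compare two CSV files and check whether the IP address in the first column of file1.csv appears in any row of file2.csv. If the address is in file2, I need to copy that row's second-column value into a new file that is otherwise identical to file1. The two files look like this:
File 1: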
XX.XXX.XXX.1,Test1
XX.XXX.XXX.2,Test2
XX.XXX.XXX.3,Test3
XX.XXX.XXX.4,Test4
XX.XXX.XXX.5,Test5
XX.XXX.XXX.6,Test6
XX.XXX.XXX.7,Test7
XX.XXX.XXX.8,Test8
and so on
File 2:
XX.XXX.XXX.6, Name6
XX.XXX.XXX.7, Name7
XX.XXX.XXX.8, Name8
I need the result.csv file to look like this:
XX.XXX.XXX.1,Test1, Not found
XX.XXX.XXX.2,Test2, Not found
XX.XXX.XXX.3,Test3, Not found
XX.XXX.XXX.4,Test4, Not found
XX.XXX.XXX.5,Test5, Not found
XX.XXX.XXX.6,Test6,Name6
XX.XXX.XXX.7,Test7,Name7
XX.XXX.XXX.8,Test8,Name8
The code I have so far is:
import csv
f1 = open('file1.csv', 'r')
f2 = open('file2.csv', 'r')
f3 = open('results.csv', 'w')
c1 = csv.reader(f1)
c2 = csv.reader(f2)
c3 = csv.writer(f3)
file2 = list(c2)
for file1_row in c1:
    row = 1
    found = False
    for file2_row in file2:
        results_row = file1_row
        x = file2_row[3]
        if file1_row[1] == file2_row[1]:
            results_row.append('Found. Name: ' + x)
            found = True
            break
        row += 1
    if not found:
        results_row.append('Not found in File1')
    c3.writerow(results_row)
f1.close()
f2.close()
f3.close()
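Right now this code compares whole rows rather than values. It never matches anything, because it requires both the IP column and the adjacent column to be identical in the two files. It also compares row 1 with row 1, row 2 with row 2, and so on, whereas I need it to search one file for a match in the other instead of comparing rows by index.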
Answer 0 (score: 1)
A pandas solution:
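import pandas as pd
# Neither file has a header row, so name the columns explicitly
df1 = pd.read_csv('file1.csv', names=['a', 'b'])
df2 = pd.read_csv('file2.csv', names=['a', 'b'])
# how='left' keeps every row of file1 in its original order and
# ignores rows of file2 whose address does not appear in file1
merged = pd.merge(df1, df2, on='a', how='left')
merged.to_csv('results.csv', header=False, index=False, na_rep='Not found')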
Contents of results.csv:
XX.XXX.XXX.1,Test1,Not found
XX.XXX.XXX.2,Test2,Not found
XX.XXX.XXX.3,Test3,Not found
XX.XXX.XXX.4,Test4,Not found
XX.XXX.XXX.5,Test5,Not found
XX.XXX.XXX.6,Test6, Name6
XX.XXX.XXX.7,Test7, Name7
XX.XXX.XXX.8,Test8, Name8
Answer 1 (score: 0)
I moved the results_row assignment out of the inner loop and changed the indentation of the lines after row += 1:
import csv
f1 = open('file1.csv', 'r')
f2 = open('file2.csv', 'r')
f3 = open('results.csv', 'w')
c1 = csv.reader(f1)
c2 = csv.reader(f2)
c3 = csv.writer(f3)
file2 = list(c2)
for file1_row in c1:
    row = 1
    found = False
    results_row = file1_row  # Moved out of the nested loop
    for file2_row in file2:
        x = file2_row[1]
        # Compare the IP columns (index 0) of the two files
        if file1_row[0] == file2_row[0]:
            results_row.append(x)
            found = True
            break
        row += 1
    # Runs once per file1 row, after every file2 row has been checked
    if not found:
        results_row.append('Not found')
    c3.writerow(results_row)
f1.close()
f2.close()
f3.close()
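The found flag in the loop above can also be expressed with Python's for/else construct. The following is only a sketch under a few assumptions: the function name annotate_rows and its missing parameter are made up for illustration, and the rows are passed in as plain lists rather than read from the CSV files.

```python
def annotate_rows(file1_rows, file2_rows, missing='Not found'):
    """Return a copy of file1_rows with the matching file2 value appended."""
    results = []
    for file1_row in file1_rows:
        results_row = list(file1_row)  # copy so the input row is not mutated
        for file2_row in file2_rows:
            if file1_row[0] == file2_row[0]:
                results_row.append(file2_row[1].strip())
                break
        else:
            # The else branch runs only if the inner loop ended without break
            results_row.append(missing)
        results.append(results_row)
    return results


file1_rows = [['XX.XXX.XXX.5', 'Test5'], ['XX.XXX.XXX.6', 'Test6']]
file2_rows = [['XX.XXX.XXX.6', ' Name6']]
result = annotate_rows(file1_rows, file2_rows)
```

The rows returned here could then be passed to csv.writer(...).writerows(result).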
Answer 2 (score: 0)
You can try the following solution:
with open('result.csv', 'w') as out:
    with open('file1.csv', 'r') as f1, open('file2.csv', 'r') as f2:
        # Drop empty lines (a bare newline has length 1)
        f2_lines = [line for line in f2.readlines() if len(line) > 1]
        f1_lines = [line for line in f1.readlines() if len(line) > 1]
        for line in f1_lines:
            val = 'Not found'
            # Compare the IP fields exactly; a substring test would also
            # match longer addresses that merely start with the same text
            ip = line.split(',')[0].strip()
            b = [ip == item.split(',')[0].strip() for item in f2_lines]
            if any(b):
                val = f2_lines[b.index(True)].split(',')[1].strip()
            out.write('{}, {}\n'.format(line.strip(), val))
Output:
XX.XXX.XXX.1,Test1, Not found
XX.XXX.XXX.2,Test2, Not found
XX.XXX.XXX.3,Test3, Not found
XX.XXX.XXX.4,Test4, Not found
XX.XXX.XXX.5,Test5, Not found
XX.XXX.XXX.6,Test6, Name6
XX.XXX.XXX.7,Test7, Name7
XX.XXX.XXX.8,Test8, Name8
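One thing to watch with line-based matching like this: a substring test such as `ip in line` also matches longer addresses that begin with the same characters, so comparing the first field exactly is safer. The sketch below illustrates the difference; the helper first_field and the sample address ending in .10 are assumptions added for illustration, not part of the original data.

```python
def first_field(line):
    """Return the first comma-separated field of a line, without whitespace."""
    return line.split(',')[0].strip()


# Hypothetical file2 content with an address that merely starts with the target
f2_lines = ['XX.XXX.XXX.10, Name10\n']
ip = 'XX.XXX.XXX.1'

# Substring test: 'XX.XXX.XXX.1' is contained in 'XX.XXX.XXX.10, Name10'
substring_hit = any(ip in item for item in f2_lines)

# Exact test: compares only the IP field, so .1 does not match .10
exact_hit = any(ip == first_field(item) for item in f2_lines)
```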
Answer 3 (score: 0)
Here is a non-pandas solution (assuming you are using Python 3.x):
import csv
present = {}
with open('file2.csv', 'r', newline='') as file2:
    reader = csv.reader(file2, skipinitialspace=True)
    for ip, name in reader:
        present[ip] = name
with open('file1.csv', 'r', newline='') as file1, \
     open('results.csv', 'w', newline='') as results:
    reader = csv.reader(file1, skipinitialspace=True)
    writer = csv.writer(results)
    for ip, name in reader:
        writer.writerow([ip, name, present.get(ip, ' Not found')])
Contents of results.csv:
XX.XXX.XXX.1,Test1, Not found
XX.XXX.XXX.2,Test2, Not found
XX.XXX.XXX.3,Test3, Not found
XX.XXX.XXX.4,Test4, Not found
XX.XXX.XXX.5,Test5, Not found
XX.XXX.XXX.6,Test6,Name6
XX.XXX.XXX.7,Test7,Name7
XX.XXX.XXX.8,Test8,Name8
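The tuple unpacking `for ip, name in reader` raises a ValueError if a file contains a blank line or a row with a different number of columns. If the real files might not be perfectly clean, a more defensive variant of the same dictionary lookup could look like the sketch below. The function name annotate and its missing parameter are hypothetical, and io.StringIO objects stand in for the opened files so the example runs without touching the disk.

```python
import csv
import io


def annotate(file1, file2, missing=' Not found'):
    """Return file1's rows, each extended with the matching name from file2."""
    present = {}
    for row in csv.reader(file2, skipinitialspace=True):
        if len(row) >= 2:  # skip blank or short rows
            present[row[0]] = row[1]
    result = []
    for row in csv.reader(file1, skipinitialspace=True):
        if not row:  # csv.reader yields an empty list for a blank line
            continue
        result.append(row + [present.get(row[0], missing)])
    return result


# In-memory stand-ins for file1.csv and file2.csv (note the blank line)
file1 = io.StringIO('XX.XXX.XXX.5,Test5\n\nXX.XXX.XXX.6,Test6\n')
file2 = io.StringIO('XX.XXX.XXX.6, Name6\n')
rows = annotate(file1, file2)
```

Because the lookup table is a dict, each address from file1 is checked in roughly constant time instead of scanning every row of file2.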