I am trying to print out the differences by comparing a column between two CSV files.
CSV1:
SERVER, FQDN, IP_ADDRESS,
serverA, device1.com, 10.10.10.1
serverA,device2.com,10.11.11.1
serverC,device3.com,10.12.12.1
and so on..
CSV2:
FQDN, IP_ADDRESS, SERVER, LOCATION
device3.com,10.12.12.1,serverC,xx
device679.com,20.3.67.1,serverA,we
device1.com,10.10.10.1,serverA,ac
device345.com,192.168.2.0,serverA,ad
device2.com,192.168.6.0,serverB,af
and so on...
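What I want to do is compare the FQDN column and write the differences to a new CSV output file. So my output should look something like this: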
Output.csv:
FQDN, IP_ADDRESS, SERVER, LOCATION
device679.com,20.3.67.1,serverA,we
device345.com,192.168.2.0,serverA,ad
and so on..
I have tried, but I am not able to get the output.
Here is my code; please tell me where I am going wrong:
import csv
data = {}  # creating list to store the data
with open('CSV1.csv', 'r') as lookuplist:
    reader1 = csv.reader(lookuplist)
    for col in reader1:
        data[col[0]] = col[1]
with open('CSV2.csv', 'r') as csvinput, open('Output.csv', 'w', newline='') as f_output:
    reader2 = csv.reader(csvinput)
    csv_output = csv.writer(f_output)
    fieldnames = (['FQDN', 'IP_ADDRESS', 'SERVER'])
    csv_output.writerow(fieldnames)  # prints header to the output file
    for col in reader1:
        if col[1] not in reader2:
            csv_output.writerow(col)
(Edit) Here is another approach I have tried:
import csv
f1 = (open("CSV1.csv"))
f2 = (open("CSV2.csv"))
csv_f1 = csv.reader(f1)
csv_f2 = csv.reader(f2)
for col1, col2 in zip(csv_f1, csv_f2):
    if col2[0] not in col1[1]:
        print(col2[0])
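Basically, here I am only trying to check first whether the unmatched FQDNs get printed. Instead, it prints out the whole CSV1 column. Please help; a lot of research has gone into this, but I have had no luck! :(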
Answer 0 (score: 0)
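This code uses the built-in difflib to output the lines of file1.csv that do not appear in file2.csv, and vice versa.
I use the Differ object to identify the changed lines.
I assume you would not regard swapped lines as a difference, which is why I added the sorted() function call.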
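from difflib import Differ
# Sort both files so that line order alone is not reported as a difference
with open("file1.csv", 'r') as f1, open("file2.csv", 'r') as f2:
    csv_file1 = sorted(f1.readlines())
    csv_file2 = sorted(f2.readlines())
with open("diff.csv", 'w') as f:
    for line in Differ().compare(csv_file1, csv_file2):
        dmode, line = line[:2], line[2:]
        # Skip lines common to both files ("  ") and Differ's "? " hint lines
        if dmode.strip() in ("", "?"):
            continue
        # Lines from readlines() usually end with "\n" already; normalize to exactly one
        f.write(line.rstrip("\n") + "\n")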
Note that if a line differs anywhere (not only in the FQDN column), it will show up in diff.csv.
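To make that caveat concrete, here is a minimal sketch of how Differ marks lines. The helper name changed_lines and the two sample lists are hypothetical, chosen only for illustration; the "- ", "+ ", "  " and "? " prefixes are the standard two-character codes that Differ.compare() puts in front of each line.

```python
from difflib import Differ


def changed_lines(old_lines, new_lines):
    """Return (sign, text) pairs for lines that occur in only one of the inputs."""
    result = []
    for entry in Differ().compare(sorted(old_lines), sorted(new_lines)):
        code, text = entry[:2], entry[2:]
        # "  " marks a line common to both inputs and "? " is an intraline hint;
        # only "- " (first input only) and "+ " (second input only) are kept.
        if code in ("- ", "+ "):
            result.append((code[0], text.rstrip("\n")))
    return result


# Same FQDN in both inputs, but a different IP address in the second one.
old = ["device1.com,10.10.10.1\n", "device2.com,10.11.11.1\n"]
new = ["device1.com,10.10.10.1\n", "device2.com,192.168.6.0\n"]
print(changed_lines(old, new))
```

Both versions of the device2.com line are reported even though the FQDN itself is unchanged, so a whole-line diff is not quite the same thing as comparing the FQDN column alone.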
Answer 1 (score: 0)
import csv
with open('CSV1.csv', 'r') as lookuplist, open('CSV2.csv', 'r') as csvinput, open('Output.csv', 'w', newline='') as f_output:
    reader1 = csv.reader(lookuplist)
    reader2 = csv.reader(csvinput)
    csv_output = csv.writer(f_output)
    fieldnames = ['FQDN', 'IP_ADDRESS', 'SERVER', 'LOCATION']
    csv_output.writerow(fieldnames)  # writes the header to the output file
    # Collect every FQDN (column 1) from CSV1, skipping its header row
    _tempFqdn = []
    for i, dt in enumerate(reader1):
        if i == 0:
            continue
        _tempFqdn.append(dt[1].strip())
    # Write each CSV2 row whose FQDN (column 0) is not present in CSV1
    for i, col in enumerate(reader2):
        if i == 0:
            continue
        if col[0].strip() not in _tempFqdn:
            csv_output.writerow(col)
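The same idea can also be sketched with csv.DictReader, which looks columns up by header name instead of position, and a set for the membership test. This is only an illustration under a few assumptions: the function name rows_missing_from is hypothetical, the in-memory strings stand in for CSV1.csv and CSV2.csv using the sample rows from the question, and skipinitialspace=True is used to absorb the spaces after the commas in those samples.

```python
import csv
import io


def rows_missing_from(lookup_file, candidate_file, key="FQDN"):
    """Return (fieldnames, rows) for candidate rows whose key value is absent from lookup."""
    lookup = csv.DictReader(lookup_file, skipinitialspace=True)
    # A set gives fast membership tests; short or empty rows contribute "".
    known = {(row.get(key) or "").strip() for row in lookup}
    candidates = csv.DictReader(candidate_file, skipinitialspace=True)
    missing = [row for row in candidates
               if (row.get(key) or "").strip() not in known]
    return candidates.fieldnames, missing


# In-memory stand-ins for CSV1.csv and CSV2.csv (sample rows from the question).
csv1 = io.StringIO(
    "SERVER, FQDN, IP_ADDRESS,\n"
    "serverA, device1.com, 10.10.10.1\n"
    "serverA,device2.com,10.11.11.1\n"
    "serverC,device3.com,10.12.12.1\n"
)
csv2 = io.StringIO(
    "FQDN, IP_ADDRESS, SERVER, LOCATION\n"
    "device3.com,10.12.12.1,serverC,xx\n"
    "device679.com,20.3.67.1,serverA,we\n"
    "device1.com,10.10.10.1,serverA,ac\n"
    "device345.com,192.168.2.0,serverA,ad\n"
    "device2.com,192.168.6.0,serverB,af\n"
)

fieldnames, missing = rows_missing_from(csv1, csv2)

# Write the result in CSV form; a real script would open Output.csv with newline=''.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(missing)
print(out.getvalue())
```

Looking columns up by name means the script keeps working when the two files order their columns differently, as CSV1 and CSV2 do here.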
Answer 2 (score: -1)
import csv
data = {}  # dictionary keyed by the FQDNs found in CSV1
with open('CSV1.csv', 'r') as lookuplist:
    reader1 = csv.reader(lookuplist)
    for col in reader1:
        data[col[1]] = col[1]  # column 1 of CSV1 holds the FQDN
with open('CSV2.csv', 'r') as csvinput, open('Output.csv', 'w', newline='') as f_output:
    reader2 = csv.reader(csvinput)
    csv_output = csv.writer(f_output)
    fieldnames = ['SERVER', 'FQDN', 'AUTOMATION_ADMINISTRATOR', 'IP_ADDRESS', 'PRIMARY_1', 'MHT_1', 'MHT_2',
                  'MHT_3']
    csv_output.writerow(fieldnames)  # writes the header to the output file
    for col in reader2:
        if col[0] not in data:  # the FQDN in column 0 of CSV2 does not appear in CSV1
            col = [col[0]]  # keep only the FQDN
            csv_output.writerow(col)  # writes the FQDNs that are missing from CSV1
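So basically, I just needed to change the 'not in' test in the 'for loop' and change which column is stored in the data dictionary that is read from the CSV1 file I am creating.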
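As a small illustration of why the 'not in' test has to run against a collection built from CSV1 rather than against a csv.reader object, consider the sketch below. The sample string and variable names are made up for the demonstration; the behavior shown (a reader can be consumed only once, and `in` on a reader compares against whole rows) is standard for Python iterators.

```python
import csv
import io

sample = "FQDN,IP_ADDRESS\ndevice1.com,10.10.10.1\ndevice2.com,10.11.11.1\n"

# A csv.reader is a one-shot iterator: a second pass yields nothing.
reader = csv.reader(io.StringIO(sample))
first_pass = list(reader)
second_pass = list(reader)

# `in` on a reader iterates over rows (lists of strings), so a single
# cell value never equals any of them, and the reader is consumed as a side effect.
reader = csv.reader(io.StringIO(sample))
found_in_reader = "device1.com" in reader

# Collecting the column into a set first makes the test behave as intended.
fqdns = {row[0] for row in csv.reader(io.StringIO(sample))}
found_in_set = "device1.com" in fqdns

print(len(first_pass), len(second_pass), found_in_reader, found_in_set)
```

This is why the original attempt wrote nothing useful: the second loop over reader1 had no rows left to yield, and testing a cell against reader2 could never match.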