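I have two files and I need the lines that differ between them. The lines in the two files are not in the same order.
I tried the following script: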
file1 = open("test1.txt", "r")
file2 = open("test2.txt", "r")
lines1 = file1.readlines()
for i, lines2 in enumerate(file2):
    if lines2 != lines1[i]:
        print("line ", i, " in File2 is different \n")
        print(lines2)
    else:
        print("Its similar")
However, this only compares lines that have the same line number in both files.
Example of my files:
File1:
User 1 is Sam and PC in VLAN Trust
User10 is Tom and PC in VLAN Sales
Harry is User 6 and in VLAN Fin
File2:
Harry is User 6 and in VLAN Fin
User 1 is Sam and PC in VLAN Trust
User10 is Tom and PC in VLAN Sales
User20 is Donald and VLAN is Trust
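I want the output to tell me which lines are missing from File1. As long as a line appears in both files, regardless of its line number, it should not be listed as a difference.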
Answer 0 (score: 1)
with open('file1.txt', 'r') as f:
    lines1 = f.readlines()
with open('file2.txt', 'r') as f:
    lines2 = f.readlines()
diff = False
for idx, line in enumerate(lines2):
    if line not in lines1:
        print("line %d of file2 is missing in file1:\n%s" % (idx, line))
        diff = True
if not diff:
    print("similar")
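One thing to watch with this approach: readlines() keeps the line terminator, so the last line of a file that does not end in a newline will not match the same text elsewhere in the other file. Here is a minimal sketch of a variation that strips the terminator and uses a set for the lookup. The function name lines_missing_from and the in-memory sample lists are my own, chosen for illustration, and the line numbers it reports start at 1.

```python
def lines_missing_from(reference_lines, candidate_lines):
    """Return (line_number, text) pairs for candidate lines absent from reference."""
    # Strip only the line terminator so a final line without "\n" still matches.
    reference = {line.rstrip("\n") for line in reference_lines}
    missing = []
    for number, line in enumerate(candidate_lines, start=1):
        text = line.rstrip("\n")
        if text not in reference:
            missing.append((number, text))
    return missing


# Sample data shaped like the output of readlines() on the question's files.
file1_lines = [
    "User 1 is Sam and PC in VLAN Trust\n",
    "User10 is Tom and PC in VLAN Sales\n",
    "Harry is User 6 and in VLAN Fin",  # last line, no trailing newline
]
file2_lines = [
    "Harry is User 6 and in VLAN Fin\n",
    "User 1 is Sam and PC in VLAN Trust\n",
    "User10 is Tom and PC in VLAN Sales\n",
    "User20 is Donald and VLAN is Trust\n",
]

for number, text in lines_missing_from(file1_lines, file2_lines):
    print("line %d of file2 is missing in file1: %s" % (number, text))
```

The set makes each membership test roughly constant time, which matters mostly for large files. For small files the list version above behaves the same apart from the newline handling.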
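Answer 1 (score: 1)
Open the files and read the lines.
Then loop over the lines and compare each line in file2 with the lines in file1. If the line exists in both, the variable inboth becomes True.
I added a print command at the end so I could check that it works. Just change the variable names to match the ones you use and add it to your current program. Hope this helps.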
with open("file1.txt", "r") as f1:
    lines1 = f1.readlines()
with open("file2.txt", "r") as f2:
    lines2 = f2.readlines()
for i in lines2:
    inboth = False  # reset for every line of file 2
    for x in lines1:
        if i == x:
            inboth = True
    if not inboth:
        print("The line: \n", i, "\nis in file 2 but not file 1\n")
Answer 2 (score: 1)
You can try something like this:
file1 = open("test1.txt", "r")
file2 = open("test2.txt", "r")
lines1 = file1.readlines()
lines2 = file2.readlines()
for i, line in enumerate(lines2):
    if line not in lines1:
        print("Line {} in file 2 is not in file 1".format(i))
for i, line in enumerate(lines1):
    if line not in lines2:
        print("Line {} in file 1 is not in file 2".format(i))
file1.close()
file2.close()
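This checks both files in both directions. Line numbers start at zero; you can fix that by writing i+1 in the format argument. Also remember to close the files once the script is done with them.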
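As a rough sketch of those two points, enumerate accepts a start argument for one-based line numbers, and a with statement closes the files automatically. The function name report_differences and the temporary demo files are assumptions made only so the example runs on its own.

```python
import os
import tempfile


def report_differences(path1, path2):
    """Return messages for lines found in only one of the two files."""
    # The with statement closes both files automatically, even on errors.
    with open(path1) as f1, open(path2) as f2:
        lines1 = f1.read().splitlines()
        lines2 = f2.read().splitlines()
    messages = []
    # start=1 makes enumerate count lines from one instead of zero.
    for i, line in enumerate(lines2, start=1):
        if line not in lines1:
            messages.append("Line {} in file 2 is not in file 1".format(i))
    for i, line in enumerate(lines1, start=1):
        if line not in lines2:
            messages.append("Line {} in file 1 is not in file 2".format(i))
    return messages


# Demo: write the question's sample data to temporary files, then compare them.
with tempfile.TemporaryDirectory() as tmp:
    path1 = os.path.join(tmp, "test1.txt")
    path2 = os.path.join(tmp, "test2.txt")
    with open(path1, "w") as f:
        f.write("User 1 is Sam and PC in VLAN Trust\n"
                "User10 is Tom and PC in VLAN Sales\n"
                "Harry is User 6 and in VLAN Fin\n")
    with open(path2, "w") as f:
        f.write("Harry is User 6 and in VLAN Fin\n"
                "User 1 is Sam and PC in VLAN Trust\n"
                "User10 is Tom and PC in VLAN Sales\n"
                "User20 is Donald and VLAN is Trust\n")
    demo_messages = report_differences(path1, path2)

print("\n".join(demo_messages))
```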
Answer 3 (score: 1)
Your best bet is to use difflib, which is a built-in module in Python. Here is an example:
import difflib
file1_lines = [
    'User 1 is Sam and PC in VLAN Trust',
    'User10 is Tom and PC in VLAN Sales',
    'Harry is User 6 and in VLAN Fin'
]
file2_lines = [
    'Harry is User 6 and in VLAN Fin',
    'User 1 is Sam and PC in VLAN Trust',
    'User10 is Tom and PC in VLAN Sales',
    'User20 is Donald and VLAN is Trus'
]
differ = difflib.Differ()
diffs = list(differ.compare(file1_lines, file2_lines))
for diff in diffs:
    print(diff)
Output:
+ Harry is User 6 and in VLAN Fin
  User 1 is Sam and PC in VLAN Trust
  User10 is Tom and PC in VLAN Sales
- Harry is User 6 and in VLAN Fin
+ User20 is Donald and VLAN is Trus
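From the docs for Differ, the meaning of these initial two-letter codes is:
'- '  line unique to sequence 1
'+ '  line unique to sequence 2
'  '  line common to both sequences
'? '  line not present in either input sequence
Here "sequence 1" is the first argument to differ.compare() and "sequence 2" is the second; both are lists of strings to compare.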
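I find it easier to think of it this way:
Lines beginning with '+ ' are lines that were added in file2_lines relative to file1_lines.
Lines beginning with '- ' are lines that are missing from file2_lines but present in file1_lines.
Lines beginning with '? ' accompany lines that were changed (up to a certain threshold) and mark where the change is.
Lines beginning with '  ' are lines that were not modified between the two sets of lines.
Edit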
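I see that in my output the line Harry is User... is not shown as unchanged. If I understand correctly now, you want it to show up as unchanged. You can solve this by sorting the lists of strings first and then comparing the sorted lists. Just change the line that calls compare to the following: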
diffs = list(differ.compare(sorted(file1_lines), sorted(file2_lines)))
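For reference, here is a small self-contained run of the sorted comparison on the same sample lists, keeping only the lines unique to one side. The names added and removed are just illustrative helpers, not part of difflib.

```python
import difflib

file1_lines = [
    'User 1 is Sam and PC in VLAN Trust',
    'User10 is Tom and PC in VLAN Sales',
    'Harry is User 6 and in VLAN Fin',
]
file2_lines = [
    'Harry is User 6 and in VLAN Fin',
    'User 1 is Sam and PC in VLAN Trust',
    'User10 is Tom and PC in VLAN Sales',
    'User20 is Donald and VLAN is Trus',
]

differ = difflib.Differ()
# Sorting both lists first removes the effect of line order on the diff.
diffs = list(differ.compare(sorted(file1_lines), sorted(file2_lines)))

# Drop the two-character code and keep only lines unique to one list.
added = [d[2:] for d in diffs if d.startswith('+ ')]
removed = [d[2:] for d in diffs if d.startswith('- ')]

print(added)    # ['User20 is Donald and VLAN is Trus']
print(removed)  # []
```

Note that sorting discards the original line positions, so this tells you which lines differ but not where they were in the files.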