Comparing two files in Python and extracting the lines that differ

Time: 2013-05-15 16:25:02

Tags: python

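I have two files. file1 contains:

AAA
BBB
CCC

and file2 contains:

CCC
DDD
EEE
AAA
RRR
BBB
NNN

I would like to do the following: if file2 contains a line that is also in file1, that line should be removed from file2. In the end, file2 should contain:

DDD
EEE
RRR
NNN

Here is my code: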

f1 = open("test1.txt","r")
f2 = open("test2.txt","r")

fileOne = f1.readlines()
fileTwo = f2.readlines()
f1.close()
f2.close()
outFile = open("test.txt","w")
x = 0
for i in fileOne:
    if i !=  fileTwo[x]:
        outFile.writelines(fileTwo[x])
    x += 1

outFile.close()

Thanks.

3 answers:

Answer 0 (score: 4)

with open("f1.txt") as f1:
    s1 = set(f1)
with open("f2.txt") as f2, open("f3.txt","w") as f3:
    f3.writelines(x for x in f2 if x not in s1)

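It is best to use a context manager to close the files (that is what with does).

Checking membership in a set is more efficient than checking membership in a list.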
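
As a minimal sketch of the point about membership tests, the snippet below filters sample lines modelled on the question, once against a list and once against a set. The helper name `remove_common` and the in-memory sample data are assumptions made for illustration and are not part of the answer. Both containers give the same result; the difference is that each lookup in a set is a hash lookup, while each lookup in a list scans the list element by element.

```python
def remove_common(lines, exclude):
    """Return the lines that do not occur in `exclude`, keeping their order."""
    return [line for line in lines if line not in exclude]

# Hypothetical sample data, held in memory instead of in files.
file1_lines = ["AAA\n", "BBB\n", "CCC\n"]
file2_lines = ["CCC\n", "DDD\n", "EEE\n", "AAA\n", "RRR\n", "BBB\n", "NNN\n"]

# List membership: each `in` test scans the list from the start.
from_list = remove_common(file2_lines, file1_lines)

# Set membership: each `in` test is a hash lookup.
from_set = remove_common(file2_lines, set(file1_lines))

print(from_set)  # ['DDD\n', 'EEE\n', 'RRR\n', 'NNN\n']
```

For a handful of lines the two are indistinguishable in practice; the set starts to pay off as file1 grows.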

If the lines may contain extra whitespace, you should strip them like this:

with open("f1.txt") as f1:
    s1 = set(x.strip() for x in f1)
with open("f2.txt") as f2, open("f3.txt","w") as f3:
    f3.writelines(x for x in f2 if x.strip() not in s1)
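
To see why stripping can matter, here is a small in-memory sketch. The sample strings are assumptions chosen for illustration: one line in the first input has trailing spaces, and the last line of each input has no newline. With exact comparison none of the lines match; after `strip()` they do.

```python
# Hypothetical inputs standing in for the contents of f1.txt and f2.txt.
f1_lines = "AAA  \nBBB\nCCC".splitlines(keepends=True)
f2_lines = "CCC\nDDD\nAAA\nBBB".splitlines(keepends=True)

# Exact comparison: "CCC\n" != "CCC", "AAA\n" != "AAA  \n", "BBB" != "BBB\n",
# so nothing is removed.
exact = set(f1_lines)
kept_exact = [x for x in f2_lines if x not in exact]

# Comparison after strip(): surrounding whitespace and newlines are ignored.
stripped = {x.strip() for x in f1_lines}
kept_stripped = [x for x in f2_lines if x.strip() not in stripped]

print(kept_exact)     # ['CCC\n', 'DDD\n', 'AAA\n', 'BBB']
print(kept_stripped)  # ['DDD\n']
```

Note that the lines written out are still the original, unstripped lines; only the comparison uses the stripped text.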

Answer 1 (score: 0)

Use set difference to find the lines that differ between the two files.

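# read both files; `with` closes them automatically
with open("test1.txt","r") as file1, open("test2.txt","r") as file2:
    f1 = file1.readlines()
    f2 = file2.readlines()

# lines that appear in file 2 but not in file 1
diff = set(f2) - set(f1)
with open("test.txt","w") as outFile:
    # iterate over f2 (not diff) so the original line order is kept
    outFile.writelines(line for line in f2 if line in diff)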
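
A set has no guaranteed iteration order, which is presumably why this answer filters the lines of the second file against `diff` rather than writing `diff` directly. A minimal sketch with in-memory sample data (the list names and contents are assumptions for illustration):

```python
# Hypothetical sample data standing in for the two files' lines.
file1_lines = ["AAA\n", "BBB\n", "CCC\n"]
file2_lines = ["CCC\n", "DDD\n", "EEE\n", "AAA\n", "RRR\n", "BBB\n", "NNN\n"]

# The set difference holds the right lines, but in no particular order.
diff = set(file2_lines) - set(file1_lines)

# Filtering the original list against the set restores file order.
ordered = [line for line in file2_lines if line in diff]

print("".join(ordered), end="")
```

Writing `sorted(diff)` would also give a stable order, but it would be alphabetical rather than the order in which the lines appear in the file.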

Answer 2 (score: 0)

Using your code...

# read both files; `with` closes them automatically
with open("test1.txt","r") as f1, open("test2.txt","r") as f2:
    fileOne = f1.read().splitlines()
    fileTwo = f2.read().splitlines()

# remove the dup lines
nodup_lines = [line for line in fileTwo if line not in fileOne]
# join using newline character
newFileTwo = '\n'.join(nodup_lines)

# write file
with open("test.txt","w") as outFile:
    outFile.write(newFileTwo)
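
The same splitlines-and-join approach can be wrapped in a function and tried end to end on temporary files. This is only a sketch: the function name `remove_shared_lines`, the use of a set for the lookup, and the added trailing newline are assumptions of this example and not part of the answer above. `'\n'.join` puts newlines between lines only, so the sketch appends a final one.

```python
import os
import tempfile

def remove_shared_lines(path_one, path_two, out_path):
    """Write to out_path the lines of path_two that do not appear in path_one."""
    with open(path_one) as f1:
        exclude = set(f1.read().splitlines())
    with open(path_two) as f2:
        kept = [line for line in f2.read().splitlines() if line not in exclude]
    with open(out_path, "w") as out:
        # Add a final newline unless there is nothing to write.
        out.write("\n".join(kept) + ("\n" if kept else ""))
    return kept

# Try it on temporary files filled with hypothetical sample contents.
with tempfile.TemporaryDirectory() as tmp:
    one = os.path.join(tmp, "test1.txt")
    two = os.path.join(tmp, "test2.txt")
    out = os.path.join(tmp, "test.txt")
    with open(one, "w") as f:
        f.write("AAA\nBBB\nCCC\n")
    with open(two, "w") as f:
        f.write("CCC\nDDD\nEEE\nAAA\nRRR\nBBB\nNNN\n")
    kept = remove_shared_lines(one, two, out)
    with open(out) as f:
        written = f.read()

print(written, end="")
```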