Comparing strings from 2 different files in Python

Time: 2015-10-13 13:13:18

Tags: python string file compare difflib

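I need your help: after a lot of searching, I have not found an answer that fits my problem.

I have 2 files that contain some information. Some of that information differs between the two. The first file is sorted, the second one is not.

I tried to use difflib, but apparently it does not work in my case.

Example

File 1: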

customerID: aa
companyName: AA
contacts: AAAA AAAA <aa@aa.fr>

File 2:

customerID: zz
username: z.z
contacts: ZZZ ZZZ <zz@zz.com>

I need to find out whether the customerID is the same.

Here is my code:

import sys
import difflib

def changes(file1, file2):
    # open the 2 files which we need to compare
    with open(file1, 'r') as master, open(file2, 'r') as slave:
        # unified diff of the two files, compared line by line and in order
        diff = difflib.unified_diff(master.readlines(), slave.readlines())
        t = ''.join(diff)
    print(t)


def main():
    # sys.argv[0] is the script name, so two file arguments give len(sys.argv) == 3
    if len(sys.argv) != 3:
        print("This program needs 2 files")
        return 1
    print(sys.argv[1])
    print(sys.argv[2])
    changes(sys.argv[1], sys.argv[2])
    return 0

if __name__ == '__main__':
    status = main()
    sys.exit(status)

Edit: the files are plain text that I wrote myself.
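
A note on why a plain diff struggles with one sorted and one unsorted file: difflib.unified_diff compares the two line sequences in order, so the same lines in a different order still show up as changes. A minimal sketch, using made-up sample lines rather than the real files:

```python
import difflib

# Hypothetical sample data: the same two customer IDs, in a different order.
sorted_lines = ["customerID: aa\n", "customerID: zz\n"]
unsorted_lines = ["customerID: zz\n", "customerID: aa\n"]

# unified_diff works on ordered sequences, so reordering alone
# is enough to produce a non-empty diff.
diff_text = "".join(difflib.unified_diff(sorted_lines, unsorted_lines))
order_sensitive = bool(diff_text)

# Comparing the lines as sets ignores their order.
same_ids = set(sorted_lines) == set(unsorted_lines)

print(order_sensitive, same_ids)
```

Comparing the IDs as sets, rather than diffing the files, sidesteps the ordering problem.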

2 answers:

Answer 0: (score: 0)

with open('first.txt', 'r') as first_file:
    for line in first_file:
        data = line.split(":")
        if data[0].strip() == "customerID":
            customer_id = data[1].strip()
            # rescan the second file for each customerID found in the first
            with open('second.txt', 'r') as second_file:
                for second_file_line in second_file:
                    data2 = second_file_line.split(":")
                    if data2[0].strip() == "customerID":
                        if customer_id == data2[1].strip():
                            pass  # do your work here

If your file is too large, search the second file line by line:

with open('second.txt', 'r') as second_file:
    for line in second_file:
        if customer_id in line:
            pass  # do your work here

Or, if the file is small enough:

with open('second.txt') as second_file:
    if customer_id in second_file.read():
        pass  # do your work here
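
The snippets above either re-read second.txt for every ID or rely on a substring test (customer_id in line), which would also match the ID aa inside aab. One possible alternative, sketched here under the assumption that every ID sits on a line of the form "customerID: value", reads each file once and intersects the IDs as sets. The names read_customer_ids and common_customer_ids are hypothetical.

```python
def read_customer_ids(lines):
    """Collect the values of all 'customerID: ...' lines into a set."""
    ids = set()
    for line in lines:
        # partition splits on the first ':' only; sep is '' if there is none
        key, sep, value = line.partition(":")
        if sep and key.strip() == "customerID":
            ids.add(value.strip())
    return ids


def common_customer_ids(path1, path2):
    """Return the customer IDs that appear in both files (order is irrelevant)."""
    with open(path1, "r") as first, open(path2, "r") as second:
        return read_customer_ids(first) & read_customer_ids(second)
```

Calling common_customer_ids('first.txt', 'second.txt') would then give the set of shared IDs, and it does not matter whether either file is sorted.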

Answer 1: (score: 0)

Thanks, everyone, for your replies. They helped me a lot, and this is the solution I ended up with:

import sys
import time

def isInFile(l, f):
    # True if line l appears verbatim (including its newline) in file f
    with open(f, 'r') as f2:
        for line in f2:
            if l == line:
                return True
        return False

def similitudes(file1, file2):
    # collect the customerID lines of file1 that also appear verbatim in file2
    data = ''
    with open(file1, 'r') as f1:
        for line in f1:
            if line.startswith('customerID') and isInFile(line, file2):
                data += line
    return data

def changes(file1, file2):
    # collect the customerID/contacts lines of file1 that are missing from file2,
    # each followed by the next line of file1 for context
    data = ''
    copy = False
    with open(file1, 'r') as f1:
        for line in f1:
            is_key = line.startswith(('customerID', 'contacts'))
            differs = is_key and not isInFile(line, file2)
            if copy or differs:
                data += line  # appended once, even when both conditions hold
            copy = differs
    return data

def main():
    # sys.argv[0] is the script name, so two file arguments give len(sys.argv) == 3
    if len(sys.argv) != 3:
        print("This program needs 2 files")
        return 1
    file1, file2 = sys.argv[1], sys.argv[2]
    print(file1)
    print(file2)
    stamp = time.strftime('%d/%m/%y %H:%M') + '\n'

    # lines of each file that are missing from the other
    with open('differences.txt', 'w') as out:
        out.write(stamp +
                  'FROM: ' + file1 + '\n' +
                  changes(file1, file2) +
                  'FROM: ' + file2 + '\n' +
                  changes(file2, file1) + '\n')

    # customerID lines shared by both files
    with open('similitudes.txt', 'w') as out:
        out.write(stamp +
                  'FROM: ' + file1 + ' and ' + file2 + '\n' +
                  similitudes(file1, file2) + '\n' +
                  'FROM: ' + file2 + ' and ' + file1 + '\n' +
                  similitudes(file2, file1))
    return 0

if __name__ == '__main__':
    status = main()
    sys.exit(status)
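
A more compact variant of the same idea, offered as a sketch rather than a drop-in replacement: parse each file into records keyed by customerID and compare the key sets, so neither file needs to be sorted and each file is read only once. It assumes that records are separated by blank lines and that each line has the form "key: value"; parse_records and compare_records are hypothetical names.

```python
def parse_records(lines):
    """Group 'key: value' lines into dicts, one per blank-line-separated
    block, and index the blocks by their customerID."""
    records = {}
    current = {}
    # The extra "" at the end flushes the last block if the input
    # does not finish with a blank line.
    for line in list(lines) + [""]:
        line = line.strip()
        if not line:
            if "customerID" in current:
                records[current["customerID"]] = current
            current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    return records


def compare_records(records1, records2):
    """Return sorted IDs: only in the first, only in the second, in both."""
    ids1, ids2 = set(records1), set(records2)
    return sorted(ids1 - ids2), sorted(ids2 - ids1), sorted(ids1 & ids2)
```

With files on disk, the records could be loaded with something like parse_records(open(path)); for IDs found in both files, the two dicts can then be compared field by field to see which values differ.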