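I need your help: after a lot of searching, I have not found an answer that fits my problem.
I have two files containing records. Some of the records differ between the two files. The first file is sorted; the second is not.
I tried difflib, but it does not seem to work in my case.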
Example
File 1:
customerID: aa
companyName: AA
contacts: AAAA AAAA <aa@aa.fr>
File 2:
customerID: zz
username: z.z
contacts: ZZZ ZZZ <zz@zz.com>
I need to find out whether the customerID values are the same.
Here is my code:
import sys
import difflib


def changes(file1, file2):
    # Open the two files to compare; "with" closes them afterwards.
    with open(file1, 'r') as master, open(file2, 'r') as slave:
        # Compute a line-by-line unified diff.
        diff = difflib.unified_diff(master.readlines(), slave.readlines())
        t = ''.join(diff)
    print(t)


def main():
    # Expect exactly two file paths after the script name.
    if len(sys.argv) != 3:
        print("This program needs 2 files")
        return 1
    print(sys.argv[1])
    print(sys.argv[2])
    changes(sys.argv[1], sys.argv[2])
    return 0


if __name__ == '__main__':
    status = main()
    sys.exit(status)
Edit: the files are plain text that I wrote myself.
Answer 0 (score: 0)
with open('first.txt', 'r') as first_file:
    for line in first_file:
        # Split only on the first colon so values may contain ':'.
        data = line.split(":", 1)
        if data[0].strip() == "customerID":
            customer_id = data[1].strip()
            # Rescan the second file for each customerID found in the first.
            with open('second.txt', 'r') as second_file:
                for second_file_line in second_file:
                    data2 = second_file_line.split(":", 1)
                    if data2[0].strip() == "customerID":
                        if customer_id == data2[1].strip():
                            pass  # do your work here
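If the files are large, scan the second file line by line:
with open('second.txt', 'r') as second_file:
    for line in second_file:
        # customer_id comes from the loop over the first file above.
        # This is a substring test, so it also matches other fields.
        if customer_id in line:
            pass  # do your work here
Or, if the second file is small enough to read into memory at once:
with open('second.txt', 'r') as second_file:
    if customer_id in second_file.read():
        pass  # do your work here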
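The nested loop above reopens the second file once per customerID. A minimal sketch of an alternative, assuming every ID appears on a line of the form `customerID: value`, reads each file once and intersects the two sets of IDs. The helper names `read_customer_ids` and `common_customer_ids` are made up for this illustration.

```python
def read_customer_ids(path):
    """Return the set of customerID values found in the file at path."""
    ids = set()
    with open(path, "r") as handle:
        for line in handle:
            # partition splits on the first ':' only; sep is '' if none found.
            key, sep, value = line.partition(":")
            if sep and key.strip() == "customerID":
                ids.add(value.strip())
    return ids


def common_customer_ids(first_path, second_path):
    """Return the customerID values present in both files."""
    return read_customer_ids(first_path) & read_customer_ids(second_path)
```

Because sets ignore order, it does not matter that only one of the files is sorted.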
Answer 1 (score: 0)
Thanks, everyone, for your replies; they helped me a lot. Here is the solution I arrived at:
import sys
import time


def isInFile(l, f):
    # Return True if line l appears verbatim (newline included) in file f.
    with open(f, 'r') as f2:
        for line in f2:
            if l == line:
                return True
    return False
def similitudes(file1, file2):
    same = 0
    data = ''
    copy = False
    with open(file1, 'r') as f1:
        for line in f1:
            if copy:
                # data += line
                if line == '\n' or line[0:10] != 'customerID':
                    copy = False
            if line[0:10] == 'customerID':
                if isInFile(line, file2):
                    copy = True
                    data += line
                else:
                    same += 1
    return data
def changes(file1, file2):
    same = 0
    data = ''
    copy = False
    with open(file1, 'r') as f1:
        for line in f1:
            if copy:
                data += line
                # Note: this condition is always true (a line cannot start
                # with both prefixes), so copying stops after one extra line.
                if line == '\n' or line[0:10] != 'customerID' or line[0:8] != 'contacts':
                    copy = False
            if line[0:10] == 'customerID' or line[0:8] == 'contacts':
                if not isInFile(line, file2):
                    copy = True
                    data += line
                else:
                    same += 1
    return data
def main():
    # Expect exactly two file paths after the script name.
    if len(sys.argv) != 3:
        print("This program needs 2 files")
        return 1
    print(sys.argv[1])
    print(sys.argv[2])
    # Write the lines that differ, in both directions.
    with open('differences.txt', 'w') as out:
        data = (time.strftime('%d/%m/%y %H:%M') + '\n' +
                'FROM: ' + sys.argv[1] + '\n' +
                changes(sys.argv[1], sys.argv[2]) +
                'FROM: ' + sys.argv[2] + '\n' +
                changes(sys.argv[2], sys.argv[1]) + '\n')
        out.write(data)
    # Write the customerID lines that both files share.
    with open('similitudes.txt', 'w') as out:
        data = (time.strftime('%d/%m/%y %H:%M\n') +
                'FROM: ' + sys.argv[1] + ' and ' + sys.argv[2] + '\n' +
                similitudes(sys.argv[1], sys.argv[2]) + '\n' +
                'FROM: ' + sys.argv[2] + ' and ' + sys.argv[1] + '\n' +
                similitudes(sys.argv[2], sys.argv[1]))
        out.write(data)
    return 0


if __name__ == '__main__':
    status = main()
    sys.exit(status)
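The solution above rereads the second file for every line it checks. As a rough alternative sketch, not the approach used in this answer, each file could be parsed once into a dictionary keyed by customerID and the dictionaries compared. This assumes every record starts with a `customerID:` line and that the other lines are `key: value` pairs; `parse_records` and `compare_records` are hypothetical names.

```python
def parse_records(path):
    """Parse 'key: value' lines into {customerID: {field: value}}.

    Assumption: each record begins with its customerID line.
    """
    records = {}
    current = None
    with open(path, "r") as handle:
        for line in handle:
            key, sep, value = line.partition(":")
            if not sep:
                continue  # blank line or line without a colon
            key, value = key.strip(), value.strip()
            if key == "customerID":
                current = records.setdefault(value, {})
            elif current is not None:
                current[key] = value
    return records


def compare_records(first_path, second_path):
    """Return (IDs only in first, IDs only in second, shared IDs whose fields differ)."""
    first = parse_records(first_path)
    second = parse_records(second_path)
    only_first = sorted(first.keys() - second.keys())
    only_second = sorted(second.keys() - first.keys())
    changed = sorted(cid for cid in first.keys() & second.keys()
                     if first[cid] != second[cid])
    return only_first, only_second, changed
```

Since lookups go through the customerID key, the order of records in either file does not affect the result.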