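I have this program that takes two files and compares them line by line. It works fine as long as both files have the same number of lines. My problem is: what if, for example, file2 has more lines than file1, or the other way around? When that happens, I get IndexError: list index out of range. What can I do to account for this?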
# Compares two files
def compare(baseline, newestFile):
    baselineHolder = open(baseline)
    newestFileHolder = open(newestFile)
    lines1 = baselineHolder.readlines()
    a = returnName(baseline)
    b = returnName(newestFile)
    for i, lines2 in enumerate(newestFileHolder):
        if lines2 != lines1[i]:
            add1 = i + 1
            print("line ", add1, " in newestFile is different \n")
            print("TAKE A LOOK HERE----------------------TAKE A LOOK HERE")
            print(lines2)
        else:
            addRow = 1 + i
            print("line " + str(addRow) + " is identical")
Answer 0 (score: 4)
Why not use the built-in difflib module instead of reinventing the wheel? Here is an example from the documentation that uses difflib.unified_diff:
>>> import sys
>>> from difflib import unified_diff
>>> s1 = ['bacon\n', 'eggs\n', 'ham\n', 'guido\n']
>>> s2 = ['python\n', 'eggy\n', 'hamster\n', 'guido\n']
>>> sys.stdout.writelines(unified_diff(s1, s2, fromfile='before.py', tofile='after.py'))
--- before.py
+++ after.py
@@ -1,4 +1,4 @@
-bacon
-eggs
-ham
+python
+eggy
+hamster
 guido
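To connect this to the question, here is a minimal sketch of how unified_diff behaves when one input has more lines than the other. The helper name diff_lines and the sample data are made up for illustration; with real files you would pass the lists returned by readlines() instead.

```python
import difflib


def diff_lines(baseline_lines, newest_lines, fromfile="baseline", tofile="newest"):
    """Return the unified diff of two lists of lines as a list of strings.

    diff_lines is a hypothetical helper name, not part of difflib.
    """
    return list(
        difflib.unified_diff(
            baseline_lines, newest_lines, fromfile=fromfile, tofile=tofile
        )
    )


# Sample data (assumed): the newest file has one extra line at the end.
baseline_lines = ["alpha\n", "beta\n"]
newest_lines = ["alpha\n", "beta\n", "gamma\n"]

result = diff_lines(baseline_lines, newest_lines)
# Each diff line already ends with a newline, so join without a separator.
print("".join(result), end="")
```

The extra line simply shows up as an added line in the diff, so no index bookkeeping is needed. When the two inputs are identical, the returned list is empty.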
Answer 1 (score: 1)
Maybe you can use itertools.izip_longest. Once one of the sequences is exhausted, it yields a fill value (None by default) in its place:
import itertools

for l, r in itertools.izip_longest(open('foo.txt'), open('bar.txt')):
    if l is None:  # foo.txt has been exhausted
        ...
    elif r is None:  # bar.txt has been exhausted
        ...
    else:  # both still have lines - compare now the content of l and r
        ...
Edit: As @danidee correctly pointed out, in Python 3 the function is called zip_longest.
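As a rough Python 3 sketch of the same idea, the version below fills in the three branches and numbers the lines from 1. The function name compare_lines, the message wording, and the sample data are assumptions for illustration. Lines read from a file are never None, so the default fill value is a safe marker here.

```python
from itertools import zip_longest


def compare_lines(baseline_lines, newest_lines):
    """Return one message per line position (hypothetical helper)."""
    messages = []
    pairs = zip_longest(baseline_lines, newest_lines)  # pads the shorter side with None
    for number, (old, new) in enumerate(pairs, start=1):
        if old is None:
            # The baseline ran out first: this line exists only in the newest file.
            messages.append(f"line {number} exists only in the newest file")
        elif new is None:
            # The newest file ran out first: this line exists only in the baseline.
            messages.append(f"line {number} exists only in the baseline")
        elif old != new:
            messages.append(f"line {number} is different")
        else:
            messages.append(f"line {number} is identical")
    return messages


# Sample data (assumed): the newest file has one more line than the baseline.
report = compare_lines(["a\n", "b\n"], ["a\n", "x\n", "c\n"])
print("\n".join(report))
```

To run it on real files, you could pass two open file objects instead of lists, since zip_longest accepts any iterables.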
Answer 2 (score: 1)
You should catch the IndexError and then stop the comparison:
for i, lines2 in enumerate(newestFileHolder):
    try:
        if lines2 != lines1[i]:
            add1 = i + 1
            print("line ", add1, " in newestFile is different \n")
            print("TAKE A LOOK HERE----------------------TAKE A LOOK HERE")
            print(lines2)
        else:
            addRow = 1 + i
            print("line " + str(addRow) + " is identical")
    except IndexError:
        print("Exit comparison")
        break
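Note that the loop above only raises IndexError when newestFile is the longer file. If the baseline is longer, the loop ends normally and the extra baseline lines are never reported. One possible way to cover both directions without relying on the exception is to check the lengths after the shared lines have been compared. This is only a sketch: the helper name leftover_lines and the sample data are made up, and it assumes both files have been read into lists with readlines().

```python
def leftover_lines(baseline_lines, newest_lines):
    """Return (label, extra_lines) for the longer input, or (None, []) if equal.

    leftover_lines is a hypothetical helper; the labels are illustrative.
    """
    shared = min(len(baseline_lines), len(newest_lines))
    if len(baseline_lines) > shared:
        return "baseline", baseline_lines[shared:]
    if len(newest_lines) > shared:
        return "newestFile", newest_lines[shared:]
    return None, []


# Sample data (assumed): the baseline has two more lines than the newest file.
label, extra = leftover_lines(["a\n", "b\n", "c\n"], ["a\n"])
if label is not None:
    print(label, "has", len(extra), "extra line(s)")
```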