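I feel this Python code could be shortened considerably, but I almost always fall back to writing a C-style layout. In your opinion, what is the best way to shorten it? Readability is a bonus, not a requirement.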
def compfiles(file1, file2):
    linecnt = 0
    for line1 in open(file1):
        line1 = line1.strip()
        hit = False
        for line2 in open(file2):
            line2 = line2.strip()
            if line2 == line1:
                hit = True
                break
        if not hit:
            print("Miss: file %s contains '%s', but file %s does not!" % (file1, line1, file2))
        linecnt += 1
    print("%i lines compared between %s and %s." % (linecnt, file1, file2))

fn = ["file1.txt", "file2.txt"]
compfiles(fn[0], fn[1])
compfiles(fn[1], fn[0])
Answer 0 (score: 2)
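Your code is extremely inefficient because you open the second file inside the loop that iterates over the first file, so file2 is reread once for every line of file1. Just read the second file into a list (or better, a set, which gives you average O(1) lookup time) and use the in operator. Also, your linecnt variable only counts the number of lines in file1; you can get the same number by reading the lines into a list and calling len on that list: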
def compfiles(file1, file2):
    # Read each file exactly once; iterating the file object avoids the
    # spurious empty last entry that read().split("\n") would produce.
    with open(file1) as f1, open(file2) as f2:
        lines1 = [l.strip() for l in f1]
        lines2 = {l.strip() for l in f2}  # set: average O(1) membership test
    for line in lines1:
        if line not in lines2:
            print("Miss: file %s contains '%s', but file %s does not!" % (file1, line, file2))
    print("%i lines compared between %s and %s." % (len(lines1), file1, file2))
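If the misses are needed as data rather than console output, the same set-based lookup can return them instead of printing. This is only a minimal sketch under that assumption; the name missing_lines is hypothetical and does not appear in the question or the answer.

```python
def missing_lines(file1, file2):
    """Return the stripped lines of file1 that do not occur in file2, in file order."""
    with open(file2) as f2:
        # Build the lookup set once; membership tests are O(1) on average.
        seen = {line.strip() for line in f2}
    with open(file1) as f1:
        stripped = (line.strip() for line in f1)
        # The list is built while the file is still open.
        return [s for s in stripped if s not in seen]
```

Calling it in both directions, as the question does with file1.txt and file2.txt, yields the misses for each side separately.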
Answer 1 (score: 1)
def compfiles(file1, file2):
    with open(file1) as fin:
        set1 = set(fin)
    with open(file2) as fin:
        set2 = set(fin)
    ... # do some set operations
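One way to fill in the elided set operations is with set differences. This is a sketch of one possibility, not necessarily what the answerer had in mind; the name only_in_each is hypothetical, and stripping each line is an assumption carried over from the question's code.

```python
def only_in_each(file1, file2):
    """Return (lines only in file1, lines only in file2) as two sets."""
    with open(file1) as fin:
        set1 = {line.strip() for line in fin}
    with open(file2) as fin:
        set2 = {line.strip() for line in fin}
    # Set difference keeps the elements of the left set that are absent from the right.
    return set1 - set2, set2 - set1
```

The symmetric difference set1 ^ set2 would give both groups merged into a single set, if it does not matter which file a line came from.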
If the files contain duplicate lines, or if order matters, iterate over file1 instead:
def compfiles(file1, file2):
    with open(file2) as fin:
        set2 = set(fin)
    count = 0  # stays 0 if file1 is empty
    with open(file1) as fin:
        # Lines are compared with their trailing newline intact.
        for count, line in enumerate(fin, 1):
            if line not in set2:
                print("Miss: file %s contains '%s', but file %s does not!" % (file1, line.rstrip("\n"), file2))
    print("%i lines compared between %s and %s." % (count, file1, file2))