I wrote this script to find every line that is not followed by a matching (currentLine + " downloaded") line:
with open('filename') as f:
    for line in f:
        if not line.strip('\n') + " downloaded\n" in f:
            print line
It prints only the first match and then the iteration ends. For example, with this file:
WhatsApp
Windows Live Mail
Windows Live Mail downloaded
XChat
XChat downloaded
Zimbra TGZ to PST Converter
Zimbra TGZ to PST Converter downloaded
ZOOK DBX to EML Converter
ZOOK DBX to EML Converter downloaded
ZOOK DBX to EMLX Converter
ZOOK DBX to EMLX Converter downloaded
ZOOK DBX to MBOX Converter
ZOOK DBX to MBOX Converter downloaded
ZOOK DBX to MSG Converter
I expected it to print WhatsApp and ZOOK DBX to MSG Converter, but it prints only the first match, WhatsApp.
Answer 0 (score: 1)
if not line.strip('\n') + " downloaded\n" in f:
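silently iterates over f, so it exhausts the file iterator on the first pass. The outer for loop then has nothing left to read and stops after the first line.
You could read the file into a list with f = list(f) before the loop to get rid of the exhaustion effect, but that performs poorly, because each membership test on a list is an O(n) lookup.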
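The exhaustion effect can be reproduced without a real file. The following is a minimal sketch, assuming Python 3 and using io.StringIO as a stand-in for the opened file; the sample text and variable names are illustrative and not taken from the question. It also shows the list-based workaround, with an extra check (my addition) that skips the " downloaded" lines themselves.

```python
import io

SAMPLE = "WhatsApp\nXChat\nXChat downloaded\nZOOK\n"

# A file object is a one-shot iterator: `in` scans forward and consumes it.
f = io.StringIO(SAMPLE)
first = next(f)                       # reads "WhatsApp\n"
found = "WhatsApp downloaded\n" in f  # scans to the end without finding it
leftover = list(f)                    # nothing is left to iterate over

# Reading everything into a list first makes membership tests repeatable,
# at the cost of an O(n) scan per lookup.
lines = list(io.StringIO(SAMPLE))
missing = [
    line.rstrip("\n")
    for line in lines
    if not line.endswith(" downloaded\n")
    and line.rstrip("\n") + " downloaded\n" not in lines
]
print(found, leftover, missing)
```

Here found is False and leftover is empty, which is why the loop in the question stops after its first line.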
I would create two set() objects, one for the products and the other for the "downloaded" items:
downloaded = set()
products = set()
with open('filename') as f:
    for line in f:
        if line.endswith(" downloaded\n"):
            # store the name minus the " downloaded" suffix
            downloaded.add(line.replace(" downloaded\n", ""))
        else:
            products.add(line.rstrip())
print(products - downloaded)
At the end, just print the products minus the downloaded ones. This prints (the order of elements in a set may vary):
{'ZOOK DBX to MSG Converter', 'WhatsApp'}
This solution works even if the lines are not sorted.
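Because a set is unordered, the result loses the order of the file. If that order matters, one possible variation is sketched below. It assumes Python 3; the function name not_downloaded and the sample list are hypothetical and not part of the original answer. It still uses a set for the lookups but returns a list in input order, and it tolerates a last line without a trailing newline.

```python
def not_downloaded(lines):
    """Return names that have no "<name> downloaded" line, in input order."""
    suffix = " downloaded"
    # Drop blank lines and trailing newlines up front.
    names = [line.rstrip("\n") for line in lines if line.strip()]
    # Set of product names that do have a "downloaded" marker line.
    downloaded = {n[:-len(suffix)] for n in names if n.endswith(suffix)}
    return [n for n in names if not n.endswith(suffix) and n not in downloaded]


# Deliberately unsorted input: the marker line comes before its product.
shuffled = [
    "XChat downloaded\n",
    "WhatsApp\n",
    "ZOOK DBX to MSG Converter",  # last line without a trailing newline
    "XChat\n",
]
print(not_downloaded(shuffled))
```

With a real file, the same function could be called as not_downloaded(f) inside the with open(...) block, since it reads the lines exactly once.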