Iterating over the lines of a file fails

Date: 2018-03-26 19:03:36

Tags: python file iteration

I wrote this script to find all lines that are not followed by the line (currentLine + " downloaded"):

with open('filename') as f:
    for line in f:
        if not line.strip('\n') + " downloaded\n" in f:
            print line

It prints only the first match and then the iteration ends. For example, for this file:

WhatsApp
Windows Live Mail
Windows Live Mail downloaded
XChat
XChat downloaded
Zimbra TGZ to PST Converter
Zimbra TGZ to PST Converter downloaded
ZOOK DBX to EML Converter
ZOOK DBX to EML Converter downloaded
ZOOK DBX to EMLX Converter
ZOOK DBX to EMLX Converter downloaded
ZOOK DBX to MBOX Converter
ZOOK DBX to MBOX Converter downloaded
ZOOK DBX to MSG Converter

I expect it to print WhatsApp and ZOOK DBX to MSG Converter, but it only prints the first match, WhatsApp.

1 Answer:

Answer 0: (score: 1)

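The test line.strip('\n') + " downloaded\n" in f silently iterates over f. A file object is an iterator, so the first membership test reads every remaining line while looking for a match. When the for loop asks for the next line, the file is already exhausted and the loop ends.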
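The effect can be reproduced without a real file. The following is a minimal sketch that uses io.StringIO as a stand-in for the open file (an assumption made only to keep the example self-contained; the sample text is a shortened version of the file from the question):

```python
import io

# io.StringIO is iterated line by line, like an open text file.
sample = "WhatsApp\nXChat\nXChat downloaded\nZimbra\n"

f = io.StringIO(sample)
printed = []
for line in f:
    # "in f" has no fast lookup on a file-like object: it falls back to
    # iterating over f, which reads all the remaining lines.
    if not line.strip('\n') + " downloaded\n" in f:
        printed.append(line)

# The failed search for "WhatsApp downloaded\n" consumed the whole stream,
# so the loop stopped after the first line.
leftover = list(f)
print(printed)   # ['WhatsApp\n']
print(leftover)  # []
```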

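You could read the file into a list first with f = list(f) to avoid the exhaustion effect, but that is not good for performance: each "in" test on a list is an O(n) scan, so the whole loop becomes O(n²).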
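For illustration, the list-based variant might look like the sketch below. The names sample, lines and missing are hypothetical, and io.StringIO again stands in for the file. The " downloaded" lines themselves are skipped explicitly, since otherwise they would be reported as well:

```python
import io

sample = "WhatsApp\nXChat\nXChat downloaded\nZimbra\n"

# Materialize every line once so that membership tests do not consume anything.
lines = [line.rstrip("\n") for line in io.StringIO(sample)]

missing = [
    name for name in lines
    if not name.endswith(" downloaded")      # ignore the marker lines themselves
    and name + " downloaded" not in lines    # O(n) scan of the list per test
]
print(missing)  # ['WhatsApp', 'Zimbra']
```

This gives the expected result, but the repeated list scans are what makes it slow on large files.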

I would create 2 set() objects, one for the products and another for the "downloaded" products:

downloaded = set()
products = set()

with open('filename') as f:
    for line in f:
        # strip the newline first so a last line without "\n" is handled too
        line = line.rstrip()
        if line.endswith(" downloaded"):
            # store the name minus the " downloaded" suffix
            downloaded.add(line[:-len(" downloaded")])
        else:
            products.add(line)

print(products - downloaded)

Finally, just print the products minus the downloaded ones. This prints:

{'ZOOK DBX to MSG Converter', 'WhatsApp'}

This solution works even if the lines are not sorted properly.
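To check the claim about ordering, here is a small self-contained sketch of the same set-based idea. The function name find_not_downloaded and the shuffled sample are hypothetical; the function takes any iterable of lines, so an open file would work as well:

```python
def find_not_downloaded(lines):
    """Return the names that have no matching '<name> downloaded' line."""
    suffix = " downloaded"
    products, downloaded = set(), set()
    for line in lines:
        name = line.rstrip()
        if not name:
            continue  # skip blank lines
        if name.endswith(suffix):
            downloaded.add(name[:-len(suffix)])
        else:
            products.add(name)
    return products - downloaded


# The "downloaded" marker appears before its product here.
shuffled = [
    "XChat downloaded\n",
    "ZOOK DBX to MSG Converter\n",
    "WhatsApp\n",
    "XChat\n",
]
result = find_not_downloaded(shuffled)
print(sorted(result))  # ['WhatsApp', 'ZOOK DBX to MSG Converter']
```

Sets are unordered, so the result is sorted before printing to get a stable display.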