Question

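I am reading a file containing about 13,000 names into a list.

Then I look at every character of every item in that list, and if a character matches one of the bad letters, I remove that line from the 13,000-item list.

If I run it once, it removes about half of the list. By the 11th run, it seems to bring the list down to 9%. Why is this script missing results? Why do successive runs catch them?

Using Python 3.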

with open(fname) as f:
    lines = f.read().splitlines()

bad_letters = ['B', 'C', 'F', 'G', 'H', 'J', 'L', 'O', 'P', 'Q', 'U', 'W', 'X']

def clean(callsigns, bad):
    removeline = 0

    for line in callsigns:
        for character in line:
             if character in bad:
                 removeline = 1
        if removeline == 1:
            lines.remove(line)
            removeline = 0
    return callsigns

for x in range (0, 11):
    lines = clean(lines, bad_letters)   

print (len(lines))

Answer 1

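You are mutating (i.e., changing) the lines list while you are looping over (i.e., iterating) it. Note that callsigns inside clean is the same list object as the global lines, so lines.remove(line) modifies the very list the for loop is walking. This is never a good idea: each time an item is removed, every later item shifts down by one position, but the loop's internal index still moves forward, so the item right after each removed one is never examined. That is why lines are skipped instead of being removed on the first pass, and why each additional run catches some of the ones that were missed.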
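
The skipping is easy to reproduce on a tiny list. The following is a minimal sketch with made-up sample names (they are not from the original file); it records which items the loop actually visits:

```python
# Hypothetical sample data, chosen so that bad items sit next to each other.
names = ["AB", "CB", "DE", "FB", "GB"]
visited = []

for name in names:
    visited.append(name)
    if "B" in name:
        # Removing shifts every later item one position to the left,
        # so the item that followed this one is never visited.
        names.remove(name)

print(visited)  # ['AB', 'DE', 'FB']
print(names)    # ['CB', 'DE', 'GB']
```

"CB" and "GB" both contain the bad letter but survive the pass, because each one came directly after an item that was removed.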

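There are many ways to fix this. In the example below, we keep track of the indices of the lines to remove and delete them in a separate loop, so that the indices do not change while we are still scanning the list.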

with open(fname) as f:
    lines = f.read().splitlines()

bad_letters = ['B', 'C', 'F', 'G', 'H', 'J', 'L', 'O', 'P', 'Q', 'U', 'W', 'X']

def clean(callsigns, bad):
    to_remove = []
    for line_i, line in enumerate(callsigns):
        for b in bad:
            if b in line:
                # We're removing this line, take note of its index.
                to_remove.append(line_i)
                break
    # Remove the lines in a second step, in reverse order so that the
    # indices still to be deleted stay valid.
    for r in reversed(to_remove):
        del callsigns[r]

    return callsigns

# A single call is enough now; repeated passes are no longer needed.
lines = clean(lines, bad_letters)
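
Another common way to keep the remove-based style is to iterate over a shallow copy of the list while removing from the original. This is only a sketch under that assumption: the function name clean_in_place and the sample call signs are made up for illustration, and list.remove makes it quadratic in the worst case, which should still be acceptable for about 13,000 lines.

```python
bad_letters = ['B', 'C', 'F', 'G', 'H', 'J', 'L', 'O', 'P', 'Q', 'U', 'W', 'X']

def clean_in_place(callsigns, bad):
    # callsigns[:] is a shallow copy, so removing items from the original
    # list does not disturb the sequence being iterated over.
    for line in callsigns[:]:
        if any(character in bad for character in line):
            callsigns.remove(line)
    return callsigns

# Hypothetical sample data, not from the original file.
sample = ["K1ABC", "N7DE", "W1AW", "KD9ZZ"]
print(clean_in_place(sample, bad_letters))  # ['N7DE', 'KD9ZZ']
```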

Answer 2

Keep the names you want to retain in a separate list instead of removing from the original. Perhaps like this:

with open(fname) as f:
    lines = f.read().splitlines()

bad_letters = ['B', 'C', 'F', 'G', 'H', 'J', 'L', 'O', 'P', 'Q', 'U', 'W', 'X']

def clean(callsigns, bad):
    valid = [i for i in callsigns if not any(j in i for j in bad)]
    return valid

valid_names = clean(lines,bad_letters)

print (len(valid_names))
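
A possible variation on the same idea, sketched here as an assumption rather than as part of the answer above, is to turn the bad letters into a set and use set.isdisjoint, which accepts any iterable (including a string) and is true only when the line shares no character with the set. The name clean_with_set and the sample call signs are hypothetical.

```python
bad_letters = ['B', 'C', 'F', 'G', 'H', 'J', 'L', 'O', 'P', 'Q', 'U', 'W', 'X']

def clean_with_set(callsigns, bad):
    bad_set = set(bad)
    # Keep a line only if none of its characters is in bad_set.
    return [line for line in callsigns if bad_set.isdisjoint(line)]

# Hypothetical sample data, not from the original file.
sample = ["K1ABC", "N7DE", "W1AW", "KD9ZZ"]
valid_sample = clean_with_set(sample, bad_letters)
print(valid_sample)  # ['N7DE', 'KD9ZZ']
```

Like the list comprehension above, this builds a new list and leaves the original untouched, and the comparison stays case-sensitive.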

Python loop is missing results
