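I am reading a file containing about 13,000 names into a list.
Then I look at every character of every item in that list, and if any character matches, I remove that line from the list of 13,000.
If I run it once, it removes about half of the list. By the 11th run it seems to get the list down to 9%. Why is this script missing results? Why do successive runs catch them?
Using Python 3.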
with open(fname) as f:
    lines = f.read().splitlines()

bad_letters = ['B', 'C', 'F', 'G', 'H', 'J', 'L', 'O', 'P', 'Q', 'U', 'W', 'X']

def clean(callsigns, bad):
    removeline = 0
    for line in callsigns:
        for character in line:
            if character in bad:
                removeline = 1
        if removeline == 1:
            lines.remove(line)
            removeline = 0
    return callsigns

for x in range(0, 11):
    lines = clean(lines, bad_letters)

print(len(lines))
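Answer 0 (score: 3)
You are mutating (i.e., changing) the lines list while you are looping (i.e., iterating) over it. This is never a good idea, because it means you are changing something while you are reading it. When an item is removed, every later item shifts one position to the left, but the loop still advances to the next index, so the item that slid into the vacated slot is never examined. That is why lines get skipped instead of being removed on the first pass.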
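The skipping is easy to reproduce on a tiny list. The sketch below uses made-up sample names (not the asker's data) in which every item contains a bad letter, so a correct loop would leave the list empty:

```python
# Hypothetical sample data: every item contains the bad letter 'B'.
names = ['B1', 'B2', 'B3', 'B4']

for name in names:
    if 'B' in name:
        # Removing shifts the later items one slot to the left, but the
        # loop's internal index still moves forward, so the item that
        # slid into the current slot is never visited.
        names.remove(name)

print(names)  # ['B2', 'B4']
```

Half of the matching items survive a single pass, which is consistent with the roughly 50% drop reported in the question.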
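There are many ways to fix this. In the example below, we keep track of the indices of the lines to remove and delete them in a separate loop, so the indices don't change while we are still scanning.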
with open(fname) as f:
    lines = f.read().splitlines()

bad_letters = ['B', 'C', 'F', 'G', 'H', 'J', 'L', 'O', 'P', 'Q', 'U', 'W', 'X']

def clean(callsigns, bad):
    to_remove = []
    for line_i, line in enumerate(callsigns):
        for b in bad:
            if b in line:
                # We're removing this line, take note of it.
                to_remove.append(line_i)
                break
    # Remove the lines in a second step. Reverse it so the indices don't change.
    for r in reversed(to_remove):
        del callsigns[r]
    return callsigns

# A single pass is now enough; repeated runs are no longer needed.
lines = clean(lines, bad_letters)
print(len(lines))
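Another of the many possible fixes, sketched here on the assumption that the list should still be modified in place: iterate over a copy of the list and remove from the original. The function name clean_in_place and the sample call signs are hypothetical and only serve the illustration.

```python
bad_letters = ['B', 'C', 'F', 'G', 'H', 'J', 'L', 'O', 'P', 'Q', 'U', 'W', 'X']

def clean_in_place(callsigns, bad):
    # list(callsigns) is a shallow copy, so removing items from the
    # original list cannot disturb the loop.
    for line in list(callsigns):
        if any(letter in line for letter in bad):
            callsigns.remove(line)
    return callsigns

# Hypothetical sample data, not taken from the asker's file.
sample = ['KB1ABC', 'N1ZZZ', 'W2XYZ', 'K9DMD', 'A1B']
clean_in_place(sample, bad_letters)
print(sample)  # ['N1ZZZ', 'K9DMD']
```

Note that list.remove searches the list from the start on every call, so this variant does more work than the index-based version on long lists; for roughly 13,000 names that is unlikely to matter.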
Answer 1 (score: 1)
Save the names you want to keep in a separate list. Perhaps like this:
with open(fname) as f:
    lines = f.read().splitlines()

bad_letters = ['B', 'C', 'F', 'G', 'H', 'J', 'L', 'O', 'P', 'Q', 'U', 'W', 'X']

def clean(callsigns, bad):
    valid = [i for i in callsigns if not any(j in i for j in bad)]
    return valid

valid_names = clean(lines, bad_letters)
print(len(valid_names))
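The same keep-list idea can be tried without the input file. The sketch below is a variation rather than the answer's exact code: it assumes the bad letters are stored in a set and uses set.isdisjoint to test each name, and the sample call signs are invented for the illustration.

```python
# Same letters as bad_letters above, stored as a set (an assumption
# made for this sketch).
bad_letters = set('BCFGHJLOPQUWX')

def clean(callsigns, bad):
    # Keep a name only if it shares no character with the bad set.
    return [name for name in callsigns if bad.isdisjoint(name)]

# Hypothetical sample data.
sample = ['KB1ABC', 'N1ZZZ', 'W2XYZ', 'K9DMD']
valid_names = clean(sample, bad_letters)
print(valid_names)  # ['N1ZZZ', 'K9DMD']
print(len(sample))  # 4
```

Because a new list is built, the original list is left untouched, so nothing is mutated while it is being iterated.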