Question

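I have this code that is supposed to remove every word shorter than 4 characters from a list, but it only removes some of them (I'm not sure which ones), and definitely not all: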

#load in the words from the original text file
def load_words():
    with open('words_alpha.txt') as word_file:
        valid_words = [word_file.read().split()]

    return valid_words


english_words = load_words()
print("loading...")

print(len(english_words[0]))
#remove words under 4 letters
for word in english_words[0]:
    if len(word) < 4:
        english_words[0].remove(word)

print("done")
print(len(english_words[0]))

#save the remaining words to a new text file
new_words = open("english_words_v3.txt","w")
for word in english_words[0]:
    new_words.write(word)
    new_words.write("\n")

new_words.close()

It outputs the following:

loading...
370103
done
367945

words_alpha.txt contains 67000 English words.

Answer 1

Try using a list comprehension, along these lines:

english_words[0] = [word for word in english_words[0] if len(word) >= 4]

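The problem in your script is that you are modifying the list while iterating over it. You could also avoid this by creating a new list and filling it, but a list comprehension is ideal for this case.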
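To see why some short words survive, here is a minimal sketch on a small made-up list (the sample words and the helper names `remove_short_in_place` and `keep_long_words` are illustrative, not from the question). Each `remove` shifts the remaining items one position left, so the loop steps over the element that slid into the slot it just visited.

```python
def remove_short_in_place(words, min_len=4):
    """Deliberately buggy: mutates the list it is iterating over."""
    for word in words:
        if len(word) < min_len:
            words.remove(word)  # shifts later items left; the next one is skipped
    return words


def keep_long_words(words, min_len=4):
    """Build and fill a new list instead of mutating the one being iterated."""
    kept = []
    for word in words:
        if len(word) >= min_len:
            kept.append(word)
    return kept


sample = ["a", "ab", "abc", "abcd", "to", "of", "python"]

buggy = remove_short_in_place(list(sample))  # work on a copy so sample stays intact
fixed = keep_long_words(sample)

print(buggy)  # ['ab', 'abcd', 'of', 'python'] -- "ab" and "of" were skipped
print(fixed)  # ['abcd', 'python']
```

The second helper is the "new list" approach mentioned above; the list comprehension is the same idea written in one line.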

Answer 2

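You want to iterate over a copy of english_words[0], which you can make with english_words[0][:]. Right now you are iterating over the same list you are modifying, which causes the erratic behavior. So the for loop would look like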

for word in english_words[0][:]:
    if len(word) < 4:
        english_words[0].remove(word)
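A small sketch of why the slice helps, using a made-up word list (the names `words` and `snapshot` are illustrative): `words[:]` creates a new list object with the same items, so removals from `words` do not disturb the iteration.

```python
words = ["a", "ab", "abc", "abcd", "to", "of", "python"]

# The slice is a shallow copy: equal contents, but a different list object.
snapshot = words[:]
is_separate_copy = snapshot == words and snapshot is not words

# Iterate over the copy, mutate the original.
for word in snapshot:
    if len(word) < 4:
        words.remove(word)

print(is_separate_copy)  # True
print(words)             # ['abcd', 'python']
```

Note that each `list.remove` call scans the list from the start, so on a file with hundreds of thousands of words this loop does far more work than a single-pass list comprehension.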

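You can also simplify the first for loop with a list comprehension, and there is no need to wrap word_file.read().split() in a list, because it already returns a list.

So your code would look like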

#load in the words from the original text file
def load_words():
    with open('words_alpha.txt') as word_file:
        #No need to wrap this into a list since it already returns a list
        valid_words = word_file.read().split()

    return valid_words

english_words = load_words()

#remove words under 4 letters using list comprehension
english_words = [word for word in english_words if len(word) >= 4]

print("done")
print(len(english_words))

#save the remaining words to a new text file
new_words = open("english_words_v3.txt","w")
for word in english_words:
    new_words.write(word)
    new_words.write("\n")

new_words.close()
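As a quick illustration of the point about `split()`, the sketch below uses an in-memory string in place of the file contents (the `text` value is made up). It shows that `str.split()` already returns a flat list, while wrapping it in brackets produces a one-element list that has to be indexed with `[0]`, as in the original script.

```python
# Stand-in for word_file.read(); the real file is not needed to show the shape.
text = "a\nab\nabcd\npython\n"

words = text.split()      # flat list of words
wrapped = [text.split()]  # list containing a single list of words

print(type(words).__name__, len(words))  # list 4
print(len(wrapped), len(wrapped[0]))     # 1 4

# With the flat list, the filter needs no [0] indexing.
long_words = [word for word in words if len(word) >= 4]
print(long_words)  # ['abcd', 'python']
```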
