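I have this code that is supposed to remove every word shorter than 4 characters from a list, but it only removes some of them (I'm not sure which ones), and definitely not all: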
#load in the words from the original text file
def load_words():
    with open('words_alpha.txt') as word_file:
        valid_words = [word_file.read().split()]
    return valid_words
english_words = load_words()
print("loading...")
print(len(english_words[0]))
#remove words under 4 letters
for word in english_words[0]:
    if len(word) < 4:
        english_words[0].remove(word)
print("done")
print(len(english_words[0]))
#save the remaining words to a new text file
new_words = open("english_words_v3.txt","w")
for word in english_words[0]:
    new_words.write(word)
    new_words.write("\n")
new_words.close()
It outputs the following:
loading...
370103
done
367945
words_alpha.txt contains 67,000 English words.
Answer 0 (score: 0)
Try using a list comprehension.
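The problem in your script is that you modify the list while you are iterating over it. You could also avoid this by creating and populating a new list, but a list comprehension is ideal for this case.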
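As a minimal sketch of both options mentioned above, the following builds a new list instead of removing items in place. The function names `keep_long_words` and `keep_long_words_loop`, the `min_length` parameter, and the sample data are assumptions made for illustration; they are not part of the original script.

```python
def keep_long_words(words, min_length=4):
    """Return a new list with only the words of at least min_length characters."""
    return [word for word in words if len(word) >= min_length]


def keep_long_words_loop(words, min_length=4):
    """Same result, built by appending to a fresh list instead of removing in place."""
    kept = []
    for word in words:
        if len(word) >= min_length:
            kept.append(word)
    return kept


# Hypothetical sample data standing in for the real word list.
sample = ["a", "an", "the", "word", "list", "of", "python"]
print(keep_long_words(sample))       # ['word', 'list', 'python']
print(keep_long_words_loop(sample))  # ['word', 'list', 'python']
```

Neither version touches the list being iterated, so no element can be skipped. In the original script the equivalent call would be applied to `english_words[0]`.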
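Answer 1 (score: 0)
You want to iterate over a copy of english_words[0], which you can make with english_words[0][:]. As written, you are iterating over the same list you are modifying, which causes the erratic behavior. So the for loop would look like: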
for word in english_words[0][:]:
    if len(word) < 4:
        english_words[0].remove(word)
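To make the skipping visible, here is a small sketch that runs both loops on a short list. The sample words are made up for illustration. When an item is removed, every later item shifts one position to the left, so the loop's next step jumps past the word that slid into the freed slot.

```python
# Hypothetical sample data; short words are placed next to each other on purpose.
words = ["a", "an", "the", "word", "of", "it", "python"]

# Buggy version: iterate over the same list that is being modified.
buggy = list(words)
for word in buggy:
    if len(word) < 4:
        buggy.remove(word)
print(buggy)  # ['an', 'word', 'it', 'python']

# Fixed version: iterate over a slice copy, remove from the original.
fixed = list(words)
for word in fixed[:]:
    if len(word) < 4:
        fixed.remove(word)
print(fixed)  # ['word', 'python']
```

Note that `list.remove` searches the list from the start on every call, so on a list with hundreds of thousands of entries the copy-and-remove loop is correct but noticeably slower than building a new list.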
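You can also simplify the first for loop with a list comprehension, and you don't need to wrap word_file.read().split() in a list, because it already returns a list.
So your code would look like: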
#load in the words from the original text file
def load_words():
    with open('words_alpha.txt') as word_file:
        #No need to wrap this into a list since it already returns a list
        valid_words = word_file.read().split()
    return valid_words
english_words = load_words()
#remove words under 4 letters using list comprehension
english_words = [word for word in english_words if len(word) >= 4]
print("done")
print(len(english_words))
#save the remaining words to a new text file
new_words = open("english_words_v3.txt","w")
for word in english_words:
    new_words.write(word)
    new_words.write("\n")
new_words.close()
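As a variation on the script above, the load, filter, and save steps could be folded into one function that uses `with` for both files, so they are closed even if an error occurs. This is only a sketch: `filter_word_file`, its parameters, and the temporary file names are assumptions, and the demonstration runs on a small temporary file rather than on words_alpha.txt.

```python
import os
import tempfile


def filter_word_file(source_path, target_path, min_length=4):
    """Copy words of at least min_length characters to target_path, one per line.

    Returns the number of words written.
    """
    kept = 0
    with open(source_path) as source, open(target_path, "w") as target:
        for line in source:
            for word in line.split():
                if len(word) >= min_length:
                    target.write(word + "\n")
                    kept += 1
    return kept


# Small demonstration on a temporary file (hypothetical sample content).
with tempfile.TemporaryDirectory() as tmp_dir:
    source_path = os.path.join(tmp_dir, "words_sample.txt")
    target_path = os.path.join(tmp_dir, "words_filtered.txt")
    with open(source_path, "w") as f:
        f.write("a\nan\nthe\nword\nlist\npython\n")
    kept_count = filter_word_file(source_path, target_path)
    with open(target_path) as f:
        filtered = f.read().split()

print(kept_count)  # 3
print(filtered)    # ['word', 'list', 'python']
```

For the files in the question, the call would be `filter_word_file("words_alpha.txt", "english_words_v3.txt")`.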