import re
twovowels=re.compile(r".*[aeiou].*[aeiou].*", re.I)
nonword=re.compile(r"\W+", re.U)
text_file = open("twoVoweledWordList.txt", "w")
file = open("FirstMondayArticle.html","r")
for line in file:
    for word in nonword.split(line):
        if twovowels.match(word): print word
        text_file.write('\n' + word)
text_file.close()
file.close()
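This is my Python code. I am trying to print only the words that have two or more vowels. When I run it, everything (including words with no vowels, and numbers) gets written to my text file, but the Python shell shows only the words that contain two or more vowels. How can I change this?
Answer 0 (score: 1)
You can use str.translate to remove the vowels and compare lengths. If the word is more than one character shorter after the vowels are removed, it has at least two vowels: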
with open("FirstMondayArticle.html") as f, open("twoVoweledWordList.txt", "w") as out:
    for line in f:
        for word in line.split():
            # Python 2 signature: translate(None, chars) deletes the listed characters
            if len(word) - len(word.lower().translate(None,"aeiou")) > 1:
                out.write("{}\n".format(word.rstrip()))
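The translate(None, "aeiou") call above only works on Python 2 strings. As a rough sketch of the same length-comparison idea on Python 3, the deletion table can be built with str.maketrans. The names VOWEL_DELETER and has_two_vowels and the sample word list are made up for illustration and are not from the original answer:

```python
# Python 3 sketch: str.translate takes a table built by str.maketrans.
# The third argument to maketrans lists the characters to delete.
VOWEL_DELETER = str.maketrans("", "", "aeiou")  # hypothetical name

def has_two_vowels(word):
    """Return True if `word` contains at least two of a, e, i, o, u."""
    lowered = word.lower()
    # Each deleted vowel shortens the string by one character.
    return len(lowered) - len(lowered.translate(VOWEL_DELETER)) > 1

# Made-up sample words instead of the HTML file.
words = ["sky", "cat", "tree", "Audio", "2015"]
two_voweled = [w for w in words if has_two_vowels(w)]
```

The file loop stays the same; only the test inside the if changes.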
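In your own code, you always write the word, because text_file.write('\n' + word) is outside the if block. This is a good lesson in why you should not put multiple statements on one line. Your code is equivalent to: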
if twovowels.match(word):
    print(word)
text_file.write('\n' + word) # <- outside the if
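To see the effect in isolation, here is a small sketch that swaps print and the file write for two lists, so the difference is easy to inspect. The function collect and the sample inputs are hypothetical and only illustrate the indentation issue:

```python
def collect(words, keep):
    """Mimic the question's loop body, recording what would be printed and written."""
    printed, written = [], []
    for word in words:
        if keep(word): printed.append(word)  # only this statement belongs to the if
        written.append(word)                 # same level as the if, so it always runs
    return printed, written

# "sky" fails the condition, yet it still ends up in `written`.
printed, written = collect(["sky", "tree"], lambda w: "e" in w)
```

Here printed holds only the word that passed the test, while written holds every word, which matches what the shell and the text file showed.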
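Here is your code with the write moved inside the if, a few changes to your naming conventions, some spaces around the assignments, and with used to close the files for you: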
import re
with open("FirstMondayArticle.html") as f, open("twoVoweledWordList.txt", "w") as out:
    two_vowels = re.compile(r".*[aeiou].*[aeiou].*", re.I)
    non_word = re.compile(r"\W+", re.U)
    for line in f:
        for word in non_word.split(line):
            if two_vowels.match(word):
                print(word)
                out.write("{}\n".format(word.rstrip()))
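If you want to check the matching logic without touching the files, one option is to wrap the same two patterns in a function that takes a string. This is only a sketch: the function name two_voweled_words and the sample sentence are invented for the example.

```python
import re

two_vowels = re.compile(r".*[aeiou].*[aeiou].*", re.I)
non_word = re.compile(r"\W+", re.U)

def two_voweled_words(text):
    """Return the words in `text` that contain at least two vowels."""
    # split() can yield empty strings at the edges; they never match the pattern.
    return [word for word in non_word.split(text) if two_vowels.match(word)]

# Made-up sample line standing in for one line of the HTML file.
result = two_voweled_words("The sky is blue, 42 apples!")
```

Words with fewer than two vowels and the number 42 are filtered out, so only the matching words would reach the output file.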
Answer 1 (score: 0)
Because the write is outside the if condition. This is how the lines of code should look:
for line in file:
    for word in nonword.split(line):
        if twovowels.match(word):
            print word
            text_file.write('\n' + word)
text_file.close()
file.close()
Here is a sample program on Tutorialspoint showing that the code above is correct.
Answer 2 (score: 0)
I suggest another, simpler approach that does not use re:
def twovowels(word):
    count = 0
    for char in word.lower():
        if char in "aeiou":
            count = count + 1
            if count > 1:
                return True
    return False
# The backslash continues the with statement onto the next line
with open("FirstMondayArticle.html") as file, \
     open("twoVoweledWordList.txt", "w") as text_file:
    for line in file:
        for word in line.split():
            if twovowels(word):
                print word
                text_file.write(word + "\n")
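The counting loop can also be written more compactly with sum over a generator. This is a sketch of an alternative, not the answer's own code, and the name count_two_vowels is made up. Unlike the loop above it does not stop early at the second vowel, which hardly matters for single words:

```python
def count_two_vowels(word):
    """Return True if `word` contains at least two of a, e, i, o, u."""
    # Each membership test is True (1) or False (0), so sum() counts the vowels.
    return sum(char in "aeiou" for char in word.lower()) >= 2

# Made-up sample words for a quick check.
checks = {w: count_two_vowels(w) for w in ["Audio", "tree", "cat", "sky"]}
```

It can be dropped into the same with block in place of twovowels(word).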