Question

import re
twovowels=re.compile(r".*[aeiou].*[aeiou].*", re.I)
nonword=re.compile(r"\W+", re.U)
text_file = open("twoVoweledWordList.txt", "w")
file = open("FirstMondayArticle.html","r")
for line in file:
    for word in nonword.split(line):
        if twovowels.match(word): print word
        text_file.write('\n' + word)
text_file.close()

file.close()

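This is my Python code. I am trying to print only the words that have two or more vowels. When I run it, everything (including numbers and words with no vowels) gets written to my text file, yet the Python shell shows only the words with two or more vowels. How can I change this?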

Answer 1

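You can use str.translate to remove the vowels and compare the lengths. If the length difference after removing the letters is > 1, the word has at least two vowels: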

with open("FirstMondayArticle.html") as f, open("twoVoweledWordList.txt", "w") as out:
    for line in f:
        for word in line.split():
            # Python 2 str.translate: the second argument lists the characters to delete
            if len(word) - len(word.lower().translate(None, "aeiou")) > 1:
                out.write("{}\n".format(word.rstrip()))
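The two-argument translate(None, "aeiou") call above only works on Python 2 byte strings. As a minimal sketch for Python 3, assuming a helper name (has_two_vowels) and a table name (DELETE_VOWELS) that are not part of the original answer, the same length comparison can be written with str.maketrans:

```python
# Translation table that deletes the lowercase vowels (third argument = characters to remove).
DELETE_VOWELS = str.maketrans("", "", "aeiou")


def has_two_vowels(word):
    """Return True if removing the vowels shortens the word by at least two characters."""
    lowered = word.lower()
    return len(lowered) - len(lowered.translate(DELETE_VOWELS)) > 1


for candidate in ["tree", "sky", "AUDIO", "42"]:
    print(candidate, has_two_vowels(candidate))
```

The check can then replace the condition inside the file loop; the rest of the loop stays the same.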

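In your own code, you always write the word, because text_file.write('\n' + word) is outside the if block. It is a good lesson in why you should not put multiple statements on one line. Your code is equivalent to: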

    if twovowels.match(word):
        print(word)
    text_file.write('\n' + word)  # <- outside the if
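The effect can be reproduced without touching any files. This is a minimal Python 3 sketch, assuming a hypothetical word list and two illustrative helper names (collect_outside_if, collect_inside_if) that do not appear in the original post; each helper collects the words that the corresponding layout would write:

```python
import re

two_vowels = re.compile(r".*[aeiou].*[aeiou].*", re.I)


def collect_outside_if(words):
    """Mimic the question's layout: the write sits after the one-line if."""
    written = []
    for word in words:
        if two_vowels.match(word): pass  # stands in for the one-line print
        written.append(word)             # outside the if: runs for every word
    return written


def collect_inside_if(words):
    """Mimic the corrected layout: the write sits inside the if block."""
    written = []
    for word in words:
        if two_vowels.match(word):
            written.append(word)         # inside the if: only matching words
    return written


sample = ["sky", "tree", "42", "audio"]  # hypothetical input
print(collect_outside_if(sample))
print(collect_inside_if(sample))
```

With this sample, the first helper returns all four items and the second returns only "tree" and "audio", which matches the behaviour described in the question.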

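Here is your code with the if in the right place, some changes to your naming convention, some spaces added around the assignments, and with used to close the files for you: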

import re
with open("FirstMondayArticle.html") as f, open("twoVoweledWordList.txt", "w") as out:
    two_vowels = re.compile(r".*[aeiou].*[aeiou].*", re.I)
    non_word = re.compile(r"\W+", re.U)
    for line in f:
        for word in non_word.split(line):
            if two_vowels.match(word):
                print(word)
                out.write("{}\n".format(word.rstrip()))

Answer 2

Because the write is outside the if condition. This is how the lines of code should look:

for line in file:
    for word in nonword.split(line):
        if twovowels.match(word):
            print word
            text_file.write('\n' + word)
text_file.close()

file.close()

Here is a sample program on Tutorialspoint showing that the code above is correct.

Answer 3

I would suggest another, simpler approach that does not use re:

def twovowels(word):
    count = 0
    for char in word.lower():
        if char in "aeiou":
            count = count + 1
            if count > 1:
                return True
    return False

with open("FirstMondayArticle.html") as file, \
        open("twoVoweledWordList.txt", "w") as text_file:
    for line in file:
        for word in line.split():
            if twovowels(word):
                print word
                text_file.write(word + "\n")
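The counting loop can also be expressed more compactly. The following is a hedged Python 3 sketch rather than the answerer's code: the names count_vowels and words_with_two_vowels are assumptions, and it works on an in-memory string instead of the files so it can be tried directly:

```python
VOWELS = set("aeiou")


def count_vowels(word):
    """Count the vowels in a word, ignoring case."""
    return sum(1 for char in word.lower() if char in VOWELS)


def words_with_two_vowels(text):
    """Return the whitespace-separated words that contain at least two vowels."""
    return [word for word in text.split() if count_vowels(word) >= 2]


print(words_with_two_vowels("The quick brown fox reads 42 books"))
```

Unlike the loop in twovowels, this version does not stop early once two vowels are found; for ordinary words the difference is negligible, and the early exit remains the better choice for very long tokens.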

Trying to print only words with two or more vowels to a text file
