Trying to print only words containing two or more vowels to a text file

Time: 2015-03-25 19:58:23

Tags: python regex

import re
twovowels=re.compile(r".*[aeiou].*[aeiou].*", re.I)
nonword=re.compile(r"\W+", re.U)
text_file = open("twoVoweledWordList.txt", "w")
file = open("FirstMondayArticle.html","r")
for line in file:
    for word in nonword.split(line):
        if twovowels.match(word): print word
        text_file.write('\n' + word)
text_file.close()

file.close()

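This is my Python code. I am trying to print only the words that have two or more vowels. When I run it, everything (including words with no vowels, and numbers) is written to my text file, yet the Python shell shows only the words that contain two or more vowels. How do I change this?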

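3 answers:

Answer 0: (score: 1)

You can use str.translate to delete the vowels and compare lengths. If the length drops by more than 1 after the vowels are removed, the word has at least two vowels: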

with open("FirstMondayArticle.html") as f, open("twoVoweledWordList.txt", "w") as out:
    for line in f:
        for word in line.split():
            # Python 2 str.translate: the second argument lists the characters to delete
            if len(word) - len(word.lower().translate(None, "aeiou")) > 1:
                out.write("{}\n".format(word.rstrip()))
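
The translate(None, "aeiou") call above relies on the Python 2 signature of str.translate. As a minimal sketch of the same length-comparison idea on Python 3, a deletion table can be built with str.maketrans. The names VOWEL_TABLE and has_two_vowels are illustrative and do not come from the original answer:

```python
# Three-argument str.maketrans: every character in the third argument is deleted.
VOWEL_TABLE = str.maketrans("", "", "aeiou")


def has_two_vowels(word):
    """Return True if removing the vowels shortens the word by at least 2."""
    lowered = word.lower()
    return len(lowered) - len(lowered.translate(VOWEL_TABLE)) > 1


print([w for w in ["Audio", "sky", "tree", "cat", "42"] if has_two_vowels(w)])
```

Both lengths are taken from the lowercased word here, so the comparison stays consistent even if lowercasing changes the length of an unusual Unicode string.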

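In your own code, you always write the word, because text_file.write('\n' + word) sits outside the if block. This is a good lesson in why you should not put multiple statements on one line. Your code is equivalent to: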

if twovowels.match(word):
    print(word)
text_file.write('\n' + word)  # <- outside the if
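
The effect of that placement can be reproduced without any files. The sketch below is an illustration only: it uses io.StringIO as a stand-in for the output file, and the helper names write_unconditionally and write_inside_if are made up for the comparison.

```python
import io
import re

two_vowels = re.compile(r".*[aeiou].*[aeiou].*", re.I)


def write_unconditionally(words):
    """Mirrors the question's layout: the write runs for every word."""
    out = io.StringIO()
    for word in words:
        if two_vowels.match(word): pass  # stands in for the one-line print
        out.write('\n' + word)           # not part of the if
    return out.getvalue().split()


def write_inside_if(words):
    """The write is indented under the if, so only matching words are written."""
    out = io.StringIO()
    for word in words:
        if two_vowels.match(word):
            out.write('\n' + word)
    return out.getvalue().split()


words = ["tree", "sky", "42", "audio"]
print(write_unconditionally(words))
print(write_inside_if(words))
```

The first function writes all four words, which matches the behaviour described in the question, while the second writes only the words with two or more vowels.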

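Here is your code with the if in the right place, some changes to your naming conventions, some spaces added around assignments, and with used to close the files for you: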

import re
with open("FirstMondayArticle.html") as f, open("twoVoweledWordList.txt", "w") as out:
    two_vowels = re.compile(r".*[aeiou].*[aeiou].*", re.I)
    non_word = re.compile(r"\W+", re.U)
    for line in f:
        for word in non_word.split(line):
            if two_vowels.match(word):
                print(word)
                out.write("{}\n".format(word.rstrip()))  
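
One possible simplification, which is not part of the original answer: the leading and trailing .* are only needed because match() anchors at the start of the string. With search(), a shorter pattern should select the same words. The name has_two_vowels and the sample words are assumptions made for this sketch:

```python
import re

# Pattern from the answer above, used with match().
original = re.compile(r".*[aeiou].*[aeiou].*", re.I)
# Shorter pattern for this sketch, used with search().
shorter = re.compile(r"[aeiou].*[aeiou]", re.I)


def has_two_vowels(word):
    """Return True if the word contains at least two vowels."""
    return shorter.search(word) is not None


# The empty string is included because splitting on \W+ can produce it.
for word in ["tree", "sky", "42", "Audio", "a", ""]:
    assert bool(original.match(word)) == has_two_vowels(word)
```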

Answer 1: (score: 0)

Because the write is outside the if condition. The lines should look like this:

for line in file:
    for word in nonword.split(line):
        if twovowels.match(word):
            print word
            text_file.write('\n' + word)
text_file.close()

file.close()

Here is a sample program on Tutorialspoint showing that the code above is correct.

Answer 2: (score: 0)

I would suggest another, simpler approach that does not use re:

def twovowels(word):
    count = 0
    for char in word.lower():
        if char in "aeiou":
            count = count + 1
            if count > 1:
                return True
    return False

with open("FirstMondayArticle.html") as file, \
        open("twoVoweledWordList.txt", "w") as text_file:
    for line in file:
        for word in line.split():
            if twovowels(word):
                print(word)
                text_file.write(word + "\n")
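
For a quick check without the HTML file, the same counting idea can be written more compactly and run on an in-memory sentence. This is a sketch rather than the answer's own code: count_vowels, two_vowels, and the sample sentence are illustrative, and this version gives up the early return of the loop above in exchange for brevity.

```python
def count_vowels(word):
    """Count the characters of word that are vowels, ignoring case."""
    return sum(1 for char in word.lower() if char in "aeiou")


def two_vowels(word):
    """Return True if word contains two or more vowels."""
    return count_vowels(word) >= 2


sample = "The quick brown fox jumps over a lazy dog"
selected = [w for w in sample.split() if two_vowels(w)]
print(selected)  # ['quick', 'over']
```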