Question

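I have the following Python script, which finds words containing two or more vowels and writes the results to a txt file. The script runs, but the output file is empty. I have tried several different approaches to no avail. Any idea why the output file is blank? I am importing re so that I can use regular expressions on the input.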

#!C:\Python33\python.exe

import re

file = open("Text of Steve Jobs' Commencement address (2005).htm");
output = open('twoVoweledWordList.txt', 'w');

for word in file.read():
   if len(re.findall('[aeiouy]', word)) >= 2:
      match == True;
      while True :
        output.write(word, '\n');

        file.close()
        output.close()

Answer 1

You asked for a better way to read one word at a time. Here you go:

with open(input_file_name, "rt") as f:
    for line in f:
        for word in line.split():
            # do something with each word here
            pass

Notes:

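In general, I try to avoid using Python built-ins as variable names. Since file is a built-in in Python 2.x, syntax-highlighting text editors will mark it in a different color. You could simply use f as the variable name instead.
It is best to use a with statement. It is very clear, and in all versions of Python it makes sure your file is properly closed when you are done with it. (It does not matter much here, but it really is the best practice.)
open() returns an object that you can use in a for loop. You will get one line of input from the file at a time.
line.split() splits the line into words using any "white space" (spaces, tabs, and so on).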
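
To make the last two notes concrete, here is a small sketch contrasting iteration over a string (which is what file.read() returns) with iteration over the result of split(). The sample text is invented for illustration and simply stands in for the file's contents.

```python
# Sample text invented for illustration; it stands in for the file's contents.
text = "Stay hungry.\tStay foolish.\n"

# Iterating over a string yields single characters, not words.
chars = [ch for ch in text][:4]

# str.split() with no arguments splits on any run of whitespace
# (spaces, tabs, newlines) and drops empty strings.
words = text.split()

print(chars)  # ['S', 't', 'a', 'y']
print(words)  # ['Stay', 'hungry.', 'Stay', 'foolish.']
```

A single character can never contain two vowels, so a loop of the form `for word in file.read():` never passes a `>= 2` vowel test. That is most likely why the script in the question never gets as far as writing anything.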

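I don't know whether you have come across generator functions yet, but you can wrap the doubly nested for loop above in a generator function like this: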

def words(f):
    for line in f:
        for word in line.split():
            yield word

with open(input_file_name, "rt") as f:
    for word in words(f):
        # do something with word
        pass

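I like hiding machinery like this. If you ever need to make the word splitting more complicated, the complicated part stays nicely separated from the part that actually processes the words.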
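
As a rough sketch of that separation, the generator below swaps line.split() for a regular expression so that punctuation is not treated as part of a word. The pattern WORD_RE and its definition of a "word" (a run of ASCII letters and apostrophes) are assumptions made for this example, not something from the question. io.StringIO is used only so the example runs without a real file.

```python
import io
import re

# Assumption for this sketch: a "word" is a run of ASCII letters and apostrophes.
WORD_RE = re.compile(r"[A-Za-z']+")

def words(f):
    """Yield words from an iterable of lines, leaving punctuation out."""
    for line in f:
        for word in WORD_RE.findall(line):
            yield word

# io.StringIO stands in for an open text file here.
sample = io.StringIO("Stay hungry. Stay foolish!\nDon't settle.\n")
result = list(words(sample))
print(result)  # ['Stay', 'hungry', 'Stay', 'foolish', "Don't", 'settle']
```

The loop that consumes words(f) does not change at all. Only the generator knows how a line is turned into words.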

Answer 2

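When you use the with statement, you don't have to worry about closing the files explicitly. I believe y is not a vowel, so I removed it from the pattern in my answer.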

import re

with open("Input.txt") as inputFile, open("Output.txt", "w") as output:
    for line in inputFile:
        for word in line.split():
            if len(re.findall('[aeiou]', word)) >= 2:
                output.write(word + '\n')
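
One thing to be aware of: the pattern '[aeiou]' matches lowercase letters only, so a word such as "Apple" counts as having one vowel. The helper below is a minimal sketch of a case-insensitive count. The name count_vowels and its vowels parameter are made up for this example, and it assumes the vowels string contains plain letters only (no characters that are special inside a regex character class).

```python
import re

def count_vowels(word, vowels="aeiou"):
    """Count vowels in word, ignoring case.

    Pass vowels="aeiouy" to treat y as a vowel as well.
    Assumes `vowels` contains plain letters only.
    """
    return len(re.findall("[" + vowels + "]", word, flags=re.IGNORECASE))

print(count_vowels("Apple"))                  # 2
print(count_vowels("Only"))                   # 1
print(count_vowels("Only", vowels="aeiouy"))  # 2
```

In the loop above, the test would then read `if count_vowels(word) >= 2:`.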

Answer 3

Although Steve put it well, here is a version in case you prefer plain loops:

import re

file = open("Text of Steve Jobs' Commencement address (2005).htm")
output = open('twoVoweledWordList.txt', 'w')

for line in file:
    for word in line.split():
        if len(re.findall('[aeiouy]', word)) >= 2:
            output.write(word + '\n')

# Close both files so the output is flushed to disk.
file.close()
output.close()

Writing the results of a Python script to a txt file
