Question

I'm trying to build my first iterator, one that yields the words in a text:

import re

def words(text):
    regex = re.compile(r"""(\w(?:[\w']*\w)?|\S)""", re.VERBOSE)
    for line in text:
        words = regex.findall(line)
        if words:
            for word in words:
                yield word

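If I use only the line words = regex.findall(line), I get a list containing all the words, but if I use the function and call next() on it, it returns the text character by character.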

Any idea what I'm doing wrong?

Answer 1

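I believe you are passing a single string as text, since that is the only way you would get individual characters back: iterating over a string with for line in text yields one character at a time. Given that, I updated the code to accept a string (all I did was remove one of the loops):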

import re

def words(text):
    regex = re.compile(r"""(\w(?:[\w']*\w)?|\S)""", re.VERBOSE)
    words = regex.findall(text)
    for word in words:
        yield word

print(list(words("I like to test strings")))
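
If the function needs to accept either a single string or a list of lines, one possible approach is to normalize the input first. This is only a sketch: the names words_any and WORD_RE are hypothetical and not from the original post, and it uses re.finditer instead of re.findall so that matches are produced lazily.

```python
import re

# Same pattern as above, without the capturing group or re.VERBOSE
# (the pattern contains no whitespace, so the flag has no effect on it).
WORD_RE = re.compile(r"\w(?:[\w']*\w)?|\S")

def words_any(text):
    # Hypothetical helper: wrap a bare string in a list so the loop below
    # always iterates over lines rather than over single characters.
    lines = [text] if isinstance(text, str) else text
    for line in lines:
        for match in WORD_RE.finditer(line):
            yield match.group(0)

print(list(words_any("I like to test strings")))
print(list(words_any(["I like", "to test strings"])))
```

Both calls should produce the same list of words, whether the text arrives as one string or as separate lines.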

Answer 2

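Is text a list of strings? If it is a single string (even one containing newlines), that would explain the result: iterating over a string yields its characters, not its lines.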
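
To make the difference concrete, here is a small sketch that calls the question's function with a string and then with a list of strings. The sample input "to be" is an assumption chosen for illustration.

```python
import re

def words(text):
    # The question's generator, unchanged in behavior.
    regex = re.compile(r"""(\w(?:[\w']*\w)?|\S)""", re.VERBOSE)
    for line in text:
        for word in regex.findall(line):
            yield word

# A string is iterated character by character, so each "line" is one character.
print(list(words("to be")))    # ['t', 'o', 'b', 'e']

# A list of strings is iterated line by line, which is what the loop expects.
print(list(words(["to be"])))  # ['to', 'be']
```

The space disappears in the first call because the pattern matches only word characters and other non-whitespace characters.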

Iterator over words returns characters
