Question

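I'm trying to build a random sentence generator using a Markov chain, but I'm having trouble building a list of the words that follow each word in a file. The code I've been trying is: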

word_list = [spot+1 for spot in words if spot == word]

I've tried several variations, such as:

word_list = [words[spot+1] for spot in words if spot == word]

But every time, I get the error:

TypeError: Can't convert 'int' object to str implicitly

How do I correctly add the words that follow a given word to a list? I feel like there's an obvious solution that I'm just not seeing.

Answer 1

The trick is to iterate over pairs, rather than individual words:

words = ['the', 'enemy', 'of', 'my', 'enemy', 'is', 'my', 'friend']
word = 'my'

[next_word for this_word, next_word in zip(words, words[1:]) if this_word == word]

Result:

['enemy', 'friend']

This approach relies on Python's zip() function and slicing.

words[1:] is a copy of words with the first item left out:

>>> words[1:]
['enemy', 'of', 'my', 'enemy', 'is', 'my', 'friend']

...so that when you zip the original words with it, you get a list of pairs:

>>> list(zip(words, words[1:]))
[('the', 'enemy'),
 ('enemy', 'of'),
 ('of', 'my'),
 ('my', 'enemy'),
 ('enemy', 'is'),
 ('is', 'my'),
 ('my', 'friend')]
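
One detail worth noting: zip() stops as soon as its shortest argument runs out, so the last word, which has nothing after it, never appears as the first item of a pair. A small sketch with a made-up three-word list (the name `short` is only for illustration):

```python
short = ['a', 'b', 'c']  # hypothetical list, just for illustration

# short[1:] has one item fewer than short, so zip() yields len(short) - 1 pairs
pairs = list(zip(short, short[1:]))
print(pairs)  # [('a', 'b'), ('b', 'c')]

# An empty list is handled too: slicing and zip() simply produce nothing
print(list(zip([], [][1:])))  # []
```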

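Once you have that, your list comprehension only needs to return the second word of each pair whenever the first word is the one you're looking for: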

word = 'enemy'

[next_word for this_word, next_word in zip(words, words[1:]) if this_word == word]

Result:

['of', 'is']
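
For a Markov chain you will probably want the followers of every word at once rather than one word at a time. Here is a minimal sketch of one way to do that with the same pairing idea. The names `build_followers` and `followers` are made up for this example, and the text comes from a string rather than a file to keep it self-contained:

```python
import random
from collections import defaultdict

def build_followers(words):
    """Map each word to the list of words that directly follow it."""
    followers = defaultdict(list)
    for this_word, next_word in zip(words, words[1:]):
        followers[this_word].append(next_word)
    return followers

# Assumption: for a real file, the words could come from reading the
# file's contents and calling .split() on them in the same way.
text = "the enemy of my enemy is my friend"
words = text.split()

followers = build_followers(words)
print(followers['my'])     # ['enemy', 'friend']
print(followers['enemy'])  # ['of', 'is']

# One step of the chain: pick any recorded follower at random
next_word = random.choice(followers['my'])
```

Repeated followers are kept as duplicates in each list, so random.choice() picks them in proportion to how often they occur in the text.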
