How can I find a bigram in a list? For example, suppose I want to find
bigram = list(nltk.bigrams("New York"))
in the word list
words = nltk.corpus.brown.words(fileids=["ca44"])
I have tried
for t in bigram:
    if t in words:
        *do something*
as well as
if bigram in words:
    *do something*
Answer 0 (score: 2)
.bigrams() returns a generator of tuples. You should first convert the tuples to strings. For example:
bigram_strings = [''.join(t) for t in bigram]
Then you can do
for t in bigram_strings:
    if t in words:
        *do something*
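To see what each of these membership tests actually evaluates to, here is a minimal sketch that uses only the standard library. The short word list is made up for illustration and stands in for the Brown corpus words, and zip is used in place of nltk.bigrams so that the snippet runs without NLTK installed.

```python
# Hypothetical stand-in for nltk.corpus.brown.words(...): a flat list of strings.
words = ["He", "moved", "to", "New", "York", "last", "year"]

# Word-level bigram for the phrase, built with zip instead of nltk.bigrams.
phrase = "New York".split()
bigram = list(zip(phrase, phrase[1:]))

# A tuple never compares equal to a string, so no tuple is found in a list of strings.
tuple_hits = [t for t in bigram if t in words]

# Iterating over the string itself pairs up characters, not words.
char_pairs = list(zip("New York", "New York"[1:]))

# Joining a tuple gives one string ("NewYork" here), which matches
# only if that exact token occurs in the word list.
joined_hits = ["".join(t) in words for t in bigram]

print(bigram)
print(tuple_hits)
print(char_pairs[:2])
print(joined_hits)
```

The phrase has to be split into words before pairing; otherwise the pairs are built from single characters.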
Answer 1 (score: 1)
You can write a generator that produces the bigrams of your word list:
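def pairwise(iterable):
    """Iterate over consecutive pairs of an iterable."""
    # Assumes a re-iterable sequence (e.g. a list), not a one-shot iterator.
    i = iter(iterable)
    j = iter(iterable)
    next(j, None)  # advance the second iterator; the default avoids an error on empty input
    yield from zip(i, j)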
(For example, list(pairwise(["this", "is", "a", "test"])) returns [('this', 'is'), ('is', 'a'), ('a', 'test')].)
Then compare each pair against the result of .bigrams():
for pair in pairwise(words):
    # Split the phrase first; nltk.bigrams("New York") would pair up characters.
    for bigram in nltk.bigrams("New York".split()):
        if bigram == pair:
            pass  # found
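When only a single bigram is being searched for, the nested loop above can be reduced to one pass over adjacent word pairs. A minimal sketch in plain Python; find_bigram and sample are hypothetical names chosen for illustration and are not part of NLTK:

```python
def find_bigram(words, first, second):
    """Return every index i where words[i] == first and words[i + 1] == second."""
    words = list(words)  # materialise so that slicing works on any iterable
    target = (first, second)
    return [i for i, pair in enumerate(zip(words, words[1:])) if pair == target]


# Hypothetical sample text standing in for the Brown corpus word list.
sample = ["I", "love", "New", "York", "and", "New", "Jersey", "not", "New", "York"]
positions = find_bigram(sample, "New", "York")
print(positions)  # [2, 8]
```

Returning the indices rather than a boolean makes it possible to look at the context around each match; an empty list means the bigram does not occur.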