Question

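I'm just getting started with nltk and Python, and I've run into a small problem iterating over the list of bigrams returned by nltk.

An example of what I want:

This is the list of bigrams: [('more', 'is'), ('is', 'said'), ('said', 'than'), ('than', 'done')]

What I want is to be able to get each bigram, e.g. (more, is), and each term of each bigram separately: more, is, and so on.

This is what I have done so far, based on some answers on Stack Overflow: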

bigrams = nltk.bigrams(doclist)

#method 1   
for (a, b) in bigrams: #I get this error:  TypeError: 'NoneType' object is not iterable
    print a 
    print b 

#method 2
#convert to a list first 
bigrams = list(bigrams)# I get the same error
for (a, b) in bigrams:
    print a 
    print b

#method 3
#convert to a dict first
dct = dict(tuples)# I get the same error

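I assumed bigrams is a list of tuples, so what am I doing wrong?

Could you point me to any working code or a tutorial? I will also be happy to accept any correct answer.

Thanks in advance.

Note: I'm using Python 2.7.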

Answer 1

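To iterate over the items inside each tuple, use as many loop variables as there are items in a bigram (for a, b in bigrams). If you only want each bigram as a whole, use ONE variable in the loop.

For a better understanding, see the demo below: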

>>> bigrams=[('more', 'is'), ('is', 'said'), ('said', 'than'), ('than', 'done')]
>>> for a, b in bigrams: 
...     print a 
...     print b 
... 
more
is
is
said
said
than
than
done
>>> for a in bigrams:
...  print a
... 
('more', 'is')
('is', 'said')
('said', 'than')
('than', 'done')
>>>
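
Since both loop forms work on a real list of tuples, the TypeError in the question most likely comes from the input rather than from the loop. The sketch below is only an illustration under assumptions: make_bigrams is a hypothetical pure-Python stand-in for nltk.bigrams (so it runs without nltk), and the idea that doclist ended up as None is a guess, since the question does not show how doclist was built. It is written to run on both Python 2.7 and Python 3.

```python
def make_bigrams(tokens):
    """Pair each token with its successor.

    Hypothetical stand-in for nltk.bigrams, used here so the example
    does not depend on nltk being installed.
    """
    if tokens is None:
        # Fail early with a clearer message than "'NoneType' object is not iterable".
        raise ValueError("tokens is None; check the code that builds the token list")
    tokens = list(tokens)
    return list(zip(tokens, tokens[1:]))


doclist = "more is said than done".split()
bigrams = make_bigrams(doclist)

# One loop variable: each item is the whole tuple.
pairs = [pair for pair in bigrams]

# Two loop variables: each tuple is unpacked into its two terms.
firsts = [a for a, b in bigrams]
seconds = [b for a, b in bigrams]

for a, b in bigrams:
    print(a)
    print(b)

# One common way to end up with None (an assumption about the asker's code):
# list.sort() sorts in place and returns None, so its result must not be reused.
tokens_copy = list(doclist)
broken = tokens_copy.sort()  # broken is None, tokens_copy is now sorted
```

If doclist in your own code turns out to be None, check the function or expression that produced it: a function without a return statement and in-place methods such as list.sort() both yield None.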

Iterating over bigrams (tuples) given by nltk: TypeError: 'NoneType' object is not iterable, Python
