Python nltk gives me multiple instances of the same sentence in the results

Time: 2014-06-19 14:48:33

Tags: python nltk

This is my code:

>>> from nltk.corpus import PlaintextCorpusReader
>>> corpus_root = 'C:/Python27/'
>>> wordlists = PlaintextCorpusReader(corpus_root,'amazonshoes.txt')
>>> sentences = wordlists.sents('amazonshoes.txt')
>>> words_to_find = 'amazon service'.split()
>>> for sentence in sentences:
...     if all(word in sentence for word in words_to_find):
...         print sentence

and the result is:

['review', '/', 'text', ':', 'It', "'", 's', 'the', 'first', 'time', 'that', 'I', 'buy', 'something', 'over', 'amazon', 'and', 'I', 'have', 'to', 'say', 'I', 'am', 'very', 'impressed', 'with', 'the', 'service', 'and', 'the', 'quality', 'of', 'the', 'product', '.']
['review', '/', 'text', ':', 'It', "'", 's', 'the', 'first', 'time', 'that', 'I', 'buy', 'something', 'over', 'amazon', 'and', 'I', 'have', 'to', 'say', 'I', 'am', 'very', 'impressed', 'with', 'the', 'service', 'and', 'the', 'quality', 'of', 'the', 'product', '.']
['review', '/', 'text', ':', 'It', "'", 's', 'the', 'first', 'time', 'that', 'I', 'buy', 'something', 'over', 'amazon', 'and', 'I', 'have', 'to', 'say', 'I', 'am', 'very', 'impressed', 'with', 'the', 'service', 'and', 'the', 'quality', 'of', 'the', 'product', '.']
['review', '/', 'text', ':', 'It', "'", 's', 'the', 'first', 'time', 'that', 'I', 'buy', 'something', 'over', 'amazon', 'and', 'I', 'have', 'to', 'say', 'I', 'am', 'very', 'impressed', 'with', 'the', 'service', 'and', 'the', 'quality', 'of', 'the', 'product', '.']
['review', '/', 'text', ':', 'It', "'", 's', 'the', 'first', 'time', 'that', 'I', 'buy', 'something', 'over', 'amazon', 'and', 'I', 'have', 'to', 'say', 'I', 'am', 'very', 'impressed', 'with', 'the', 'service', 'and', 'the', 'quality', 'of', 'the', 'product', '.']

What should I change in my code so that each matching sentence is printed only once?

1 Answer:

Answer 0: (score: 0)

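Check whether your sentences list already contains duplicates (for example, because the same review appears several times in amazonshoes.txt). Alternatively, change your code to: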

>>> for sentence in set([tuple(s) for s in sentences]):
...     if all(word in sentence for word in words_to_find):
...         print sentence

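This change ensures that you iterate over a set of unique sentences. Each sentence is converted to a tuple first because lists are not hashable and therefore cannot be stored in a set. Note that a set does not preserve the original order of the sentences, and each result is printed as a tuple rather than a list.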
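If the original order matters, or you first want to confirm that the duplicates really come from the input, something along these lines may help. This is only a minimal sketch: the helper names `count_duplicates` and `unique_matches` and the `sample` data are hypothetical and stand in for the token lists returned by `wordlists.sents(...)`.

```python
from collections import Counter


def count_duplicates(sentences):
    """Return {sentence_as_tuple: count} for sentences seen more than once."""
    counts = Counter(tuple(s) for s in sentences)
    return {sent: n for sent, n in counts.items() if n > 1}


def unique_matches(sentences, words_to_find):
    """Yield each matching sentence once, in first-seen order."""
    seen = set()
    for sentence in sentences:
        key = tuple(sentence)  # lists are not hashable, tuples are
        if key in seen:
            continue
        seen.add(key)
        if all(word in key for word in words_to_find):
            yield list(key)


# Hypothetical sample data standing in for wordlists.sents('amazonshoes.txt')
sample = [
    ['I', 'like', 'the', 'amazon', 'service', '.'],
    ['The', 'shoes', 'fit', 'well', '.'],
    ['I', 'like', 'the', 'amazon', 'service', '.'],
]

matches = list(unique_matches(sample, 'amazon service'.split()))
duplicates = count_duplicates(sample)
print(matches)
print(duplicates)
```

With your own data you would pass `sentences` in place of `sample`. A non-empty result from `count_duplicates` suggests the repetition is already present in the text file rather than being introduced by the loop.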