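I'm working on a basic Python project on text classification. I'm using nltk and have imported its Brown corpus. When I try to label one group of sentences as "positive" and another as "negative", I get a NoneType error. Here is my code so far: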
from nltk.corpus import brown
brown.fileids()
categories = brown.categories()
categories
news_text = brown.sents(categories='news')
editorial_text= brown.sents(categories='editorial')
romance_text= brown.sents(categories='romance')
target_text=news_text + editorial_text
total_text=news_text + editorial_text + romance_text
data=[]
for text in total_text:
    if text in target_text:
        label= "pos"
    else:
        label = "neg"
data.extend( [(label, text) for text in total_text] )
Here is the error message I get:
AttributeError Traceback (most recent call last)
<ipython-input-9-f8709bb455fe> in <module>()
1 data=[]
2
----> 3 for text in total_text:
4 if text in target_text:
5 label= "pos"
/usr/local/lib/python3.5/dist-packages/nltk/collections.py in iterate_from(self, start_index)
330 'inconsistent list value (num elts)')
331
--> 332 for value in sublist[max(0, start_index-index):]:
333 yield value
334
/usr/local/lib/python3.5/dist-packages/nltk/collections.py in iterate_from(self, start_index)
330 'inconsistent list value (num elts)')
331
--> 332 for value in sublist[max(0, start_index-index):]:
333 yield value
334
/usr/local/lib/python3.5/dist-packages/nltk/corpus/reader/util.py in iterate_from(self, start_tok)
400
401 # Get everything we can from this piece.
--> 402 for tok in piece.iterate_from(max(0, start_tok-offset)):
403 yield tok
404
/usr/local/lib/python3.5/dist-packages/nltk/corpus/reader/util.py in iterate_from(self, start_tok)
291 while filepos < self._eofpos:
292 # Read the next block.
--> 293 self._stream.seek(filepos)
294 self._current_toknum = toknum
295 self._current_blocknum = block_index
AttributeError: 'NoneType' object has no attribute 'seek'
Is there any way I can fix this?
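One thing that might be worth trying, offered as a minimal sketch and not a confirmed diagnosis: the traceback passes through NLTK's lazy corpus view iteration, so it may help to label sentences by the category they come from and copy them into plain Python lists, instead of concatenating the views and testing `text in target_text` for every sentence (which is also a slow linear scan). The helper `label_sentences`, its `sents_for` parameter, and the `toy_corpus` data below are hypothetical names made up for illustration; only `brown.sents(categories=...)` comes from the code above.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Sentence = List[str]


def label_sentences(
    categories: Iterable[str],
    positive: Iterable[str],
    sents_for: Callable[[str], Iterable[Sequence[str]]],
) -> List[Tuple[str, Sentence]]:
    """Return (label, sentence) pairs, labeling each sentence by its category."""
    positive_set = set(positive)
    data: List[Tuple[str, Sentence]] = []
    for category in categories:
        label = "pos" if category in positive_set else "neg"
        # list(sentence) copies each sentence into a plain list, so the
        # result does not depend on a lazy corpus view afterwards.
        data.extend((label, list(sentence)) for sentence in sents_for(category))
    return data


# Made-up stand-in for the corpus so the sketch runs without NLTK data.
# With NLTK installed, the assumption is that one would pass
#   sents_for=lambda category: brown.sents(categories=category)
toy_corpus = {
    "news": [["The", "jury", "said", "."]],
    "editorial": [["We", "disagree", "."]],
    "romance": [["She", "smiled", "."], ["He", "left", "."]],
}

data = label_sentences(
    ["news", "editorial", "romance"],
    positive=["news", "editorial"],
    sents_for=lambda category: toy_corpus[category],
)
print(data[0])
```

Because the label is decided once per category, no membership test against `target_text` is needed, and each sentence appears in `data` exactly once.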