Python: NoneType error in text classification

Time: 2017-04-12 23:22:26

Tags: python nltk text-mining text-classification corpus

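I am working on a basic Python project on text classification. I am using nltk and have imported its Brown corpus. While trying to label one group of sentences as "positive" and another as "negative", I get a NoneType error. Here is my code so far: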

from nltk.corpus import brown
brown.fileids()

categories = brown.categories()
categories

news_text = brown.sents(categories='news')
editorial_text= brown.sents(categories='editorial')
romance_text= brown.sents(categories='romance')
target_text=news_text + editorial_text

total_text=news_text + editorial_text + romance_text

data=[]

for text in total_text:
    if text in target_text:
        label= "pos"
    else:
        label = "neg"

data.extend( [(label, text) for text in total_text] )

This is the error message I receive:

AttributeError                            Traceback (most recent call last)
<ipython-input-9-f8709bb455fe> in <module>()
      1 data=[]
      2 
----> 3 for text in total_text:
      4     if text in target_text:
      5         label= "pos"

/usr/local/lib/python3.5/dist-packages/nltk/collections.py in iterate_from(self, start_index)
    330                         'inconsistent list value (num elts)')
    331 
--> 332             for value in sublist[max(0, start_index-index):]:
    333                 yield value
    334 

/usr/local/lib/python3.5/dist-packages/nltk/collections.py in iterate_from(self, start_index)
    330                         'inconsistent list value (num elts)')
    331 
--> 332             for value in sublist[max(0, start_index-index):]:
    333                 yield value
    334 

/usr/local/lib/python3.5/dist-packages/nltk/corpus/reader/util.py in iterate_from(self, start_tok)
    400 
    401             # Get everything we can from this piece.
--> 402             for tok in piece.iterate_from(max(0, start_tok-offset)):
    403                 yield tok
    404 

/usr/local/lib/python3.5/dist-packages/nltk/corpus/reader/util.py in iterate_from(self, start_tok)
    291         while filepos < self._eofpos:
    292             # Read the next block.
--> 293             self._stream.seek(filepos)
    294             self._current_toknum = toknum
    295             self._current_blocknum = block_index

AttributeError: 'NoneType' object has no attribute 'seek'

Is there any way I can fix this?
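
A possible direction, offered as a sketch rather than a confirmed diagnosis: the traceback ends in `self._stream.seek(filepos)` with `_stream` being `None`. That is consistent with the lazy corpus views returned by `brown.sents()` being iterated twice at the same time, once by the outer `for` loop and once by the `text in target_text` membership test, so that one pass closes the file stream the other pass still needs. Separately, the loop above overwrites `label` on every pass, and the final `data.extend(...)` then attaches only the last label to every sentence. Both issues go away if each sentence is labeled by the category it came from, with each view iterated exactly once. The function name `label_sentences` and the toy data below are hypothetical stand-ins used only for illustration.

```python
from typing import Dict, Iterable, List, Sequence, Set, Tuple

Sentence = Sequence[str]


def label_sentences(
    sents_by_category: Dict[str, Iterable[Sentence]],
    positive: Set[str],
) -> List[Tuple[str, List[str]]]:
    """Return (label, sentence) pairs, labeling by source category.

    Hypothetical helper: categories listed in `positive` get "pos",
    all others get "neg".
    """
    data: List[Tuple[str, List[str]]] = []
    for category, sents in sents_by_category.items():
        label = "pos" if category in positive else "neg"
        # Each collection is walked exactly once, with no nested
        # membership test, and list(sent) copies the tokens into a
        # plain Python list.
        for sent in sents:
            data.append((label, list(sent)))
    return data


# Toy stand-in for the three Brown categories used in the question.
toy = {
    "news": [["The", "jury", "said", "."]],
    "editorial": [["We", "disagree", "."]],
    "romance": [["She", "smiled", "."], ["He", "left", "."]],
}

data = label_sentences(toy, positive={"news", "editorial"})
print([label for label, _ in data])  # ['pos', 'pos', 'neg', 'neg']
```

With the real corpus, the same function could presumably be called with `{c: brown.sents(categories=c) for c in ("news", "editorial", "romance")}` in place of `toy`. Labeling by category also avoids comparing every sentence against the whole target list, which would be quadratic in the number of sentences.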

0 answers:

No answers yet