POS tagging and then lemmatizing a large dataset with NLTK in Python 2.7 (sentiment analysis)

Time: 2016-06-29 12:54:19

Tags: python nltk pos-tagger

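I am trying to run sentiment analysis on some data. Before passing the data to my algorithm, I want to POS-tag it and lemmatize it. The algorithm already works without POS tagging and lemmatization.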
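For reference, the preprocessing step described above could be sketched roughly as follows. This is only an illustration, not the code from `NaiveBayesFin.py`: the helper names `penn_to_wordnet` and `tag_and_lemmatize` are made up here, and the sketch assumes the NLTK resources `averaged_perceptron_tagger` and `wordnet` are installed and readable.

```python
def penn_to_wordnet(tag):
    """Map a Penn Treebank tag (as returned by nltk.pos_tag) to a WordNet POS letter.

    Hypothetical helper for illustration; anything that is not an adjective,
    verb, or adverb falls back to noun, which is also WordNet's default.
    """
    if tag.startswith('J'):
        return 'a'  # adjective
    if tag.startswith('V'):
        return 'v'  # verb
    if tag.startswith('RB'):
        return 'r'  # adverb
    return 'n'      # noun (fallback)


def tag_and_lemmatize(tokens):
    """POS-tag a list of tokens, then lemmatize each token using its tag."""
    # Imported here so the mapping helper above can be used without NLTK data.
    import nltk
    from nltk.stem import WordNetLemmatizer

    lemmatizer = WordNetLemmatizer()
    return [lemmatizer.lemmatize(word, penn_to_wordnet(tag))
            for word, tag in nltk.pos_tag(tokens)]
```

Passing the tag to the lemmatizer matters because `WordNetLemmatizer.lemmatize` treats every word as a noun unless told otherwise, so verbs such as "running" would be left unchanged.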

When I try to POS-tag with nltk.pos_tag(), I get a BadZipfile error ("File is not a zip file"). How can I fix this?

Would scikit-learn be a better choice for this kind of sentiment analysis?

```
Traceback (most recent call last):
  File "C:/Users/Janak/Desktop/NaiveBayesFin.py", line 88, in <module>
    evaluate_features(make_full_dict)
  File "C:/Users/Janak/Desktop/NaiveBayesFin.py", line 42, in evaluate_features
    tagged_words = nltk.pos_tag(posWords)
  File "C:\Python27\lib\site-packages\nltk\tag\__init__.py", line 110, in pos_tag
    tagger = PerceptronTagger()
  File "C:\Python27\lib\site-packages\nltk\tag\perceptron.py", line 140, in __init__
    AP_MODEL_LOC = 'file:'+str(find('taggers/averaged_perceptron_tagger/'+PICKLE))
  File "C:\Python27\lib\site-packages\nltk\data.py", line 628, in find
    return find(modified_name, paths)
  File "C:\Python27\lib\site-packages\nltk\data.py", line 614, in find
    return ZipFilePathPointer(p, zipentry)
  File "C:\Python27\lib\site-packages\nltk\compat.py", line 561, in _decorator
    return init_func(*args, **kwargs)
  File "C:\Python27\lib\site-packages\nltk\data.py", line 469, in __init__
    zipfile = OpenOnDemandZipFile(os.path.abspath(zipfile))
  File "C:\Python27\lib\site-packages\nltk\compat.py", line 561, in _decorator
    return init_func(*args, **kwargs)
  File "C:\Python27\lib\site-packages\nltk\data.py", line 979, in __init__
    zipfile.ZipFile.__init__(self, filename)
  File "C:\Python27\lib\zipfile.py", line 770, in __init__
    self._RealGetContents()
  File "C:\Python27\lib\zipfile.py", line 811, in _RealGetContents
    raise BadZipfile, "File is not a zip file"
BadZipfile: File is not a zip file
```
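The traceback shows NLTK opening a `.zip` file while looking up `taggers/averaged_perceptron_tagger/`, and the standard `zipfile` module rejecting that file. One plausible cause, though the traceback alone does not confirm it, is an incomplete or corrupted download in the NLTK data directory. A minimal sketch for locating such files, using only the standard library (the function name `find_bad_zips` is made up for this example):

```python
import os
import zipfile


def find_bad_zips(data_dir):
    """Return the paths of *.zip files under data_dir that are not valid zip archives."""
    bad = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if name.lower().endswith('.zip'):
                path = os.path.join(root, name)
                # is_zipfile() checks for the zip end-of-archive signature
                if not zipfile.is_zipfile(path):
                    bad.append(path)
    return sorted(bad)
```

The directories NLTK searches are listed in `nltk.data.path`, so the function could be run on each of them. If a tagger archive turns up in the result, deleting it and fetching it again with `nltk.download('averaged_perceptron_tagger')` may resolve the error.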

0 answers:

No answers