Python POS tagging error on Ubuntu

Asked: 2013-05-30 05:54:08

Tags: python pos-tagger

My Python code for POS tagging:

>>> import nltk, csv, itertools
>>> sentence = "Unigram taggers are based on a simple statistical algorithm: for each token, assign the tag that is most likely for that particular token."
>>> tokens = nltk.word_tokenize(sentence)
>>> tags = nltk.pos_tag(tokens)
The error shown is:
>>> tags = nltk.pos_tag(tokens)
Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    tags = nltk.pos_tag(tokens)
  File "/usr/local/lib/python2.7/dist-packages/nltk/tag/__init__.py", line 99, in pos_tag
    tagger = load(_POS_TAGGER)
  File "/usr/local/lib/python2.7/dist-packages/nltk/data.py", line 605, in load
    resource_val = pickle.load(_open(resource_url))
  File "/usr/local/lib/python2.7/dist-packages/nltk/data.py", line 686, in _open
    return find(path).open()
  File "/usr/local/lib/python2.7/dist-packages/nltk/data.py", line 455, in find
    try: return find(modified_name)
  File "/usr/local/lib/python2.7/dist-packages/nltk/data.py", line 445, in find
    try: return ZipFilePathPointer(p, zipentry)
  File "/usr/local/lib/python2.7/dist-packages/nltk/data.py", line 311, in __init__
    zipfile = OpenOnDemandZipFile(os.path.abspath(zipfile))
  File "/usr/local/lib/python2.7/dist-packages/nltk/data.py", line 738, in __init__
    zipfile.ZipFile.__init__(self, filename)
  File "/usr/lib/python2.7/zipfile.py", line 714, in __init__
    self._GetContents()
  File "/usr/lib/python2.7/zipfile.py", line 748, in _GetContents
    self._RealGetContents()
  File "/usr/lib/python2.7/zipfile.py", line 763, in _RealGetContents
    raise BadZipfile, "File is not a zip file"
BadZipfile: File is not a zip file

Do I need to include some additional Python module?

What is the solution?

1 Answer:

Answer 0 (score: 0)

Before calling pos_tag, download the data it needs by running this:

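import nltk
nltk.download("maxent_treebank_pos_tagger")  # tagger model loaded by pos_tag
nltk.download("maxent_ne_chunker")           # named-entity chunker model
nltk.download("punkt")                       # sentence tokenizer model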

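The first is the model that pos_tag loads (the traceback shows it failing while opening that resource), the second is for the named-entity chunker (ne_chunk), and the last is for the sentence tokenizer (sent_tokenize). The BadZipfile error suggests that an earlier download of the tagger data was incomplete or corrupted, so downloading it again is the likely fix.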
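
If downloading again does not help, it may be worth checking whether a damaged archive is still sitting in one of the NLTK data directories. The following is only a rough sketch using the standard library. The helper name find_bad_zips and the directory list are assumptions made for illustration, not part of NLTK; the directories NLTK actually searches are listed in nltk.data.path.

```python
import os
import zipfile


def find_bad_zips(data_dirs):
    """Return sorted paths of .zip files under data_dirs that are not valid zip archives.

    Hypothetical helper for illustration; not part of NLTK.
    """
    bad = []
    for root_dir in data_dirs:
        # os.walk silently yields nothing for directories that do not exist.
        for dirpath, _dirnames, filenames in os.walk(root_dir):
            for name in filenames:
                path = os.path.join(dirpath, name)
                # is_zipfile returns False for truncated or non-zip files.
                if name.endswith(".zip") and not zipfile.is_zipfile(path):
                    bad.append(path)
    return sorted(bad)


if __name__ == "__main__":
    # Assumption: NLTK data lives in one of these common locations.
    candidates = [
        os.path.expanduser("~/nltk_data"),
        "/usr/share/nltk_data",
        "/usr/local/share/nltk_data",
    ]
    for bad_path in find_bad_zips(candidates):
        print("corrupt: " + bad_path)
```

Any file reported this way can be deleted by hand and then fetched again with nltk.download, which should leave a clean copy in place.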