How to POS-tag a sentence with Python

Asked: 2012-10-14 06:14:05

Tags: python nltk

  

Possible duplicate:
  Failed loading english.pickle with nltk.data.load

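This is the error I get when I try to do POS tagging, even though I have imported everything I need, so I am not sure why no output is printed. Can anyone point out what is wrong with my code?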

>>> import nltk
>>> import nltk.corpus
>>> from nltk.corpus import brown
>>> from nltk.corpus import treebank
>>> import nltk.tag
>>> from nltk import tokenize
>>> from nltk import word_tokenize
>>> from nltk import pos_tag
>>> text=nltk.word_tokenize("Historians have scant knowledge about Borneo's early history, a certain fact though is the presence of modern man in Sarawak some 40,000 years ago (discovery of a Homo Sapiens skull at the Niah Caves), but most of today's indigenous populations belong to the same Austronesian groups, brought by maritime migratory waves in the last 5,000 or so years, who have settled along the Malayan peninsula, the Indonesian, Philippine, Micronesian and Polynesian archipelagos, and as far as Madagascar to the west and Easter Island to the east.")
>>> nltk.pos_tag(text)

Error:

Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "C:\Python27\lib\site-packages\nltk\tag\__init__.py", line 99, in pos_tag
        tagger = load(_POS_TAGGER)
    File "C:\Python27\lib\site-packages\nltk\data.py", line 605, in load
        resource_val = pickle.load(_open(resource_url))
    File "C:\Python27\lib\site-packages\nltk\data.py", line 686, in _open
        return find(path).open()
    File "C:\Python27\lib\site-packages\nltk\data.py", line 467, in find
        raise LookupError(resource_not_found)
LookupError:
**********************************************************************
    Resource 'taggers/maxent_treebank_pos_tagger/english.pickle' not
    found.  Please use the NLTK Downloader to obtain the resource:
    >>> nltk.download()
    Searched in:
        - 'C:\\Users\\user/nltk_data'
        - 'C:\\nltk_data'
        - 'D:\\nltk_data'
        - 'E:\\nltk_data'
        - 'C:\\Python27\\nltk_data'
        - 'C:\\Python27\\lib\\nltk_data'
        - 'C:\\Users\\user\\AppData\\Roaming\\nltk_data'
**********************************************************************

1 answer:

Answer 0 (score: 4)

As the error says, you need to use the NLTK Downloader to download the resource taggers/maxent_treebank_pos_tagger/english.pickle.

You can do this by running import nltk; nltk.download() from a Python shell, which opens the downloader window. The file you need is listed under the Models tab as maxent_treebank_pos_tagger.
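If you would rather not click through the downloader window, the same thing can probably be done from code. The following is only a minimal sketch: ensure_nltk_resource is a hypothetical helper name, not part of NLTK. It assumes that nltk.data.find raises LookupError when a resource is missing (as the traceback above shows) and that nltk.download accepts a package identifier.

```python
def ensure_nltk_resource(resource_path, package_id, find=None, download=None):
    """Download package_id only when resource_path cannot be found.

    Hypothetical helper, not part of NLTK. `find` and `download` default to
    nltk.data.find and nltk.download; they are parameters only so the logic
    can be tried out with stand-in functions.
    Returns True if a download was triggered, False if the resource was
    already present.
    """
    if find is None or download is None:
        import nltk  # imported lazily, only when the defaults are needed
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        find(resource_path)      # raises LookupError if the resource is missing
        return False
    except LookupError:
        download(package_id)     # fetches the package into an nltk_data folder
        return True
```

With that in place, calling ensure_nltk_resource('taggers/maxent_treebank_pos_tagger/english.pickle', 'maxent_treebank_pos_tagger') before nltk.pos_tag(text) should fetch the tagger the first time and do nothing afterwards. Newer NLTK releases use a different default tagger, so pass whichever resource your own LookupError message names.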