Getting 'NLTK was unable to find stanford-postagger.jar!' error when using the Stanford tagger

Asked: 2016-10-26 06:31:48

Tags: python nltk stanford-nlp

I am using NLTK 3.2.1, Java 1.8, and stanford-postagger-3.6.0.

I downloaded the Stanford tagger:

cd
wget http://nlp.stanford.edu/software/stanford-postagger-2015-12-09.zip
unzip stanford-postagger-2015-12-09.zip

and set CLASSPATH:

export CLASSPATH=/home/ubuntu/stanford-postagger-2015-12-09

But I still get the error message: NLTK was unable to find stanford-postagger.jar! Set the CLASSPATH environment variable.

I checked older posts, but none of the suggested fixes worked for me.

>>> from nltk.tokenize import StanfordTokenizer
>>> s = "Good muffins cost $3.88\nin New York.  Please buy me\ntwo of them.\nThanks."
>>> StanfordTokenizer().tokenize(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/nltk/tokenize/stanford.py", line 44, in __init__
    verbose=verbose
  File "/usr/local/lib/python2.7/dist-packages/nltk/internals.py", line 719, in find_jar
    searchpath, url, verbose, is_regex))
  File "/usr/local/lib/python2.7/dist-packages/nltk/internals.py", line 714, in find_jar_iter
    raise LookupError('\n\n%s\n%s\n%s' % (div, msg, div))
LookupError: 

===========================================================================
  NLTK was unable to find stanford-postagger.jar! Set the CLASSPATH
  environment variable.

  For more information, on stanford-postagger.jar, see:
    <http://nlp.stanford.edu/software/tokenizer.shtml>
===========================================================================

I do have a workaround:

import os
os.environ['CLASSPATH'] = "/home/ubuntu/stanford-postagger-2015-12-09"
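
A slightly more defensive variant of this workaround, as a sketch only: it puts the tagger directory in front of any CLASSPATH entries that already exist instead of overwriting them. The helper name `prepend_classpath` is hypothetical (it is not part of NLTK), and the path is the one used above.

```python
import os

def prepend_classpath(directory, environ=os.environ):
    """Put `directory` at the front of CLASSPATH, keeping existing entries."""
    existing = environ.get("CLASSPATH", "")
    # Drop empty entries that appear when the variable is unset or has stray separators.
    entries = [e for e in existing.split(os.pathsep) if e]
    if directory not in entries:
        entries.insert(0, directory)
    environ["CLASSPATH"] = os.pathsep.join(entries)
    return environ["CLASSPATH"]

# This has to run before StanfordTokenizer() is constructed,
# because the jar lookup happens inside its __init__.
prepend_classpath("/home/ubuntu/stanford-postagger-2015-12-09")
```

Like the original workaround, this only changes the environment of the running Python process, not of the shell that started it.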

But why doesn't export CLASSPATH work?
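
One thing that may be worth checking (a diagnostic sketch, not an explanation of this specific case): `export` only affects the shell it is run in and the processes started from that shell, so a Python interpreter launched from another terminal, through `sudo`, or from an IDE may not see the variable at all. The hypothetical helper `classpath_report` below uses only the standard library and shows what the running Python process actually sees in CLASSPATH and which jar files each entry contains.

```python
import glob
import os

def classpath_report(environ=os.environ):
    """Return (entry, jars) pairs for each CLASSPATH entry visible to this process."""
    raw = environ.get("CLASSPATH")
    if raw is None:
        return None  # the variable did not reach this process at all
    report = []
    for entry in raw.split(os.pathsep):
        if not entry:
            continue
        if os.path.isdir(entry):
            # A directory entry: list the jar files directly inside it.
            jars = sorted(os.path.basename(p)
                          for p in glob.glob(os.path.join(entry, "*.jar")))
        elif os.path.isfile(entry):
            # A file entry: report the file itself.
            jars = [os.path.basename(entry)]
        else:
            jars = []  # the path does not exist
        report.append((entry, jars))
    return report

print(classpath_report())
```

If this prints `None` in the same session where the error occurs, the exported value never reached Python. If it prints the directory with an empty jar list, the path itself is the thing to look at.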

0 answers