I am using NLTK 3.2.1, Java 1.8, and stanford-postagger-3.6.0.
I downloaded the Stanford tagger:
cd
wget http://nlp.stanford.edu/software/stanford-postagger-2015-12-09.zip
unzip stanford-postagger-2015-12-09.zip
and set CLASSPATH:
export CLASSPATH=/home/ubuntu/stanford-postagger-2015-12-09
But I still get the error message: NLTK was unable to find stanford-postagger.jar! Set the CLASSPATH environment variable.
I went through older posts on this, but none of the suggested fixes worked for me.
>>> from nltk.tokenize import StanfordTokenizer
>>> s = "Good muffins cost $3.88\nin New York. Please buy me\ntwo of them.\nThanks."
>>> StanfordTokenizer().tokenize(s)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.7/dist-packages/nltk/tokenize/stanford.py", line 44, in __init__
verbose=verbose
File "/usr/local/lib/python2.7/dist-packages/nltk/internals.py", line 719, in find_jar
searchpath, url, verbose, is_regex))
File "/usr/local/lib/python2.7/dist-packages/nltk/internals.py", line 714, in find_jar_iter
raise LookupError('\n\n%s\n%s\n%s' % (div, msg, div))
LookupError:
===========================================================================
NLTK was unable to find stanford-postagger.jar! Set the CLASSPATH
environment variable.
For more information, on stanford-postagger.jar, see:
<http://nlp.stanford.edu/software/tokenizer.shtml>
===========================================================================
I do have a workaround, which is to set the variable from inside Python:
import os
os.environ['CLASSPATH'] = "/home/ubuntu/stanford-postagger-2015-12-09"
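A slightly more careful variant of that workaround, as a rough sketch only: instead of overwriting CLASSPATH, it prepends the tagger directory and keeps whatever was already there. The helper name `add_to_classpath` is made up for illustration and is not part of NLTK.

```python
import os


def add_to_classpath(directory, environ=None):
    """Prepend `directory` to CLASSPATH unless it is already listed.

    Hypothetical helper for illustration; `environ` defaults to os.environ
    so the change is visible to code that runs later in the same process.
    """
    environ = os.environ if environ is None else environ
    # Split the current value on the platform separator (':' on Linux).
    entries = [p for p in environ.get("CLASSPATH", "").split(os.pathsep) if p]
    if directory not in entries:
        entries.insert(0, directory)
    environ["CLASSPATH"] = os.pathsep.join(entries)
    return environ["CLASSPATH"]
```

Calling `add_to_classpath("/home/ubuntu/stanford-postagger-2015-12-09")` before creating `StanfordTokenizer()` should behave like the two lines above, assuming nothing else in CLASSPATH conflicts with it.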
But why does export CLASSPATH not work?
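To narrow this down, it may help to check what the Python process actually sees. An exported variable only reaches processes started from that same shell after the export, so a different terminal, `sudo`, or an IDE-launched interpreter would not inherit it. The sketch below reports whether CLASSPATH is set in the current process and which entries contain the jar. The function name `classpath_report` is hypothetical, and the default jar name is taken from the error message.

```python
import os


def classpath_report(environ=None, jar_name="stanford-postagger.jar"):
    """Summarise CLASSPATH as seen by this Python process.

    Hypothetical diagnostic helper, not part of NLTK. Returns a dict with
    whether the variable is set, its entries, and the entries where
    `jar_name` was found (as the entry itself or inside a directory entry).
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("CLASSPATH")
    if not raw:
        return {"set": False, "entries": [], "jar_found_in": []}

    entries = [p for p in raw.split(os.pathsep) if p]
    found = []
    for entry in entries:
        if os.path.basename(entry) == jar_name and os.path.isfile(entry):
            found.append(entry)  # entry points directly at the jar
        elif os.path.isfile(os.path.join(entry, jar_name)):
            found.append(entry)  # entry is a directory containing the jar
    return {"set": True, "entries": entries, "jar_found_in": found}


if __name__ == "__main__":
    print(classpath_report())
```

If this reports that CLASSPATH is not set when run from the interpreter that raises the error, the export never reached that process. If it is set but the jar is not found, the path itself would be the next thing to check.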