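I have set up the NLTK and Stanford environments, and the NLTK and Stanford JARs are downloaded. NLTK programs work fine, but I am having trouble with the Stanford segmenter. A simple program that uses the Stanford segmenter fails with an error saying SLF4J could not be found in the classpath, even though I have exported all the JARs, including slf4j-api.jar. Details follow.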
Environment variables:
export JAVA_HOME=/usr/java/jdk1.8.0_60
export NLTK_DATA=/opt/nltk_data
export STANFORD_SEGMENTER_PATH=/opt/stanford/stanford-segmenter-3.7
export CLASSPATH=$CLASSPATH:$STANFORD_SEGMENTER_PATH/stanford-segmenter.jar
export STANFORD_POSTAGGER_PATH=/opt/stanford/stanford-postagger-full-2016-10-31
export CLASSPATH=$CLASSPATH:$STANFORD_POSTAGGER_PATH/stanford-postagger.jar
export STANFORD_NER_PATH=/opt/stanford/stanford-ner-2016-10-31
export CLASSPATH=$CLASSPATH:$STANFORD_NER_PATH/stanford-ner.jar
export STANFORD_MODELS=$STANFORD_NER_PATH/classifiers:$STANFORD_POSTAGGER_PATH/models
export STANFORD_PARSER_PATH=/opt/stanford/stanford-parser-full-2016-10-31
export CLASSPATH=$CLASSPATH:$STANFORD_PARSER_PATH/stanford-parser.jar:$STANFORD_PARSER_PATH/stanford-parser-3.6.0-models.jar:$STANFORD_PARSER_PATH/slf4j-api.jar:$STANFORD_PARSER_PATH/ejml-0.23.jar
export STANFORD_CORENLP_PATH=/opt/stanford/stanford-corenlp-full-2016-10-31
export CLASSPATH=$CLASSPATH:$STANFORD_CORENLP_PATH/stanford-corenlp-3.7.0.jar:$STANFORD_CORENLP_PATH/stanford-corenlp-3.7.0-models.jar:$STANFORD_CORENLP_PATH/javax.json.jar:$STANFORD_CORENLP_PATH/joda-time.jar:$STANFORD_CORENLP_PATH/jollyday.jar:$STANFORD_CORENLP_PATH/protobuf.jar:$STANFORD_CORENLP_PATH/slf4j-simple.jar:$STANFORD_CORENLP_PATH/xom.jar
export STANFORD_CORENLP=$STANFORD_CORENLP_PATH
The program is as follows:
>>> from nltk.tokenize import StanfordSegmenter
>>> segmenter = StanfordSegmenter(
...     path_to_sihan_corpora_dict="/opt/stanford/stanford-segmenter-3.7/data/",
...     path_to_model="/opt/stanford/stanford-segmenter-3.7/data/pku.gz",
...     path_to_dict="/opt/stanford/stanford-segmenter-3.7/data/dict-chris6.ser.gz"
... )
>>> res = segmenter.segment(u"北海已成为中国对外开放中升起的一颗明星")
The error is as follows:
Exception in thread "main" java.lang.ExceptionInInitializerError
at edu.stanford.nlp.ie.AbstractSequenceClassifier.<clinit>(AbstractSequenceClassifier.java:88)
Caused by: java.lang.IllegalStateException: Could not find SLF4J in your classpath
at edu.stanford.nlp.util.logging.RedwoodConfiguration$Handlers.lambda$static$530(RedwoodConfiguration.java:190)
at edu.stanford.nlp.util.logging.RedwoodConfiguration$Handlers$7.buildChain(RedwoodConfiguration.java:309)
at edu.stanford.nlp.util.logging.RedwoodConfiguration$Handlers$7.apply(RedwoodConfiguration.java:318)
at edu.stanford.nlp.util.logging.RedwoodConfiguration.lambda$handlers$535(RedwoodConfiguration.java:363)
at edu.stanford.nlp.util.logging.RedwoodConfiguration.apply(RedwoodConfiguration.java:41)
at edu.stanford.nlp.util.logging.Redwood.<clinit>(Redwood.java:609)
... 1 more
Caused by: edu.stanford.nlp.util.MetaClass$ClassCreationException: java.lang.ClassNotFoundException: edu.stanford.nlp.util.logging.SLF4JHandler
at edu.stanford.nlp.util.MetaClass.createFactory(MetaClass.java:364)
at edu.stanford.nlp.util.MetaClass.createInstance(MetaClass.java:381)
at edu.stanford.nlp.util.logging.RedwoodConfiguration$Handlers.lambda$static$530(RedwoodConfiguration.java:186)
... 6 more
Caused by: java.lang.ClassNotFoundException: edu.stanford.nlp.util.logging.SLF4JHandler
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at edu.stanford.nlp.util.MetaClass$ClassFactory.construct(MetaClass.java:135)
at edu.stanford.nlp.util.MetaClass$ClassFactory.<init>(MetaClass.java:202)
at edu.stanford.nlp.util.MetaClass$ClassFactory.<init>(MetaClass.java:69)
at edu.stanford.nlp.util.MetaClass.createFactory(MetaClass.java:360)
... 8 more
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/python3/lib/python3.5/site-packages/nltk/tokenize/stanford_segmenter.py", line 96, in segment
return self.segment_sents([tokens])
File "/usr/local/python3/lib/python3.5/site-packages/nltk/tokenize/stanford_segmenter.py", line 123, in segment_sents
stdout = self._execute(cmd)
File "/usr/local/python3/lib/python3.5/site-packages/nltk/tokenize/stanford_segmenter.py", line 143, in _execute
cmd,classpath=self._stanford_jar, stdout=PIPE, stderr=PIPE)
File "/usr/local/python3/lib/python3.5/site-packages/nltk/internals.py", line 134, in java
raise OSError('Java command failed : ' + str(cmd))
OSError: Java command failed : ['/usr/java/jdk1.8.0_60/bin/java', '-mx2g', '-cp', '/opt/stanford/stanford-segmenter-3.7/stanford-segmenter.jar:/opt/stanford/stanford-parser-full-2016-10-31/slf4j-api.jar', 'edu.stanford.nlp.ie.crf.CRFClassifier', '-sighanCorporaDict', '/opt/stanford/stanford-segmenter-3.7/data/', '-textFile', '/tmp/tmpkttpldl6', '-sighanPostProcessing', 'true', '-keepAllWhitespaces', 'false', '-loadClassifier', '/opt/stanford/stanford-segmenter-3.7/data/pku.gz', '-serDictionary', '/opt/stanford/stanford-segmenter-3.7/data/dict-chris6.ser.gz', '-inputEncoding', 'UTF-8']
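Thanks in advance!
Answer 0 (score: 1)
With the current codebase, you will get this error if slf4j-api.jar is in your CLASSPATH and you run the 3.7.0 segmenter. I will push a code change to fix this, but for now, removing slf4j-api.jar from your CLASSPATH should make the error go away.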
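As a rough sketch of that workaround, the snippet below filters slf4j-api.jar out of CLASSPATH for the current Python process only, before the segmenter is constructed. The helper name `drop_jar_from_classpath` is hypothetical, and the sketch assumes NLTK picks the JAR up from the process's CLASSPATH variable, as the `-cp` argument in the error output suggests; behaviour may differ between NLTK versions.

```python
import os


def drop_jar_from_classpath(classpath, jar_name="slf4j-api.jar"):
    """Return `classpath` without entries whose file name equals `jar_name`.

    Hypothetical helper for illustration; it is not part of NLTK.
    """
    # Split on the platform separator (":" on Linux) and skip empty entries.
    entries = [entry for entry in classpath.split(os.pathsep) if entry]
    # Keep every entry except the ones that point at the unwanted JAR.
    kept = [entry for entry in entries if os.path.basename(entry) != jar_name]
    return os.pathsep.join(kept)


# Only touch the variable if it is actually set in this process.
if "CLASSPATH" in os.environ:
    os.environ["CLASSPATH"] = drop_jar_from_classpath(os.environ["CLASSPATH"])
```

Run this before creating the StanfordSegmenter object. It leaves the shell configuration untouched, so the slf4j-api.jar entry on the STANFORD_PARSER_PATH export line would still need to be removed by hand for a permanent fix.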