Stanford segmenter in NLTK: "Could not find SLF4J in your classpath"

Asked: 2017-03-12 04:37:58

Tags: python-3.x nltk stanford-nlp

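I have set up an NLTK and Stanford environment and downloaded the NLTK and Stanford jars. Plain NLTK programs work, but I am having trouble with the Stanford segmenter. Even a simple program that uses the segmenter fails with the error "Could not find SLF4J in your classpath", although I have exported all the jars, including slf4j-api.jar. Details follow.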

  • Python 3.5, NLTK 3.2.2, Stanford jars 3.7
  • OS: CentOS
  • Environment variables:

    export JAVA_HOME=/usr/java/jdk1.8.0_60
    export NLTK_DATA=/opt/nltk_data
    export STANFORD_SEGMENTER_PATH=/opt/stanford/stanford-segmenter-3.7
    export CLASSPATH=$CLASSPATH:$STANFORD_SEGMENTER_PATH/stanford-segmenter.jar
    export STANFORD_POSTAGGER_PATH=/opt/stanford/stanford-postagger-full-2016-10-31
    export CLASSPATH=$CLASSPATH:$STANFORD_POSTAGGER_PATH/stanford-postagger.jar
    export STANFORD_NER_PATH=/opt/stanford/stanford-ner-2016-10-31
    export CLASSPATH=$CLASSPATH:$STANFORD_NER_PATH/stanford-ner.jar
    export STANFORD_MODELS=$STANFORD_NER_PATH/classifiers:$STANFORD_POSTAGGER_PATH/models
    export STANFORD_PARSER_PATH=/opt/stanford/stanford-parser-full-2016-10-31
    export CLASSPATH=$CLASSPATH:$STANFORD_PARSER_PATH/stanford-parser.jar:$STANFORD_PARSER_PATH/stanford-parser-3.6.0-models.jar:$STANFORD_PARSER_PATH/slf4j-api.jar:$STANFORD_PARSER_PATH/ejml-0.23.jar
    export STANFORD_CORENLP_PATH=/opt/stanford/stanford-corenlp-full-2016-10-31
    export CLASSPATH=$CLASSPATH:$STANFORD_CORENLP_PATH/stanford-corenlp-3.7.0.jar:$STANFORD_CORENLP_PATH/stanford-corenlp-3.7.0-models.jar:$STANFORD_CORENLP_PATH/javax.json.jar:$STANFORD_CORENLP_PATH/joda-time.jar:$STANFORD_CORENLP_PATH/jollyday.jar:$STANFORD_CORENLP_PATH/protobuf.jar:$STANFORD_CORENLP_PATH/slf4j-simple.jar:$STANFORD_CORENLP_PATH/xom.jar
    export STANFORD_CORENLP=$STANFORD_CORENLP_PATH
    

The program is as follows:

>>> from nltk.tokenize import StanfordSegmenter
>>> segmenter = StanfordSegmenter(
...     path_to_sihan_corpora_dict="/opt/stanford/stanford-segmenter-3.7/data/",
...     path_to_model="/opt/stanford/stanford-segmenter-3.7/data/pku.gz",
...     path_to_dict="/opt/stanford/stanford-segmenter-3.7/data/dict-chris6.ser.gz"
... )
>>> res = segmenter.segment(u"北海已成为中国对外开放中升起的一颗明星")

The error is as follows:

Exception in thread "main" java.lang.ExceptionInInitializerError
    at edu.stanford.nlp.ie.AbstractSequenceClassifier.<clinit>(AbstractSequenceClassifier.java:88)
Caused by: java.lang.IllegalStateException: Could not find SLF4J in your classpath
    at edu.stanford.nlp.util.logging.RedwoodConfiguration$Handlers.lambda$static$530(RedwoodConfiguration.java:190)
    at edu.stanford.nlp.util.logging.RedwoodConfiguration$Handlers$7.buildChain(RedwoodConfiguration.java:309)
    at edu.stanford.nlp.util.logging.RedwoodConfiguration$Handlers$7.apply(RedwoodConfiguration.java:318)
    at edu.stanford.nlp.util.logging.RedwoodConfiguration.lambda$handlers$535(RedwoodConfiguration.java:363)
    at edu.stanford.nlp.util.logging.RedwoodConfiguration.apply(RedwoodConfiguration.java:41)
    at edu.stanford.nlp.util.logging.Redwood.<clinit>(Redwood.java:609)
    ... 1 more
Caused by: edu.stanford.nlp.util.MetaClass$ClassCreationException: java.lang.ClassNotFoundException: edu.stanford.nlp.util.logging.SLF4JHandler
    at edu.stanford.nlp.util.MetaClass.createFactory(MetaClass.java:364)
    at edu.stanford.nlp.util.MetaClass.createInstance(MetaClass.java:381)
    at edu.stanford.nlp.util.logging.RedwoodConfiguration$Handlers.lambda$static$530(RedwoodConfiguration.java:186)
    ... 6 more
Caused by: java.lang.ClassNotFoundException: edu.stanford.nlp.util.logging.SLF4JHandler
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:264)
    at edu.stanford.nlp.util.MetaClass$ClassFactory.construct(MetaClass.java:135)
    at edu.stanford.nlp.util.MetaClass$ClassFactory.<init>(MetaClass.java:202)
    at edu.stanford.nlp.util.MetaClass$ClassFactory.<init>(MetaClass.java:69)
    at edu.stanford.nlp.util.MetaClass.createFactory(MetaClass.java:360)
    ... 8 more

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/python3/lib/python3.5/site-packages/nltk/tokenize/stanford_segmenter.py", line 96, in segment
    return self.segment_sents([tokens])
  File "/usr/local/python3/lib/python3.5/site-packages/nltk/tokenize/stanford_segmenter.py", line 123, in segment_sents
    stdout = self._execute(cmd)
  File "/usr/local/python3/lib/python3.5/site-packages/nltk/tokenize/stanford_segmenter.py", line 143, in _execute
    cmd,classpath=self._stanford_jar, stdout=PIPE, stderr=PIPE)
  File "/usr/local/python3/lib/python3.5/site-packages/nltk/internals.py", line 134, in java
    raise OSError('Java command failed : ' + str(cmd))
OSError: Java command failed : ['/usr/java/jdk1.8.0_60/bin/java', '-mx2g', '-cp', '/opt/stanford/stanford-segmenter-3.7/stanford-segmenter.jar:/opt/stanford/stanford-parser-full-2016-10-31/slf4j-api.jar', 'edu.stanford.nlp.ie.crf.CRFClassifier', '-sighanCorporaDict', '/opt/stanford/stanford-segmenter-3.7/data/', '-textFile', '/tmp/tmpkttpldl6', '-sighanPostProcessing', 'true', '-keepAllWhitespaces', 'false', '-loadClassifier', '/opt/stanford/stanford-segmenter-3.7/data/pku.gz', '-serDictionary', '/opt/stanford/stanford-segmenter-3.7/data/dict-chris6.ser.gz', '-inputEncoding', 'UTF-8']

Thanks in advance!

1 Answer:

Answer 0 (score: 1)

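With the current code base, you will get this error if slf4j-api.jar is on your CLASSPATH when you run the 3.7.0 segmenter. I will push a code change to fix this, but for now the error should go away if you remove slf4j-api.jar from your CLASSPATH.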
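
If editing the shell profile is inconvenient, the same workaround can be tried from inside Python before the segmenter is created. The following is only a minimal sketch: `classpath_without` is a hypothetical helper, not part of NLTK. It assumes NLTK picks up slf4j-api.jar from the CLASSPATH environment variable, which the `-cp` argument in the failing command suggests but which is not confirmed here. Whether your NLTK version then constructs the segmenter without that jar is also untested.

```python
import os


def classpath_without(jar_name, classpath=None, sep=os.pathsep):
    """Return the classpath with every entry whose file name is jar_name removed.

    Hypothetical helper for illustration; it only edits a string and does
    not touch the file system or NLTK itself.
    """
    if classpath is None:
        classpath = os.environ.get("CLASSPATH", "")
    kept = [
        entry
        for entry in classpath.split(sep)
        # Skip empty entries (e.g. from a leading ':') and the unwanted jar.
        if entry and os.path.basename(entry) != jar_name
    ]
    return sep.join(kept)


# Apply the workaround to the current process only; the shell's CLASSPATH
# is left unchanged. Other jars such as slf4j-simple.jar are kept.
os.environ["CLASSPATH"] = classpath_without("slf4j-api.jar")
```

Run this before constructing `StanfordSegmenter`. If the error persists, check whether slf4j-api.jar is still being found through another variable or directory, since the jar location shown in the failing command comes from the parser directory rather than the segmenter directory.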