I'm currently working on a Windows 10 desktop, trying to convert some Java code to Python.
The Java code calls the StanfordCoreNLP pipeline directly, using the following code:
...
Properties props = new Properties();
props.put("annotators", "tokenize, ssplit, pos, lemma, ner, regexner");
props.put("ner.model", serializedClassifier);
props.put("pos.model", posModel);
props.put("tokenize.language", "de");
props.put("ssplit.isOneSentence", "true");
props.put("ssplit.language", "de");
props.put("lemma.language", "de");
props.put("regexner.mapping", Init.REGEXNER);
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
// Subsequent calls to pipeline...
I need to change this to Python code that accesses a StanfordCoreNLPServer running on the same Windows box.
The server is started with this command:
java -mx4g --add-modules java.se.ee -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -props "c:\Users\(snip)\(snip)\project.properties"
I have created a properties file (project.properties) like this:
annotators = tokenize, ssplit, pos, lemma, ner, regexner
ner.model = "/edu/stanford/nlp/models/ner/german.conll.hgc_175m_600.crf.ser.gz
pos.model = posModel
tokenize.language = de
ssplit.isOneSentence = true
ssplit.language = de
lemma.language = de
regexner.mapping = /misc/regex_mappings_surnames.txt,misc/regex_mappings_general.txt,misc/regex_mappings_towns_and_villages.txt,misc/regex_mappings_streets_with_numbers_and_abr.txt,misc/regex_mappings_international_firstnames.txt
outputExtension = .output
port = 9000
timeout = 30000
But I'm not sure it actually sets all the parameters correctly, especially the ones associated with file/Java paths.
For example, /misc/regex_mappings_surnames.txt
lives in C:\Users\(snip)\(snip)\Eclipseworkspace\eclipse_project_name\misc\
and /edu/stanford/nlp/models/ner/german.conll.hgc_175m_600.crf.ser.gz
lives (I think) inside C:\Users\(snip)\(snip)\stanford-corenlp-full-2017-06-09\stanford-german-corenlp-2017-06-09-models.jar
(not in the Eclipse project directory, but in the root directory from which I start
StanfordCoreNLPServer).
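One way to take the guesswork out of the "(I think)" above is to look inside the jar directly, since a jar is an ordinary zip archive. The following is only a sketch: resource_in_jar is a hypothetical helper name, and it only shows whether an entry exists in a given jar, not how CoreNLP itself resolves the path at load time.

```python
import zipfile


def resource_in_jar(jar_path, resource):
    """Return True if `resource` is stored as an entry in the jar at `jar_path`.

    Hypothetical helper for illustration. Jar entries are stored without a
    leading slash, so one is stripped from `resource` before comparing.
    """
    entry = resource.lstrip("/")
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

Calling it with the full path to stanford-german-corenlp-2017-06-09-models.jar and "/edu/stanford/nlp/models/ner/german.conll.hgc_175m_600.crf.ser.gz" would confirm whether the NER model is really in that jar. The same call with "/misc/regex_mappings_surnames.txt" should return False, since those files sit in the Eclipse project directory rather than in a jar.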
The server output on startup is:
C:\> cd C:\Users\(snip)\(snip)\stanford-corenlp-full-2017-06-09\
C:\Users\(snip)\(snip)\stanford-corenlp-full-2017-06-09> java -mx4g --add-modules java.se.ee -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -props "c:\Users\(snip)\(snip)\project.properties"
[main] INFO CoreNLP - --- StanfordCoreNLPServer #main() called ---
[main] INFO CoreNLP - setting default constituency parser
[main] INFO CoreNLP - warning: cannot find edu/stanford/nlp/models/srparser/englishSR.ser.gz
[main] INFO CoreNLP - using: edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz instead
[main] INFO CoreNLP - to use shift reduce parser download English models jar from:
[main] INFO CoreNLP - http://stanfordnlp.github.io/CoreNLP/download.html
[main] INFO CoreNLP - Threads: 4
[main] INFO CoreNLP - Starting server...
[main] INFO CoreNLP - StanfordCoreNLPServer listening at /0:0:0:0:0:0:0:0:9000
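...which is why I'm not sure the correct settings are being used. My code does produce output using StanfordCoreNLPServer,
but not quite the right output. I'm not familiar with StanfordNLP, StanfordNLPServer, or Java, so please don't assume I know the things I ought to know.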
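For context, the Python side of such a setup can be sketched roughly as follows, using only the standard library. This is a minimal sketch under several assumptions: the server runs on localhost at port 9000 (the port from project.properties), the names build_url, annotate, SERVER_URL and REQUEST_PROPS are made up for illustration, and the properties sent with the request are just a subset of the Java ones above. The server accepts a JSON-encoded properties query parameter and the text to annotate as the POST body.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the server runs on this machine, on the port set in project.properties.
SERVER_URL = "http://localhost:9000"

# Assumption: a subset of the Java Properties above, sent with each request.
REQUEST_PROPS = {
    "annotators": "tokenize,ssplit,pos,lemma,ner,regexner",
    "tokenize.language": "de",
    "ssplit.isOneSentence": "true",
    "outputFormat": "json",
}


def build_url(server_url, props):
    """Return the server URL with `props` encoded as a JSON query parameter."""
    query = urllib.parse.urlencode({"properties": json.dumps(props)})
    return f"{server_url}/?{query}"


def annotate(text, server_url=SERVER_URL, props=REQUEST_PROPS, timeout=30):
    """POST `text` to the CoreNLP server and return the parsed JSON response."""
    request = urllib.request.Request(
        build_url(server_url, props),
        data=text.encode("utf-8"),
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, annotate("Angela Merkel wohnt in Berlin.") should return a dictionary whose "sentences" list holds the tokens with their "lemma", "pos" and "ner" fields. Comparing those fields with the Java pipeline's output for the same sentence is one way to see which setting is not being picked up.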