Stanford relation trainer not working

Time: 2016-01-14 04:30:31

Tags: stanford-nlp

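I am trying to run the relation trainer described at http://nlp.stanford.edu/software/relationExtractor.shtml. However, although I have specified stanford-postagger.jar in the classpath, it cannot find the tagger. Any pointers in the right direction would be very helpful.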

I am running it from the command prompt on Windows, as follows:

  

D:\01.Jars\Jars_Stanford\stanford-corenlp-full-2015-04-20>java -cp "stanford-ner.jar;stanford-corenlp-3.5.2.jar;stanford-postagger.jar" edu.stanford.nlp.ie.machinereading.MachineReading --arguments SuperAnnuation.properties
PERCENTAGE OF TRAIN: 1.0
The reader log level is set to SEVERE
Adding annotator pos
Exception in thread "main" java.lang.RuntimeException: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
        at edu.stanford.nlp.pipeline.AnnotatorFactories$4.create(AnnotatorFactories.java:292)
        at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:85)
        at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:289)
        at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:126)
        at edu.stanford.nlp.ie.machinereading.MachineReading.makeMachineReading(MachineReading.java:228)
        at edu.stanford.nlp.ie.machinereading.MachineReading.main(MachineReading.java:106)
Caused by: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
        at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:770)
        at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:298)
        at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:263)
        at edu.stanford.nlp.pipeline.POSTaggerAnnotator.loadModel(POSTaggerAnnotator.java:97)
        at edu.stanford.nlp.pipeline.POSTaggerAnnotator.<init>(POSTaggerAnnotator.java:77)
        at edu.stanford.nlp.pipeline.AnnotatorImplementations.posTagger(AnnotatorImplementations.java:59)
        at edu.stanford.nlp.pipeline.AnnotatorFactories$4.create(AnnotatorFactories.java:290)
        ... 5 more
Caused by: java.io.IOException: Unable to resolve "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger" as either class path, filename or URL
        at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:481)
        at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:765)
        ... 11 more

The SuperAnnuation properties file I am using is shown below. It is the default properties file provided on the website:

#Below are some basic options. See edu.stanford.nlp.ie.machinereading.MachineReadingProperties class for more options.

# Pipeline options
annotators = pos, lemma, parse
parse.maxlen = 100

# MachineReading properties. You need one class to read the dataset into correct format. See edu.stanford.nlp.ie.machinereading.domains.ace.AceReader for another example.
datasetReaderClass = edu.stanford.nlp.ie.machinereading.domains.roth.RothCONLL04Reader

#Data directory for training. The datasetReaderClass reads data from this path and makes corresponding sentences and annotations.
trainPath = /u/nlp/data/RothCONLL04/conll04.corp

#Whether to crossValidate, that is evaluate, or just train.
crossValidate = false
kfold = 10

#Change this to true if you want to use CoreNLP pipeline generated NER tags. The default model generated with the relation extractor release uses the CoreNLP pipeline provided tags (option set to true).
trainUsePipelineNER=false

# where to save training sentences. uses the file if it exists, otherwise creates it.
serializedTrainingSentencesPath = tmp/roth_sentences.ser

serializedEntityExtractorPath = tmp/roth_entity_model.ser

# where to store the output of the extractor (sentence objects with relations generated by the model). This is what you will use as the model when using 'relation' annotator in the CoreNLP pipeline.
serializedRelationExtractorPath = tmp/roth_relation_model_pipeline.ser

# uncomment to load a serialized model instead of retraining
# loadModel = true

#relationResultsPrinters = edu.stanford.nlp.ie.machinereading.RelationExtractorResultsPrinter,edu.stanford.nlp.ie.machinereading.domains.roth.RothResultsByRelation. For printing output of the model.
relationResultsPrinters = edu.stanford.nlp.ie.machinereading.RelationExtractorResultsPrinter

#In this domain, this is trivial since all the entities are given (or set using CoreNLP NER tagger).
entityClassifier = edu.stanford.nlp.ie.machinereading.domains.roth.RothEntityExtractor

extractRelations = true
extractEvents = false

#We are setting the entities beforehand so the model does not learn how to extract entities etc.
extractEntities = false

#Opposite of crossValidate. 
trainOnly=true

# The set chosen by feature selection using RothCONLL04:
relationFeatures = arg_words,arg_type,dependency_path_lowlevel,dependency_path_words,surface_path_POS,entities_between_args,full_tree_path

# The above features plus the features used in Bjorne BioNLP09:
# relationFeatures = arg_words,arg_type,dependency_path_lowlevel,dependency_path_words,surface_path_POS,entities_between_args,full_tree_path,dependency_path_POS_unigrams,dependency_path_word_n_grams,dependency_path_POS_n_grams,dependency_path_edge_lowlevel_n_grams,dependency_path_edge-node-edge-grams_lowlevel,dependency_path_node-edge-node-grams_lowlevel,dependency_path_directed_bigrams,dependency_path_edge_unigrams,same_head,entity_counts

1 Answer:

Answer 0: (score: 1)

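Try using the full Stanford CoreNLP jar together with the matching models jar. Both can be downloaded from the CoreNLP downloads page. Make sure your classpath includes both the code jar and the models jar!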
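
To confirm that the models jar is actually being picked up, one option is a small classpath check like the sketch below. It only uses the standard `ClassLoader.getSystemResource` call, and the resource path is copied from the error message in the question. The class name `ModelCheck` and its helper method are hypothetical names chosen for illustration; they are not part of CoreNLP.

```java
import java.net.URL;

// Hypothetical helper (not part of CoreNLP): reports whether a resource
// is visible on the classpath that the JVM was started with.
public class ModelCheck {

    // Resource path copied from the "Unable to resolve ..." error in the question.
    static final String TAGGER_MODEL =
        "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger";

    /** Returns true if the given resource path can be found on the classpath. */
    public static boolean isOnClasspath(String resourcePath) {
        URL url = ClassLoader.getSystemResource(resourcePath);
        return url != null;
    }

    public static void main(String[] args) {
        // Use the first argument as the resource path if given, else the tagger model.
        String path = args.length > 0 ? args[0] : TAGGER_MODEL;
        if (isOnClasspath(path)) {
            System.out.println("Found on classpath: " + path);
        } else {
            System.out.println("NOT on classpath: " + path);
            System.out.println("Classpath in use: " + System.getProperty("java.class.path"));
        }
    }
}
```

Compile it, then run it with the same `-cp` value you pass to `MachineReading`, plus the directory containing `ModelCheck.class`. Assuming the models jar in your download is named stanford-corenlp-3.5.2-models.jar (check the actual file name in your folder), the classpath would look something like `"stanford-corenlp-3.5.2.jar;stanford-corenlp-3.5.2-models.jar;."` on Windows. If the check still reports the model as missing, the models jar is not on the classpath.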