I am trying to train a custom relation in Stanford CoreNLP using the birthplace model.
I have gone through this documentation, which explains how to create a properties file (similar to roth.properties) like the one below:
#Below are some basic options. See edu.stanford.nlp.ie.machinereading.MachineReadingProperties class for more options.
# Pipeline options
annotators = pos, lemma, parse
parse.maxlen = 100
# MachineReading properties. You need one class to read the dataset into correct format. See edu.stanford.nlp.ie.machinereading.domains.ace.AceReader for another example.
datasetReaderClass = edu.stanford.nlp.ie.machinereading.domains.roth.RothCONLL04Reader
#Data directory for training. The datasetReaderClass reads data from this path and makes corresponding sentences and annotations.
trainPath = "D:\\stanford-corenlp-full-2017-06-09\\birthplace.corp"
#Whether to crossValidate, that is evaluate, or just train.
crossValidate = false
kfold = 10
#Change this to true if you want to use CoreNLP pipeline generated NER tags. The default model generated with the relation extractor release uses the CoreNLP pipeline provided tags (option set to true).
trainUsePipelineNER=false
# where to save training sentences. uses the file if it exists, otherwise creates it.
serializedTrainingSentencesPath = "D:\\stanford-corenlp-full-2017-06-09\\rel\\sentences.ser"
serializedEntityExtractorPath = "D:\\stanford-corenlp-full-2017-06-09\\rel\\entity_model.ser"
# where to store the output of the extractor (sentence objects with relations generated by the model). This is what you will use as the model when using 'relation' annotator in the CoreNLP pipeline.
serializedRelationExtractorPath = "D:\\stanford-corenlp-full-2017-06-09\\rel\\roth_relation_model_pipeline.ser"
# uncomment to load a serialized model instead of retraining
# loadModel = true
#relationResultsPrinters = edu.stanford.nlp.ie.machinereading.RelationExtractorResultsPrinter,edu.stanford.nlp.ie.machinereading.domains.roth.RothResultsByRelation. For printing output of the model.
relationResultsPrinters = edu.stanford.nlp.ie.machinereading.RelationExtractorResultsPrinter
#In this domain, this is trivial since all the entities are given (or set using CoreNLP NER tagger).
entityClassifier = edu.stanford.nlp.ie.machinereading.domains.roth.RothEntityExtractor
extractRelations = true
extractEvents = false
#We are setting the entities beforehand so the model does not learn how to extract entities etc.
extractEntities = false
#Opposite of crossValidate.
trainOnly=true
# The set chosen by feature selection using RothCONLL04:
relationFeatures = arg_words,arg_type,dependency_path_lowlevel,dependency_path_words,surface_path_POS,entities_between_args,full_tree_path
# The above features plus the features used in Bjorne BioNLP09:
# relationFeatures = arg_words,arg_type,dependency_path_lowlevel,dependency_path_words,surface_path_POS,entities_between_args,full_tree_path,dependency_path_POS_unigrams,dependency_path_word_n_grams,dependency_path_POS_n_grams,dependency_path_edge_lowlevel_n_grams,dependency_path_edge-node-edge-grams_lowlevel,dependency_path_node-edge-node-grams_lowlevel,dependency_path_directed_bigrams,dependency_path_edge_unigrams,same_head,entity_counts
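For context, the comment above says the serialized relation extractor is what the 'relation' annotator loads in a normal CoreNLP pipeline. Below is a minimal sketch of how I expect to consume the trained model afterwards, assuming the IKVM-based Stanford.NLP.CoreNLP binding and assuming that "sup.relation.model" is the property the relation annotator reads (both of those are assumptions on my part):

using java.util;
using edu.stanford.nlp.pipeline;

namespace StanfordRelationDemo
{
    class RelationUsageSketch
    {
        static void Demo()
        {
            var props = new Properties();
            // The relation annotator needs tags, NER and parses, so run the full pipeline.
            props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, relation");
            // Assumption: "sup.relation.model" points the relation annotator at the
            // serializedRelationExtractorPath produced by the training step above.
            props.setProperty("sup.relation.model",
                @"D:\stanford-corenlp-full-2017-06-09\rel\roth_relation_model_pipeline.ser");
            var pipeline = new StanfordCoreNLP(props);
            var annotation = new Annotation("John Smith was born in Boston.");
            pipeline.annotate(annotation);
        }
    }
}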
I am executing this command in the directory D:\stanford-corenlp-full-2017-06-09:
D:\stanford-corenlp-full-2017-06-09\stanford-corenlp-3.8.0\edu\stanford\nlp>java -cp classpath edu.stanford.nlp.ie.machinereading.MachineReading --arguments roth.properties
I am getting this error:
Error: Could not find or load main class edu.stanford.nlp.ie.machinereading.MachineReading
Caused by: java.lang.ClassNotFoundException: edu.stanford.nlp.ie.machinereading.MachineReading
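(For reference, my understanding is that a concrete invocation would look roughly like the line below, run from the distribution root so the wildcard picks up every jar shipped there; whether the MachineReading class is actually inside those jars is exactly what I am unsure about.)

D:\stanford-corenlp-full-2017-06-09>java -cp "*" edu.stanford.nlp.ie.machinereading.MachineReading --arguments roth.properties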
I have also tried to train the custom relation model programmatically, using the following C# code:
using java.util;
using System.Collections.Generic;

namespace StanfordRelationDemo
{
    class Program
    {
        static void Main(string[] args)
        {
            string jarRoot = @"D:\Stanford English Model\stanford-english-corenlp-2018-10-05-models\";
            string modelsDirectory = jarRoot + @"edu\stanford\nlp\models";
            string sutimeRules = modelsDirectory + @"\sutime\defs.sutime.txt,"
                //+ modelsDirectory + @"\sutime\english.holidays.sutime.txt,"
                + modelsDirectory + @"\sutime\english.sutime.txt";

            Properties props = new Properties();
            props.setProperty("annotators", "pos, lemma, parse");
            props.setProperty("parse.maxlen", "100");
            props.setProperty("datasetReaderClass", "edu.stanford.nlp.ie.machinereading.domains.roth.RothCONLL04Reader");
            props.setProperty("trainPath", "D://Stanford English Model//stanford-english-corenlp-2018-10-05-models//edu//stanford//nlp//models//birthplace.corp");
            props.setProperty("crossValidate", "false");
            props.setProperty("kfold", "10");
            props.setProperty("trainOnly", "true");
            props.setProperty("trainUsePipelineNER", "true");
            props.setProperty("serializedTrainingSentencesPath", "D://Stanford English Model//stanford-english-corenlp-2018-10-05-models//edu//stanford//nlp//models//rel//sentences.ser");
            props.setProperty("serializedEntityExtractorPath", "D://Stanford English Model//stanford-english-corenlp-2018-10-05-models//edu//stanford//nlp//models//rel//entity_model.ser");
            props.setProperty("serializedRelationExtractorPath", "D://Stanford English Model//stanford-english-corenlp-2018-10-05-models//edu//stanford//nlp//models//rel//roth_relation_model_pipeline.ser");
            props.setProperty("relationResultsPrinters", "edu.stanford.nlp.ie.machinereading.RelationExtractorResultsPrinter");
            props.setProperty("entityClassifier", "edu.stanford.nlp.ie.machinereading.domains.roth.RothEntityExtractor");
            props.setProperty("extractRelations", "true");
            props.setProperty("extractEvents", "false");
            props.setProperty("extractEntities", "false");
            props.setProperty("trainOnly", "true");
            props.setProperty("relationFeatures", "arg_words,arg_type,dependency_path_lowlevel,dependency_path_words,surface_path_POS,entities_between_args,full_tree_path");

            var propertyKeys = props.keys();
            var propertyStringArray = new List<string>();
            while (propertyKeys.hasMoreElements())
            {
                var key = propertyKeys.nextElement();
                propertyStringArray.Add($"-{key}");
                propertyStringArray.Add(props.getProperty(key.ToString(), string.Empty));
            }

            var machineReader = edu.stanford.nlp.ie.machinereading.MachineReading.makeMachineReading(propertyStringArray.ToArray());
            var utestResultList = machineReader.run();
        }
    }
}
I am getting this exception:
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.

Unhandled Exception: edu.stanford.nlp.io.RuntimeIOException: Error while loading a tagger model (probably missing model file) ---> java.io.IOException: Unable to open "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger" as class path, filename or URL
   at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(String textFileOrUrl)
   at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(Properties config, String modelFileOrUrl, Boolean printLoading)
   --- End of inner exception stack trace ---
   at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(Properties config, String modelFileOrUrl, Boolean printLoading)
   at edu.stanford.nlp.tagger.maxent.MaxentTagger..ctor(String modelFile, Properties config, Boolean printLoading)
   at edu.stanford.nlp.tagger.maxent.MaxentTagger..ctor(String modelFile)
   at edu.stanford.nlp.pipeline.POSTaggerAnnotator.loadModel(String, Boolean)
   at edu.stanford.nlp.pipeline.POSTaggerAnnotator..ctor(String annotatorName, Properties props)
   at edu.stanford.nlp.pipeline.AnnotatorImplementations.posTagger(Properties properties)
   at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$getNamedAnnotators$42(Properties, AnnotatorImplementations)
   at edu.stanford.nlp.pipeline.StanfordCoreNLP.<>Anon4.apply(Object, Object)
   at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$getDefaultAnnotatorPool$65(Entry, Properties, AnnotatorImplementations)
   at edu.stanford.nlp.pipeline.StanfordCoreNLP.<>Anon27.get()
   at edu.stanford.nlp.util.Lazy.3.compute()
   at edu.stanford.nlp.util.Lazy.get()
   at edu.stanford.nlp.pipeline.AnnotatorPool.get(String name)
   at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(Properties, Boolean, AnnotatorImplementations, AnnotatorPool)
   at edu.stanford.nlp.pipeline.StanfordCoreNLP..ctor(Properties props, Boolean enforceRequirements, AnnotatorPool annotatorPool)
   at edu.stanford.nlp.pipeline.StanfordCoreNLP..ctor(Properties props, Boolean enforceRequirements)
   at edu.stanford.nlp.ie.machinereading.MachineReading.makeMachineReading(String[] args)
   at StanfordRelationDemo.Program.Main(String[] args) in C:\Users\m1039332\Documents\Visual Studio 2017\Projects\StanfordRelationDemo\StanfordRelationDemo\Program.cs:line 46
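Looking at my snippet again, jarRoot and modelsDirectory are declared but never used. In the IKVM-based samples I have seen, the working directory is pointed at the extracted models before the pipeline is built, so that relative paths such as edu/stanford/nlp/models/... resolve from the file system. The following is only a sketch of what I assume is missing (the fix itself is an assumption on my part):

using System.IO;

namespace StanfordRelationDemo
{
    static class ModelPathFix
    {
        public static void PointAtExtractedModels()
        {
            string jarRoot = @"D:\Stanford English Model\stanford-english-corenlp-2018-10-05-models\";
            // Assumption: with the working directory set here, CoreNLP can load
            // "edu/stanford/nlp/models/pos-tagger/..." from disk instead of the classpath.
            Directory.SetCurrentDirectory(jarRoot);
        }
    }
}

Calling something like ModelPathFix.PointAtExtractedModels() at the top of Main, before makeMachineReading, is what I would try based on those samples.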
So I am simply not able to train a custom relation with CoreNLP. I would appreciate it if someone could point out any obvious mistake I am making.
Answer (score: 0):
I don't think the machine reading code is shipped with the standard distribution.
You should build a jar from the full GitHub source:
https://github.com/stanfordnlp/CoreNLP/tree/master/src/edu/stanford/nlp/ie/machinereading
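Something along these lines should get you the source to build from; the exact build steps below are only a sketch from memory, so check the build instructions in the repo itself:

git clone https://github.com/stanfordnlp/CoreNLP.git
cd CoreNLP
ant
cd classes
jar -cf ../stanford-corenlp.jar edu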