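Can someone guide me on how to create a custom JAPE file and configure it using the GATE source code? I tried the following code and get exceptions such as "Error while parsing the grammar:" and "Neither grammarURL or binaryGrammarURL parameters are set!"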
try{
Document doc = new DocumentImpl();
String str = "This is test.";
DocumentContentImpl impl = new DocumentContentImpl(str);
doc.setContent(impl);
System.setProperty("gate.home", "C:\\Program Files\\GATE_Developer_7.1");
Gate.init();
gate.Corpus corpus = (Corpus) Factory
.createResource("gate.corpora.CorpusImpl");
File gateHome = Gate.getGateHome();
File pluginsHome = new File(gateHome, "plugins");
Gate.getCreoleRegister().registerDirectories(new File(pluginsHome, "ANNIE").toURI().toURL());
Transducer transducer = new Transducer();
transducer.setDocument(doc);
transducer.setGrammarURL(new URL("file:///D:/misc_workspace/gate-7.1-build4485-SRC/plugins/ANNIE/resources/NE/SportsCategory.jape"));
transducer.setBinaryGrammarURL(new URL("file:///D:/misc_workspace/gate-7.1-build4485-SRC/plugins/ANNIE/resources/NE/SportsCategory.jape"));
LanguageAnalyser jape = (LanguageAnalyser)Factory.createResource(
"gate.creole.Transducer", gate.Utils.featureMap(
"grammarURL", "D:/misc_workspace/gate-7.1-build4485-SRC/plugins/ANNIE/resources/NE/SportsCategory.jape",
"encoding", "UTF-8"));
Answer 0: (score: 3)
You need to load the ANNIE plugin:
Gate.getCreoleRegister().registerDirectories(
new File(Gate.getPluginsHome(), "ANNIE").toURI().toURL());
and then create an instance of gate.creole.Transducer with the right parameters:
LanguageAnalyser jape = (LanguageAnalyser)Factory.createResource(
"gate.creole.Transducer", gate.Utils.featureMap(
"grammarURL", new URL("file:///D:/path/to/my-grammar.jape"),
"encoding", "UTF-8")); // ensure this matches the file
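That said, the approach we usually recommend is to assemble and configure the whole pipeline in GATE Developer the way you want it, using whatever standard components you need alongside your own grammars, and then save the application state to a file. You can then reload the whole application from your code with one line: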
CorpusController app = (CorpusController) PersistenceManager.loadObjectFromFile(savedAppFile);
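A note on the grammarURL value used in the Factory.createResource call above: it has to be something java.net.URL can parse. The sketch below uses only the JDK (no GATE classes) to show why a bare Windows path like the D:/... string in the question is not a URL, and how File.toURI().toURL() produces one. The class name, method names and the relative path are made up for illustration.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

class GrammarUrlSketch {
    // Returns true if java.net.URL accepts the string as a URL.
    static boolean parsesAsUrl(String spec) {
        try {
            new URL(spec);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    // Converts a local file to a file: URL; toURI() percent-encodes spaces.
    static URL toFileUrl(File file) {
        try {
            return file.toURI().toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Cannot convert to URL: " + file, e);
        }
    }

    public static void main(String[] args) {
        // In a bare Windows path the drive letter is read as the URL scheme.
        System.out.println(parsesAsUrl("D:/path/to/my-grammar.jape"));
        // With an explicit file: scheme the string is accepted.
        System.out.println(parsesAsUrl("file:///D:/path/to/my-grammar.jape"));
        // Hypothetical relative path, resolved against the working directory.
        System.out.println(toFileUrl(new File("my grammars/my-grammar.jape")));
    }
}
```

Going through File also avoids hand-escaping spaces in directories such as "Program Files".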
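Edit: The code you added to the question has several basic problems. First, you must call Gate.init() before doing anything else with GATE, including creating a Document. Second, you must never call the constructor of a Resource class directly; always use Factory. Likewise, you never need to call init() yourself, because Factory.createResource does that for you. For example: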
// initialise GATE
Gate.setGateHome(new File("C:\\Program Files\\GATE_Developer_7.1"));
Gate.init();
// load ANNIE plugin - you must do this before you can create tokeniser
// or JAPE transducer resources.
Gate.getCreoleRegister().registerDirectories(
new File(Gate.getPluginsHome(), "ANNIE").toURI().toURL());
// Build the pipeline
SerialAnalyserController pipeline =
(SerialAnalyserController)Factory.createResource(
"gate.creole.SerialAnalyserController");
LanguageAnalyser tokeniser = (LanguageAnalyser)Factory.createResource(
"gate.creole.tokeniser.DefaultTokeniser");
LanguageAnalyser jape = (LanguageAnalyser)Factory.createResource(
"gate.creole.Transducer", gate.Utils.featureMap(
"grammarURL", new File("D:\\path\\to\\my-grammar.jape").toURI().toURL(),
"encoding", "UTF-8")); // ensure this matches the file
pipeline.add(tokeniser);
pipeline.add(jape);
// create document and corpus
Corpus corpus = Factory.newCorpus(null);
Document doc = Factory.newDocument("This is test.");
corpus.add(doc);
pipeline.setCorpus(corpus);
// run it
pipeline.execute();
// extract results
System.out.println("Found annotations of the following types: " +
doc.getAnnotations().getAllTypes());
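The "ensure this matches the file" comment on the encoding parameter above can be checked before handing the grammar to GATE. The following is a minimal JDK-only sketch, not part of GATE, that tests whether a byte array decodes cleanly in a given charset. The class and method names are hypothetical, and the sample bytes stand in for the contents of a grammar file.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class EncodingCheckSketch {
    // Returns true if the bytes decode without errors in the given charset.
    static boolean decodesAs(byte[] bytes, Charset charset) {
        try {
            charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The same text encoded two ways; \u00e9 is e with an acute accent.
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(decodesAs(utf8, StandardCharsets.UTF_8));
        System.out.println(decodesAs(latin1, StandardCharsets.UTF_8));
    }
}
```

For a real grammar you would pass the result of Files.readAllBytes on the .jape file. Note that this only catches invalid byte sequences: single-byte charsets such as ISO-8859-1 accept any input, so a clean decode there proves nothing.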
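If you haven't already, I strongly recommend that you work through at least module 5 of the training course materials, which shows the correct way to load documents and run processing resources over them.
Answer 1: (score: 1)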
// initialise GATE, then load the ANNIE plugin before creating any of its resources
System.setProperty("gate.home", "C:\\Program Files\\GATE_Developer_7.1");
Gate.init();
File gateHome = Gate.getGateHome();
File pluginsHome = new File(gateHome, "plugins");
Gate.getCreoleRegister().registerDirectories(new File(pluginsHome, "ANNIE").toURI().toURL());
// create the tokeniser and the JAPE transducer through the Factory
ProcessingResource token = (ProcessingResource) Factory.createResource("gate.creole.tokeniser.DefaultTokeniser", Factory.newFeatureMap());
LanguageAnalyser jape = (LanguageAnalyser)Factory.createResource(
"gate.creole.Transducer", gate.Utils.featureMap(
"grammarURL", new URL("file:///D:/misc_workspace/gate-7.1-build4485-SRC/plugins/ANNIE/resources/NE/SportsCategory.jape"),
"encoding", "UTF-8"));
// create the document and corpus
String str = "This is a test. Myself Abhijit Nag sport";
Document doc = Factory.newDocument(str);
gate.Corpus corpus = (Corpus) Factory.createResource("gate.corpora.CorpusImpl");
corpus.add(doc);
// build the pipeline; the controller hands each document to its processing resources
SerialAnalyserController pipeline = (SerialAnalyserController) Factory.createResource("gate.creole.SerialAnalyserController",
Factory.newFeatureMap(), Factory.newFeatureMap(), "ANNIE");
pipeline.setCorpus(corpus);
pipeline.add(token);
pipeline.add(jape); // already initialised by Factory.createResource, so no init() call
pipeline.execute();
// extract results
AnnotationSet ann = doc.getAnnotations();
System.out.println(" ...Total annotation " + ann.getAllTypes());
Answer 2: (score: 0)
This is another option if you want to extend the ANNIE pipeline.
Sample code:
File pluginsHome = Gate.getPluginsHome();
File anniePlugin = new File(pluginsHome, "ANNIE");
File annieGapp = new File(anniePlugin, "ANNIE_with_defaults.gapp");
CorpusController annieController = (CorpusController) PersistenceManager.loadObjectFromFile(annieGapp);
LanguageAnalyser jape = (LanguageAnalyser)Factory.createResource(
"gate.creole.Transducer", gate.Utils.featureMap(
"grammarURL", new File("C:\\Program Files\\gate-7.1\\plugins\\ANNIE\\resources\\NE\\opensource.jape").toURI().toURL(),
"encoding", "UTF-8"));
Collection<ProcessingResource> newPRS = new ArrayList<ProcessingResource>();
Collection<ProcessingResource> prs = annieController.getPRs();
for(ProcessingResource resource: prs){
newPRS.add(resource);
}
newPRS.add(jape); // Factory.createResource has already called init()
annieController.setPRs(newPRS);