I am trying to build my data processing pipeline with uimaFIT, as follows:
[annotatorA]
=> [Consumer to dump annotatorA's annotations from CAS into DB]
[annotatorB (should take annotatorA's annotations from the DB as input)]
=> [Consumer for annotatorB]
Driver code:
/* Step 0: Create a reader */
CollectionReader readerInstance= CollectionReaderFactory.createCollectionReader(
FilePathReader.class, typeSystem,
FilePathReader.PARAM_INPUT_FILE,"/path/to/file/to/be/processed");
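/* Aggregate builder that collects the engines in pipeline order */
AggregateBuilder builder = new AggregateBuilder();
/* Step 1: Define annotator A */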
AnalysisEngineDescription annotatorAInstance=
AnalysisEngineFactory.createPrimitiveDescription(
annotatorADbConsumer.class, typeSystem,
annotatorADbConsumer.PARAM_DB_URL,"localhost",
annotatorADbConsumer.PARAM_DB_NAME,"xyz",
annotatorADbConsumer.PARAM_DB_USER_NAME,"name",
annotatorADbConsumer.PARAM_DB_USER_PWD,"pw");
builder.add(annotatorAInstance);
/* Step2: Define binding for annotatorB to take
what-annotator-a put in DB above as input */
/*Step 3: Define annotator B */
AnalysisEngineDescription annotatorBInstance =
AnalysisEngineFactory.createPrimitiveDescription(
GateDateTimeLengthAnnotator.class, typeSystem);
builder.add(annotatorBInstance);
/*Step 4: Run the pipeline*/
SimplePipeline.runPipeline(readerInstance, builder.createAggregate());
My question is:
Is the approach suggested at https://code.google.com/p/uimafit/wiki/ExternalResources#Resource_injection the right direction for implementing this?
Answer 0: (score: 1)
You can use @TypeCapability to declare the dependency, like this:
@TypeCapability(inputs = { "com.myproject.types.MyType", ... }, outputs = { ... })
public class MyAnnotator extends JCasAnnotator_ImplBase {
....
}
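Note that this defines a contract at the annotation level, not at the engine level (meaning any engine may create com.myproject.types.MyType).
I don't think there is a way to enforce it.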
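I did write some code that checks whether an engine upstream in the pipeline provides the required annotations, and logs an error otherwise (see Pipeline.checkAndAddCapabilities() and Pipeline.addCapabilities()). Note, however, that this only works if all engines declare TypeCapabilities, which is usually not the case when external engines or libraries are used.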
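To illustrate the idea behind such an upstream check, here is a minimal sketch in plain Java. It is not the Pipeline.checkAndAddCapabilities() code mentioned above and it does not use uimaFIT at all: the CapabilityCheck and Engine classes are hypothetical stand-ins, and the input and output lists are assumed to have been copied from each engine's @TypeCapability declaration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CapabilityCheck {

    /** Hypothetical stand-in for the inputs/outputs an engine declares via @TypeCapability. */
    static class Engine {
        final String name;
        final List<String> inputs;
        final List<String> outputs;

        Engine(String name, List<String> inputs, List<String> outputs) {
            this.name = name;
            this.inputs = inputs;
            this.outputs = outputs;
        }
    }

    /**
     * Walks the engines in pipeline order and reports every declared input
     * type that no earlier engine has declared as an output.
     */
    static List<String> findMissingInputs(List<Engine> pipeline) {
        Set<String> available = new HashSet<>();
        List<String> problems = new ArrayList<>();
        for (Engine engine : pipeline) {
            for (String input : engine.inputs) {
                if (!available.contains(input)) {
                    problems.add(engine.name + " requires " + input);
                }
            }
            // Outputs only become available to engines further downstream.
            available.addAll(engine.outputs);
        }
        return problems;
    }

    public static void main(String[] args) {
        String myType = "com.myproject.types.MyType";
        List<Engine> pipeline = Arrays.asList(
                new Engine("annotatorA", Collections.<String>emptyList(), Arrays.asList(myType)),
                new Engine("annotatorB", Arrays.asList(myType), Collections.<String>emptyList()));

        // Log each unmet dependency; nothing is enforced, matching the caveat above.
        for (String problem : findMissingInputs(pipeline)) {
            System.err.println("Missing upstream capability: " + problem);
        }
    }
}
```

Because the check works only on declared type names, it shares the limitation described above: an engine that declares no capabilities contributes nothing to the set of available types, so engines downstream of it may be reported as missing inputs even when the annotations are actually produced.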