Question

I am trying to build my data processing pipeline with uimaFIT, as follows:

[annotatorA] => [Consumer to dump annotatorA's annotations from CAS into DB]

[annotatorB (should take on annotatorA's annotations from DB as input)] => [Consumer for annotatorB]

Driver code:

    /* Step 0: Create a reader */
    CollectionReader readerInstance = CollectionReaderFactory.createCollectionReader(
            FilePathReader.class, typeSystem,
            FilePathReader.PARAM_INPUT_FILE, "/path/to/file/to/be/processed");

    /* Collects the engine descriptions in pipeline order (assumed declaration) */
    AggregateBuilder builder = new AggregateBuilder();

    /* Step 1: Define annotator A */
    AnalysisEngineDescription annotatorAInstance =
            AnalysisEngineFactory.createPrimitiveDescription(
                    annotatorADbConsumer.class, typeSystem,
                    annotatorADbConsumer.PARAM_DB_URL, "localhost",
                    annotatorADbConsumer.PARAM_DB_NAME, "xyz",
                    annotatorADbConsumer.PARAM_DB_USER_NAME, "name",
                    annotatorADbConsumer.PARAM_DB_USER_PWD, "pw");
    builder.add(annotatorAInstance);

    /* Step 2: Define binding for annotatorB to take
         what-annotator-a put in DB above as input */

    /* Step 3: Define annotator B */
    AnalysisEngineDescription annotatorBInstance =
            AnalysisEngineFactory.createPrimitiveDescription(
                    GateDateTimeLengthAnnotator.class, typeSystem);
    builder.add(annotatorBInstance);

    /* Step 4: Run the pipeline */
    SimplePipeline.runPipeline(readerInstance, builder.createAggregate());

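My questions are:

Is the above approach correct?
How do we define annotatorB's dependency on annotatorA's output in step 2?

Is the approach suggested at https://code.google.com/p/uimafit/wiki/ExternalResources#Resource_injection the right direction for achieving this?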

Answer 1

You can define the dependency with @TypeCapability, like this:

@TypeCapability(inputs = { "com.myproject.types.MyType", ... }, outputs = { ... })
public class MyAnnotator extends JCasAnnotator_ImplBase {
    ....
}

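Note that this defines a contract at the annotation level, not at the engine level (meaning that any engine may create com.myproject.types.MyType).

I don't think there is a way to enforce it.

I did write some code that checks whether an engine upstream in the pipeline provides the required annotations, and logs an error otherwise (see Pipeline.checkAndAddCapabilities() and Pipeline.addCapabilities()). Note, however, that this only works if all engines define TypeCapabilities, which is often not the case when using external engines or libraries.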
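
To illustrate the kind of upstream check described above, here is a minimal sketch in plain Java. It is not the code behind Pipeline.checkAndAddCapabilities(); the CapabilityCheck class, the Engine holder, and findMissingInputs are hypothetical names, and the input/output sets simply stand in for whatever each engine declares through @TypeCapability.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CapabilityCheck {

    /** Hypothetical stand-in for an engine and its declared input/output types. */
    public static final class Engine {
        final String name;
        final Set<String> inputs;
        final Set<String> outputs;

        public Engine(String name, Set<String> inputs, Set<String> outputs) {
            this.name = name;
            this.inputs = inputs;
            this.outputs = outputs;
        }
    }

    /**
     * Walks the pipeline in order and returns one message for every input type
     * that no earlier engine declares as an output.
     */
    public static List<String> findMissingInputs(List<Engine> pipeline) {
        Set<String> available = new HashSet<>();
        List<String> problems = new ArrayList<>();
        for (Engine engine : pipeline) {
            for (String input : engine.inputs) {
                if (!available.contains(input)) {
                    problems.add(engine.name + " requires " + input
                            + " but no upstream engine provides it");
                }
            }
            // Outputs only become available to engines further downstream.
            available.addAll(engine.outputs);
        }
        return problems;
    }

    public static void main(String[] args) {
        List<Engine> pipeline = List.of(
                new Engine("annotatorA", Set.of(), Set.of("com.myproject.types.MyType")),
                new Engine("annotatorB", Set.of("com.myproject.types.MyType"), Set.of()));
        // Report problems the same way the answer describes: as error log lines.
        for (String problem : findMissingInputs(pipeline)) {
            System.err.println(problem);
        }
    }
}
```

As the answer points out, a check like this is only as good as the declarations it reads: an engine that declares no capabilities looks as if it produces nothing, so downstream engines would be flagged even when the annotations are actually there.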

How do I define a CAS stored in a database as an external resource for an annotator in uimaFIT?
