我写了一个注释器,它提取所有CD标记的标记,代码看起来像这样
public class WeightAnnotator extends JCasAnnotator_ImplBase {
private Logger logger = Logger.getLogger(getClass().getName());
public static List<Token> weightTokens = new ArrayList<Token>();
public static final String PARAM_STRING = "stringParam";
@ConfigurationParameter(name = PARAM_STRING)
private String stringParam;
@Override
public void initialize(UimaContext context) throws ResourceInitializationException {
super.initialize(context);
}
@Override
public void process(JCas jCas) throws AnalysisEngineProcessException {
logger.info("Starting processing.");
for (Sentence sentence : JCasUtil.select(jCas, Sentence.class)) {
List<Token> tokens = JCasUtil.selectCovered(jCas, Token.class, sentence);
for (Token token : tokens) {
int begin = token.getBegin();
int end = token.getEnd();
if (token.getPos().equals( PARAM_STRING)) {
WeightAnnotation ann = new WeightAnnotation(jCas, begin, end);
ann.addToIndexes();
System.out.println("Token: " + token.getCoveredText());
}
}
}
}
}
但是当我尝试在管道中创建迭代器时,迭代器返回null。这是我的管道的样子。
AnalysisEngineDescription weightDescriptor = AnalysisEngineFactory.createEngineDescription(WeightAnnotator.class,
WeightAnnotator.PARAM_STRING, "CD"
);
AggregateBuilder builder = new AggregateBuilder();
builder.add(sentenceDetectorDescription);
builder.add(tokenDescriptor);
builder.add(posDescriptor);
builder.add(weightDescriptor);
builder.add(writer);
for (JCas jcas : SimplePipeline.iteratePipeline(reader, builder.createAggregateDescription())) {
Iterator iterator1 = JCasUtil.iterator(jcas, WeightAnnotation.class);
while (iterator1.hasNext()) {
WeightAnnotation weights = (WeightAnnotation) iterator1.next();
logger.info("Token: " + weights.getCoveredText());
}
}
我使用JCasGen生成了WeightAnnotator和WeightAnnotator_Type。我调试了整个代码,但我不明白我在哪里弄错了。关于如何改进这一点的任何想法都值得赞赏。