Java named entity recognition: return first name and surname?

Time: 2014-03-26 13:09:41

Tags: java opennlp

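I am new to named entity recognition, so I found a tutorial on it. I am using OpenNLP's NER. The tutorial code recognizes both full names (first name and surname) and single names. Is it possible to make it return only the full names, and if so, how?

This is the code I have been working with so far: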

import java.io.FileInputStream;
import opennlp.tools.namefind.NameFinderME;
import opennlp.tools.namefind.TokenNameFinderModel;
import opennlp.tools.tokenize.SimpleTokenizer;
import opennlp.tools.tokenize.Tokenizer;
import opennlp.tools.util.Span;

String[] sentences = {
        "Former first lady Nancy Reagan was taken to a " +
                "suburban Los Angeles " +
                "hospital as a precaution Sunday after a " +
                "fall at her home, an " +
                "aide said. ",
        "The 86-year-old Reagan will remain overnight for " +
                "observation at a hospital in Santa Monica, California, " +
                "said Joanne " +
                "Drake, chief of staff for the Reagan Foundation."};

// Load the pre-trained person-name model.
NameFinderME finder = new NameFinderME(
        new TokenNameFinderModel(new FileInputStream("utilities/en-ner-person.bin"))
);
Tokenizer tokenizer = SimpleTokenizer.INSTANCE;

for (int si = 0; si < sentences.length; si++) {
    // Token spans hold character offsets into the sentence.
    Span[] tokenSpans = tokenizer.tokenizePos(sentences[si]);
    String[] tokens = Span.spansToStrings(tokenSpans, sentences[si]);
    // Name spans hold token indices; the end index is exclusive.
    Span[] names = finder.find(tokens);
    for (int ni = 0; ni < names.length; ni++) {
        // Map the first and last token of the name back to character offsets.
        Span startSpan = tokenSpans[names[ni].getStart()];
        int nameStart = startSpan.getStart();
        Span endSpan = tokenSpans[names[ni].getEnd() - 1];
        int nameEnd = endSpan.getEnd();
        String name = sentences[si].substring(nameStart, nameEnd);
        System.out.println(name);
    }
}

The result I get is: Nancy Reagan, Reagan, Joanne Drake, Reagan

So how can I make it return only Nancy Reagan and Joanne Drake? Is that possible?
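One possible direction, offered as a sketch rather than a confirmed answer: the name spans returned by the finder are expressed in token indices, so a span that covers only one token ("Reagan") can be told apart from one that covers two or more ("Nancy Reagan"). The example below filters on that token count. `FullNameFilter`, `TokenSpan`, and `fullNames` are hypothetical names made up for illustration; `TokenSpan` is a stand-in for `opennlp.tools.util.Span` so the snippet runs without the OpenNLP jar or model file. It also assumes that "full name" simply means "at least two tokens", which will misfire on single-token names that happen to be complete and on multi-token spans that are not first name plus surname.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FullNameFilter {

    /** Hypothetical stand-in for opennlp.tools.util.Span: token indices, end exclusive. */
    static final class TokenSpan {
        final int start;
        final int end;

        TokenSpan(int start, int end) {
            this.start = start;
            this.end = end;
        }

        /** Number of tokens covered by this span. */
        int length() {
            return end - start;
        }
    }

    /** Keeps only spans that cover at least minTokens tokens and joins their tokens with spaces. */
    static List<String> fullNames(String[] tokens, List<TokenSpan> spans, int minTokens) {
        List<String> result = new ArrayList<>();
        for (TokenSpan span : spans) {
            if (span.length() < minTokens) {
                continue; // single-token hit such as "Reagan": skip it
            }
            String[] nameTokens = Arrays.copyOfRange(tokens, span.start, span.end);
            result.add(String.join(" ", nameTokens));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hand-written tokens and spans, imitating what the finder might return.
        String[] tokens = {"Reagan", "will", "remain", ",", "said", "Joanne", "Drake"};
        List<TokenSpan> spans = Arrays.asList(new TokenSpan(0, 1), new TokenSpan(5, 7));
        System.out.println(fullNames(tokens, spans, 2)); // prints [Joanne Drake]
    }
}
```

In the original loop the same idea would amount to skipping any `names[ni]` whose `getEnd() - getStart()` is less than 2 before building the substring.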

0 answers:

No answers