Question

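I want to write my own model using OpenNLP MaxEnt, and for that I need to implement the ContextGenerator and EventStream interfaces (as described in the documentation). I looked at the implementations for the OpenNLP Chunker, POSTagger and NameFinder, but all of them use 'Pair', which is deprecated, and from reading the code alone I cannot tell what their respective ContextGenerators are doing. The model I want to create will classify each token as RoomNumber or not RoomNumber by looking at the token's POS tag. How should I start writing a ContextGenerator and an EventStream for this model? I know what a context is and what features are, but I don't know what a ContextGenerator does or what an EventStream does. I did look at the OpenNLP maxent page, but it did not help. Please help me understand this. Thanks.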

Answer 1

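The following code may help, although it does not use a ContextGenerator explicitly. In fact, BasicContextGenerator is used inside BasicEventStream, and all it does is split each input string into a list of features.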

For example, the string "a=1 b=2 c=1" is split into 3 features: "a=1", "b=2" and "c=1".
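To make the splitting concrete, here is a minimal sketch in plain Java. It is not OpenNLP's own code: the class name FeatureSplitSketch and the helper splitFeatures are hypothetical, and splitting on whitespace is an assumption about what "split into features" means here.

```java
import java.util.Arrays;

public class FeatureSplitSketch {

    // Hypothetical helper (not part of OpenNLP): treats every
    // whitespace-separated token of the input line as one feature.
    static String[] splitFeatures(String line) {
        return line.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String[] features = splitFeatures("a=1 b=2 c=1");
        System.out.println(Arrays.toString(features)); // prints [a=1, b=2, c=1]
    }
}
```

Each resulting string is an opaque feature name as far as the model is concerned; "a=1" and "a=0" are simply two different features.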

If you just want to use the Maxent API to train a model and then use it for classification, the following approach worked for me:

package opennlptest;

import java.io.IOException;
import java.util.Arrays;
import java.util.List;

import opennlp.maxent.GIS;
import opennlp.model.Event;
import opennlp.model.EventStream;
import opennlp.model.ListEventStream;
import opennlp.model.MaxentModel;

public class TestMaxentEvents {

    static Event createEvent(String outcome, String... context) {
        return new Event(outcome, context);
    }

    public static void main(String[] args) throws IOException {

        // here are the input training samples
        List<Event> samples =  Arrays.asList(new Event[] {
                //           outcome + context
                createEvent("c=1", "a=1", "b=1"),
                createEvent("c=1", "a=1", "b=0"),
                createEvent("c=0", "a=0", "b=1"),
                createEvent("c=0", "a=0", "b=0")
        });

        // training the model
        EventStream stream = new ListEventStream(samples);
        MaxentModel model = GIS.trainModel(stream);

        // using the trained model to predict the outcome given the context
        String[] context = {"a=1", "b=0"};
        double[] outcomeProbs = model.eval(context);
        String outcome = model.getBestOutcome(outcomeProbs);

        System.out.println(outcome); // output: c=1
    }

}
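For the RoomNumber task in the question, the context strings passed to createEvent above could be generated from the POS tags of each token and its neighbours. The following is only a sketch under assumptions: it uses plain Java and no OpenNLP classes, and the class name RoomNumberContextSketch, the helper contextFor, the feature names (pos=, prevpos=, nextpos=) and the BOS/EOS padding markers are all hypothetical choices, not something prescribed by the library.

```java
import java.util.ArrayList;
import java.util.List;

public class RoomNumberContextSketch {

    // Hypothetical helper: builds the context (an array of feature strings)
    // for the token at position i from its own POS tag and the POS tags of
    // the previous and next tokens. BOS/EOS mark the sentence boundaries.
    static String[] contextFor(String[] posTags, int i) {
        List<String> features = new ArrayList<>();
        features.add("pos=" + posTags[i]);
        features.add("prevpos=" + (i > 0 ? posTags[i - 1] : "BOS"));
        features.add("nextpos=" + (i < posTags.length - 1 ? posTags[i + 1] : "EOS"));
        return features.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // POS tags for a made-up sentence such as "Meet in room 101"
        String[] posTags = {"VB", "IN", "NN", "CD"};
        for (int i = 0; i < posTags.length; i++) {
            // Each printed line is the context for one token
            System.out.println(String.join(" ", contextFor(posTags, i)));
        }
    }
}
```

In training, each such context would be paired with an outcome label (for instance "RoomNumber" or "Other", again hypothetical names) in one Event per token. At prediction time the same helper would produce the array passed to model.eval.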

