Question

I am trying to use the Stanford tokenizer with the following example from their website:

import java.io.FileReader;
import java.io.IOException;
import java.util.List;

import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.ling.HasWord;
import edu.stanford.nlp.process.CoreLabelTokenFactory;
import edu.stanford.nlp.process.DocumentPreprocessor;
import edu.stanford.nlp.process.PTBTokenizer;

public class TokenizerDemo {

  public static void main(String[] args) throws IOException {
    for (String arg : args) {
      // option #1: By sentence.
      DocumentPreprocessor dp = new DocumentPreprocessor(arg);
      for (List sentence : dp) {
        System.out.println(sentence);
      }
      // option #2: By token
      PTBTokenizer ptbt = new PTBTokenizer(new FileReader(arg),
              new CoreLabelTokenFactory(), "");
      for (CoreLabel label; ptbt.hasNext(); ) {
        label = ptbt.next();
        System.out.println(label);
      }
    }
  }
}

When I try to compile it, I get the following error:

TokenizerDemo.java:24: error: incompatible types: Object cannot be converted to CoreLabel
        label = ptbt.next();

Does anyone know what the cause might be? In case it matters, I am using Java 1.8 and have made sure that the CLASSPATH contains the jar file.

Answer 1

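Try parameterizing the PTBTokenizer class. Declared as a raw type, its next() method returns Object, which cannot be assigned to a CoreLabel without a cast. For example: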

PTBTokenizer<CoreLabel> ptbt = new PTBTokenizer<>(new FileReader(arg),
          new CoreLabelTokenFactory(), "");
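
The same effect can be reproduced without the Stanford jar. The sketch below is only an illustration using java.util; the class RawTypeDemo and its two methods are hypothetical names, not part of Stanford NLP. It assumes PTBTokenizer behaves like any other generic iterator here, which is consistent with the compiler message in the question.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class RawTypeDemo {

  // Parameterized iterator: next() is typed as String, so no cast is needed.
  static String firstToken(List<String> tokens) {
    Iterator<String> it = tokens.iterator();
    return it.next();
  }

  // Raw iterator: the type argument is erased, so next() is typed as Object.
  @SuppressWarnings("rawtypes")
  static Object firstTokenRaw(List<String> tokens) {
    Iterator raw = tokens.iterator();
    // String s = raw.next();  // would not compile: Object cannot be converted to String
    return raw.next();
  }

  public static void main(String[] args) {
    List<String> tokens = Arrays.asList("Hello", "world");
    String typed = firstToken(tokens);      // usable as a String directly
    Object untyped = firstTokenRaw(tokens); // needs a cast before use as a String
    System.out.println(typed);
    System.out.println(untyped);
  }
}
```

Casting the result of next() to CoreLabel would also compile, but parameterizing the declaration keeps the type check at compile time and avoids the raw-type warning.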
