StanfordCoreNLPClient does not work as expected for sentiment analysis

Time: 2018-05-01 20:28:04

Tags: stanford-nlp

Stanford CoreNLP version 3.9.1

When doing sentiment analysis, StanfordCoreNLPClient does not behave the same way as StanfordCoreNLP.

import java.util.Properties;

import edu.stanford.nlp.neural.rnn.RNNCoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.CoreDocument;
import edu.stanford.nlp.pipeline.CoreSentence;
import edu.stanford.nlp.pipeline.StanfordCoreNLPClient;

public class Test {

  public static void main(String[] args) {
    String text = "This server doesn't work!";

    Properties props = new Properties();
    props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, sentiment");

    // If I uncomment this line and comment out the next one, it works
    //StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
    StanfordCoreNLPClient pipeline = new StanfordCoreNLPClient(props, "http://localhost", 9000, 2);

    Annotation annotation = new Annotation(text);
    pipeline.annotate(annotation);
    CoreDocument document = new CoreDocument(annotation);
    CoreSentence sentence = document.sentences().get(0);

    // Outputs null when using StanfordCoreNLPClient
    System.out.println(RNNCoreAnnotations.getPredictions(sentence.sentimentTree()));

    // Throws a NullPointerException when using StanfordCoreNLPClient
    // (presumably because it relies on the same method called above)
    System.out.println(RNNCoreAnnotations.getPredictionsAsStringList(sentence.sentimentTree()));
  }

}

Output when using StanfordCoreNLPClient pipeline = new StanfordCoreNLPClient(props, "http://localhost", 9000, 2):

 null
 Exception in thread "main" java.lang.NullPointerException
at edu.stanford.nlp.neural.rnn.RNNCoreAnnotations.getPredictionsAsStringList(RNNCoreAnnotations.java:68)
at tomkri.mastersentimentanalysis.preprocessing.Test.main(Test.java:35)

Output when using StanfordCoreNLP pipeline = new StanfordCoreNLP(props):

     Type = dense , numRows = 5 , numCols = 1
     0.127  
     0.599  
     0.221  
     0.038  
     0.015  

     [0.12680336652661395, 0.5988695516384742, 0.22125584263055106, 0.03843574738131668, 0.014635491823044227]

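Annotations other than sentiment work in both cases (at least the ones I have tried).

The server starts fine, and I can use it from my web browser. When I use it there, I also get the sentiment scores (for each subtree in the parse) in the JSON output.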

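1 Answer:

Answer 0: (score: 0)

My solution, in case anyone else needs it.

I tried getting the annotations I needed by sending an HTTP request to the server and reading the JSON response: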

HttpResponse<JsonNode> jsonResponse = Unirest.post("http://localhost:9000")
       .queryString("properties", "{\"annotators\":\"tokenize, ssplit, pos, lemma, ner, parse, sentiment\",\"outputFormat\":\"json\"}")
       .body(text)
       .asJson();

String sentTreeStr = jsonResponse.getBody().getObject().
                getJSONArray("sentences").getJSONObject(0).getString("sentimentTree");

System.out.println(sentTreeStr); //prints out sentiment values for tree and all sub trees.
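
For readers without Unirest, the request above amounts to a POST of the raw text to the server, with a URL-encoded "properties" query parameter. A minimal sketch of how that URL could be assembled with the standard library only; the class name CoreNlpRequestUrl and the helper buildUrl are hypothetical names introduced for illustration and are not part of CoreNLP or Unirest:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class CoreNlpRequestUrl {

    // Hypothetical helper: builds the server URL with the "properties"
    // JSON object URL-encoded as a query parameter.
    static String buildUrl(String host, int port, String annotators, String outputFormat) {
        String propsJson = "{\"annotators\":\"" + annotators + "\","
                + "\"outputFormat\":\"" + outputFormat + "\"}";
        try {
            return host + ":" + port + "/?properties=" + URLEncoder.encode(propsJson, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // Assumes a server listening on localhost:9000, as in the question.
        System.out.println(buildUrl("http://localhost", 9000,
                "tokenize, ssplit, pos, lemma, ner, parse, sentiment", "json"));
    }
}
```

The text to annotate would then go in the POST body, as in the Unirest call above.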

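However, not all annotation data is available in the JSON response. For example, you do not get the probability distribution over all possible sentiment values, only the probability of the most likely sentiment (the one with the highest probability).

If you need the full distribution, here is a solution: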

GenericAnnotationSerializer serializer = new GenericAnnotationSerializer();
try {
    // Ask the server for a Java-serialized Annotation instead of JSON
    HttpResponse<InputStream> inStream = Unirest.post("http://localhost:9000")
            .queryString(
                    "properties",
                    "{\"annotators\":\"tokenize, ssplit, pos, lemma, ner, parse, sentiment\","
                    + "\"outputFormat\":\"serialized\","
                    + "\"serializer\": \"edu.stanford.nlp.pipeline.GenericAnnotationSerializer\"}"
            )
            .body(text)
            .asBinary();

    try (ObjectInputStream in = new ObjectInputStream(inStream.getBody())) {
        Pair<Annotation, InputStream> deserialized = serializer.read(in);
        Annotation annotation = deserialized.first();

        // And now we are back to a state as if we were not running CoreNLP as a server.
        CoreDocument document = new CoreDocument(annotation);
        CoreSentence sentence = document.sentences().get(0);
        // Prints the same output as shown in the question
        System.out.println(
                RNNCoreAnnotations.getPredictions(sentence.sentimentTree()));
    }
} catch (UnirestException | IOException | ClassNotFoundException ex) {
    Logger.getLogger(SentimentTargetExtractor.class.getName()).log(Level.SEVERE, null, ex);
}
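
Once the full distribution is available, it can be turned back into a sentiment label by taking the most probable class. A minimal sketch, assuming the five values have been copied into a double[] in the order printed in the question (index 0 to 4, which in the CoreNLP sentiment model runs from "Very negative" to "Very positive"); the class SentimentLabel and its methods are hypothetical helpers, not part of CoreNLP:

```java
public class SentimentLabel {

    // Sentiment classes in index order 0..4.
    static final String[] LABELS = {
        "Very negative", "Negative", "Neutral", "Positive", "Very positive"
    };

    // Returns the index of the largest probability (first one wins on ties).
    static int argMax(double[] probs) {
        int best = 0;
        for (int i = 1; i < probs.length; i++) {
            if (probs[i] > probs[best]) {
                best = i;
            }
        }
        return best;
    }

    // Maps a 5-class distribution to its most likely label.
    static String label(double[] probs) {
        if (probs == null || probs.length != LABELS.length) {
            throw new IllegalArgumentException("expected " + LABELS.length + " probabilities");
        }
        return LABELS[argMax(probs)];
    }

    public static void main(String[] args) {
        // Values taken from the StanfordCoreNLP output shown in the question.
        double[] probs = {
            0.12680336652661395, 0.5988695516384742, 0.22125584263055106,
            0.03843574738131668, 0.014635491823044227
        };
        System.out.println(label(probs)); // prints "Negative"
    }
}
```

For the sentence in the question, the largest value (about 0.599) is at index 1, so the label is "Negative".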