Question

I am trying to use CoreNLP to annotate some Amazon reviews, but after waiting more than 6 hours it has not produced any output.

 1. The review file is about 1 MB.
 2. The cluster has 12 CPUs and 64 GB of memory.
 3. The command is:
 java -cp "*" -Xmx64g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,ner,sentiment -outputFormat json -file amazon_apple_comments_4.csv

What is going on? Is it really supposed to be this slow?

Answer 1

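That is far too slow for a 1 MB document. Try running fewer annotators to narrow down which one takes the most time. The tokenize and ssplit annotators should be very fast; pos is somewhat slower, but not bad; ner is slower than pos, but in 1 MB of Amazon reviews it should not find many named entities. I have never used sentiment, but I suspect its cost is significant.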
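One way to do that narrowing is to rerun the same command with a growing annotator list and time each run. The following is only a rough sketch, assuming it is started from the CoreNLP directory so that the `-cp "*"` classpath from the question still resolves; the helper names `incremental_commands` and `time_commands` are made up for illustration and are not part of CoreNLP.

```python
import subprocess
import time

# Annotator order taken from the command in the question.
ANNOTATORS = ["tokenize", "ssplit", "pos", "ner", "sentiment"]


def incremental_commands(input_file, annotators=ANNOTATORS, heap="64g"):
    """Return (annotator_list, argv) pairs, one per prefix of `annotators`.

    The first run uses only tokenize, the second tokenize,ssplit, and so on,
    so the jump in run time between two consecutive runs points at the
    annotator that was just added.
    """
    runs = []
    for i in range(1, len(annotators) + 1):
        annotator_list = ",".join(annotators[:i])
        argv = [
            "java", "-cp", "*", f"-Xmx{heap}",
            "edu.stanford.nlp.pipeline.StanfordCoreNLP",
            "-annotators", annotator_list,
            "-outputFormat", "json",
            "-file", input_file,
        ]
        runs.append((annotator_list, argv))
    return runs


def time_commands(runs):
    """Run each command in turn and print its wall-clock time."""
    for annotator_list, argv in runs:
        start = time.perf_counter()
        # check=True stops at the first run that exits with an error.
        subprocess.run(argv, check=True)
        elapsed = time.perf_counter() - start
        print(f"{annotator_list}: {elapsed:.1f}s")
```

Calling `time_commands(incremental_commands("amazon_apple_comments_4.csv"))` would then run the five variants one after another. If one of them hangs the way the full command did, the annotator added in that run is the likely culprit.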
