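After creating an annotated CoreDocument, how can I save it to disk and retrieve it later?
Computing an annotated CoreDocument is slow. Once one has been created, I would like to reuse it later by loading it from disk.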
Answer 0 (score: 0)
You should look at the AnnotationSerializer class:
https://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/pipeline/AnnotationSerializer.html
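Specifically, although this class has several implementations, we have mostly used ProtobufAnnotationSerializer.
You can see usage examples in some of the integration tests. ProtobufSerializationSanityITest is a simple example of how to use it. ProtobufAnnotationSerializerSlowITest is a more thorough but more complex example. You can find them in the GitHub repository.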
Answer 1 (score: 0)
Thanks for the help, as I'm new to Stanford NLP. The AnnotationSerializer class
moved me forward in saving the document to disk. I also misunderstood how to
interpret the result of reading it back: I didn't realize that pair.first
contained the full result. The pertinent code is:
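// Imports used by both methods below
import java.io.*;
import java.util.List;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.util.CoreMap;
import edu.stanford.nlp.util.Pair;

// Serialize an annotated CoreDocument to a file in protobuf format.
public void writeDoc(CoreDocument document, String filename) {
    AnnotationSerializer serializer = new ProtobufAnnotationSerializer();
    // try-with-resources closes the stream even if writing fails
    try (OutputStream os = new FileOutputStream(filename)) {
        serializer.writeCoreDocument(document, os);
        os.flush();
    } catch (IOException ioex) {
        logger.error("IOException " + ioex);
    }
}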
// Read a serialized document back from disk and print some of its annotations.
public void ReadSavedDoc(String filename) {
    AnnotationSerializer serializer = new ProtobufAnnotationSerializer();
    // try-with-resources closes the file stream when done
    try (InputStream is = new BufferedInputStream(new FileInputStream(filename))) {
        // pair.first holds the complete Annotation; pair.second is the input stream
        Pair<Annotation, InputStream> pair = serializer.read(is);
        Annotation readAnnotation = pair.first;

        // NER and POS tags are stored on each token, not on the document itself,
        // so they are read per token rather than from readAnnotation directly
        List<CoreLabel> newTokens =
            readAnnotation.get(CoreAnnotations.TokensAnnotation.class);
        for (CoreLabel token : newTokens) {
            System.out.println("token " + token
                + " named entity " + token.ner() + " pos " + token.tag());
        }

        List<CoreMap> newSentences =
            readAnnotation.get(CoreAnnotations.SentencesAnnotation.class);
        for (CoreMap sentence : newSentences) {
            System.out.println(sentence);
        }
    } catch (IOException | ClassNotFoundException | ClassCastException e) {
        // read() declares all three; log them in one place
        logger.error("Exception: " + e);
        e.printStackTrace();
    }
}
Hope this helps someone else. Don
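
Since the original goal was to avoid recomputing a slow annotation, the save and load steps can be combined into a "load if cached, otherwise compute and save" helper. The following is only a minimal sketch using the Java standard library: the names DiskCache, Writer, Reader and loadOrCompute are hypothetical and are not part of CoreNLP. It assumes the value can be written to and read from a stream, which is what the serializer above does.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Supplier;

// Hypothetical helper, not part of CoreNLP: caches a slow-to-compute value on disk.
final class DiskCache {
    /** Writes a value to a stream (illustrative interface). */
    interface Writer<T> {
        void write(T value, OutputStream os) throws IOException;
    }

    /** Reads a value back from a stream (illustrative interface). */
    interface Reader<T> {
        T read(InputStream is) throws IOException, ClassNotFoundException;
    }

    /**
     * Returns the value stored in the file if the file exists;
     * otherwise computes it, saves it to the file, and returns it.
     */
    static <T> T loadOrCompute(Path file, Supplier<T> compute,
                               Writer<T> writer, Reader<T> reader)
            throws IOException, ClassNotFoundException {
        if (Files.exists(file)) {
            // Cache hit: skip the slow computation entirely
            try (InputStream is = Files.newInputStream(file)) {
                return reader.read(is);
            }
        }
        // Cache miss: compute once, then persist for later runs
        T value = compute.get();
        try (OutputStream os = Files.newOutputStream(file)) {
            writer.write(value, os);
        }
        return value;
    }
}
```

With CoreNLP, the writer would presumably call serializer.writeCoreDocument(document, os), and the reader would wrap the Annotation from serializer.read(is).first in a new CoreDocument, assuming the CoreDocument(Annotation) constructor is available in your version.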