How to get a protobuf extension field in ProtobufAnnotationSerializer

Time: 2017-07-24 08:56:22

Tags: protocol-buffers stanford-nlp

I am new to protocol buffers and am trying to figure out how to extend a message type in the Stanford CoreNLP library, as described here: https://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/pipeline/ProtobufAnnotationSerializer.html

The problem: I can set the extension field, but I cannot get it. I boiled the problem down to the code below. In the original message the field appears under the name [edu.stanford.nlp.pipeline.myNewField], but in the deserialized message the name is replaced by the field number 101.

How can I get the value of myNewField?

PS: This post https://stackoverflow.com/questions/28815214/how-to-set-get-protobufs-extension-field-in-go suggests that it should be as simple as calling getExtension(MyAppProtos.myNewField).

custom.proto

syntax = "proto2";

package edu.stanford.nlp.pipeline;

option java_package = "com.example.my.awesome.nlp.app";
option java_outer_classname = "MyAppProtos";

import "CoreNLP.proto";

extend Sentence {
    optional uint32 myNewField = 101;
}

ProtoTest.java

import com.example.my.awesome.nlp.app.MyAppProtos;
import com.google.protobuf.ExtensionRegistry;
import com.google.protobuf.InvalidProtocolBufferException;

import edu.stanford.nlp.pipeline.CoreNLPProtos;
import edu.stanford.nlp.pipeline.CoreNLPProtos.Sentence;

public class ProtoTest {

    static {
        ExtensionRegistry registry = ExtensionRegistry.newInstance();
        registry.add(MyAppProtos.myNewField);
        CoreNLPProtos.registerAllExtensions(registry);
    }

    public static void main(String[] args) throws InvalidProtocolBufferException {

        Sentence originalSentence = Sentence.newBuilder()
                .setText("Hello world!")
                .setTokenOffsetBegin(0)
                .setTokenOffsetEnd(12)
                .setExtension(MyAppProtos.myNewField, 13)
                .build();

        System.out.println("Original:\n" + originalSentence);

        byte[] serialized = originalSentence.toByteArray();

        Sentence deserializedSentence = Sentence.parseFrom(serialized);
        System.out.println("Deserialized:\n" + deserializedSentence);

        Integer myNewField = deserializedSentence.getExtension(MyAppProtos.myNewField);
        System.out.println("MyNewField: " + myNewField);
    }
}

Output:

Original:
tokenOffsetBegin: 0
tokenOffsetEnd: 12
text: "Hello world!"
[edu.stanford.nlp.pipeline.myNewField]: 13

Deserialized:
tokenOffsetBegin: 0
tokenOffsetEnd: 12
text: "Hello world!"
101: 13

MyNewField: 0

Update: Because this question is about extending CoreNLP message types and using them with ProtobufAnnotationSerializer, this is what my extended serializer looks like:

1 answer:

Answer 0 (score: 1)

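The mistake was that I did not pass the extension registry to the parseFrom call.

Changing Sentence deserializedSentence = Sentence.parseFrom(serialized); to Sentence deserializedSentence = Sentence.parseFrom(serialized, registry); did the job! Note that in the question's code registry is a local variable inside the static initializer, so it has to be made reachable from main (for example by storing it in a static field) before this call compiles.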
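As background on why the deserialized message printed 101: 13 instead of the field name: the serialized bytes carry only a field number and a wire type, never a name, so a parser that was not told about the extension can only report the number. The sketch below reproduces that by hand using only the Java standard library. It assumes the standard protobuf encoding of a uint32 field (varint, wire type 0); the class and method names (WireTagDemo, encodeVarint, decodeVarint, encodeVarintField) are made up for illustration and are not part of protobuf or CoreNLP.

```java
import java.io.ByteArrayOutputStream;

// Illustrative only: hand-rolled encoding of one varint field, to show
// what a parser sees on the wire for "myNewField = 13" (field number 101).
class WireTagDemo {

    // Encode a non-negative value as a base-128 varint (7 payload bits per byte,
    // high bit set on every byte except the last).
    static byte[] encodeVarint(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
        return out.toByteArray();
    }

    // Decode a varint starting at offset; returns {decodedValue, nextOffset}.
    static long[] decodeVarint(byte[] buf, int offset) {
        long result = 0;
        int shift = 0;
        while (true) {
            byte b = buf[offset++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break;
            }
            shift += 7;
        }
        return new long[] {result, offset};
    }

    // Encode a field with wire type 0 (varint): tag = (fieldNumber << 3) | 0,
    // followed by the value, both as varints.
    static byte[] encodeVarintField(int fieldNumber, long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] tag = encodeVarint(((long) fieldNumber << 3) | 0);
        byte[] val = encodeVarint(value);
        out.write(tag, 0, tag.length);
        out.write(val, 0, val.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = encodeVarintField(101, 13);

        // Read the tag back: only a number and a wire type can be recovered.
        long[] tag = decodeVarint(bytes, 0);
        long fieldNumber = tag[0] >>> 3;
        long[] value = decodeVarint(bytes, (int) tag[1]);

        // Without a registry mapping 101 to a name, this is all that is known.
        System.out.println(fieldNumber + ": " + value[0]); // prints "101: 13"
    }
}
```

Passing the registry to parseFrom gives the parser the mapping from field number 101 back to the myNewField extension, which is why getExtension then returns 13 rather than the default 0.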