Question

In C++, I am simulating the parsing of a text-format file that contains an incorrect field.

My simple test .proto file:

$ cat settings.proto
package settings;
message Settings {
   optional int32  param1 = 1;
   optional string param2 = 2;
   optional bytes  param3 = 3;
}

My text-format file:

$ cat settings.txt
param1: 123
param: "some string"
param3: "another string"

I parse the file using google::protobuf::TextFormat::Parser:

#include <iostream>
#include <fcntl.h>
#include <unistd.h>
#include <fstream>
#include <google/protobuf/text_format.h>
#include <google/protobuf/io/zero_copy_stream_impl.h>

#include <settings.pb.h>

using namespace std;

int main( int argc, char* argv[] )
{
    GOOGLE_PROTOBUF_VERIFY_VERSION;

    // The path of the text-format file is expected as the first argument.
    if( argc < 2 )
    {
        cerr << "Usage: " << argv[0] << " <settings.txt>" << endl;
        return 1;
    }

    settings::Settings settings;

    int fd = open( argv[1], O_RDONLY );
    if( fd < 0 )
    {
        cerr << "Error opening the file" << endl;
        return 1;   // non-zero exit status signals failure
    }

    // FileInputStream wraps the descriptor and closes it on destruction.
    google::protobuf::io::FileInputStream finput( fd );
    finput.SetCloseOnDelete( true );

    google::protobuf::TextFormat::Parser parser;
    parser.AllowPartialMessage( true );

    if ( !parser.Parse( &finput, &settings ) )
    {
        cerr << "Failed to parse file!" << endl;
    }

    // Print whatever was parsed, even after a failure.
    cout << settings.DebugString() << endl;

    google::protobuf::ShutdownProtobufLibrary();

    std::cout << "Exit" << std::endl;
    return 0;
}

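I set AllowPartialMessage to true for the parser, and all fields are optional. But currently Parse stops parsing after the first incorrect field. After parsing, "settings" contains only the first field.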

Is there a way to be notified of the failure and still continue parsing the other, correct fields?

Answer 1

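The text-format parser does not allow unknown fields. Text format is intended for communicating with humans, and humans make typos. It is important to detect these typos rather than silently ignore them.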

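Usually, the reason to ignore unknown fields is forward compatibility: your program can then (partially) understand messages written against a future version of the protocol that uses new fields. I see two particular use cases for this: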

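Systems that do machine-to-machine communication in text format. I would recommend against this. Instead, use the binary format, or if you really want your machine-to-machine communication to be textual, use JSON.
Systems where a human writes a text-format configuration file and then distributes it to possibly old servers in production. In this case, I would recommend "pre-compiling" the text-format protobuf to binary using a tool that runs on the human's desktop, and then shipping only the binary message to the production servers. The local tool can easily be kept up to date, and it can tell the human user if they misspelled a field name.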
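
As a rough illustration of the local checking tool described in the second case, here is a minimal sketch of a pre-check that reports unknown field names and keeps the remaining lines. `LintResult` and `LintFlatTextFormat` are hypothetical names, not part of protobuf, and the code assumes a flat message with one `name: value` pair per line. It is not a real text-format parser.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Result of the pre-check: the lines that name a known field, plus one
// diagnostic for every line that was dropped.
struct LintResult {
    std::string kept_text;
    std::vector<std::string> diagnostics;
};

// Hypothetical helper, not part of protobuf. Assumes a flat message with one
// "name: value" pair per line; nested messages, multi-line strings and
// extensions are not handled.
LintResult LintFlatTextFormat(const std::string& text,
                              const std::set<std::string>& known_fields) {
    LintResult result;
    std::istringstream input(text);
    std::string line;
    int line_number = 0;
    while (std::getline(input, line)) {
        ++line_number;
        const std::string::size_type first = line.find_first_not_of(" \t\r");
        // Blank lines and '#' comments are kept unchanged.
        if (first == std::string::npos || line[first] == '#') {
            result.kept_text += line + "\n";
            continue;
        }
        const std::string::size_type colon = line.find(':', first);
        if (colon == std::string::npos) {
            result.diagnostics.push_back("line " + std::to_string(line_number) +
                                         ": expected \"name: value\"");
            continue;
        }
        // The field name is everything before the colon, without trailing
        // whitespace.
        std::string name = line.substr(first, colon - first);
        const std::string::size_type last = name.find_last_not_of(" \t");
        if (last == std::string::npos) {
            name.clear();
        } else {
            name.erase(last + 1);
        }
        if (known_fields.count(name) == 0) {
            // Report the problem and carry on with the next line.
            result.diagnostics.push_back("line " + std::to_string(line_number) +
                                         ": unknown field \"" + name + "\"");
            continue;
        }
        result.kept_text += line + "\n";
    }
    return result;
}
```

With the settings.txt from the question and the known names param1, param2 and param3, this reports the `param` line as unknown and keeps the other two lines, which could then be handed to the real parser, for example through `google::protobuf::TextFormat::ParseFromString`. For anything beyond flat files, a check built on the message descriptors would be more robust than this line-based approach.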
