ProtoBuf serialization loses data even for a simple entity

Asked: 2012-12-05 02:43:48

Tags: serialization protobuf-net

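[Update #1]: In case anyone else is interested in looking at the benchmark, I have uploaded the modified "demo" project to https://github.com/sidshetye/SerializersCompare

[Update #2]: I found that ProtoBuf only gains its order-of-magnitude advantage on subsequent iterations. For a one-off serialization, BinaryFormatter is an order of magnitude faster. Why? That is a separate question...

I am trying to compare BinaryFormatter, Json.NET and protobuf-net (I got the latter from NuGet today). I find that ProtoBuf outputs no real field values, only nulls and 0s (see below). On top of that, BinaryFormatter appears to be faster. I basically serialize => deserialize the object and compare

  • the original against the regenerated object
  • the size in bytes
  • the time in ms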

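Questions

  1. How do I get ProtoBuf to actually emit the real values rather than just (default?) values?
  2. What am I doing wrong for speed? I thought ProtoBuf was supposed to be the fastest serializer?

The output I got from my test app is below: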

    Json: Objects identical
    Json in UTF-8: 180 bytes, 249.7054 ms
    
    BinaryFormatter: Objects identical
    BinaryFormatter: 512 bytes, 1.7864 ms
    
    ProtoBuf: Original and regenerated objects differ !!
    ====Regenerated Object====
    {
        "functionCall": null,
        "parameters": null,
        "name": null,
        "employeeId": 0,
        "raiseRate": 0.0,
        "addressLine1": null,
        "addressLine2": null
    }
    ProtoBuf: 256 bytes, 117.969 ms
    

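My test runs in a console app using a simple entity (see below). System: Windows 8 x64, VS2012 Update 1, .NET 4.5. By the way, I get the same result using the [ProtoContract] / [ProtoMember(X)] convention. The documentation isn't clear, but it appears DataContract is the newer "uniformly" supported convention (right?)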

    [Serializable]
    [DataContract]
    class SimpleEntity
    {
        [DataMember(Order = 1)]
        public string functionCall {get;set;}
    
        [DataMember(Order = 2)]
        public string parameters { get; set; }
    
        [DataMember(Order = 3)]
        public string name { get; set; }
    
        [DataMember(Order = 4)]
        public int employeeId { get; set; }
    
        [DataMember(Order = 5)]
        public float raiseRate { get; set; }
    
        [DataMember(Order = 6)]
        public string addressLine1 { get; set; }
    
        [DataMember(Order = 7)]
        public string addressLine2 { get; set; }
    
        public SimpleEntity()
        {
        }
    
        public void FillDummyData()
        {
            functionCall = "FunctionNameHere";
            parameters = "x=1,y=2,z=3";
    
            name = "Mickey Mouse";
            employeeId = 1;
            raiseRate = 1.2F;
            addressLine1 = "1 Disney Street";
            addressLine2 = "Disneyland, CA";
        }
    }
    

For those interested, here is the ProtoBuf snippet from my AllSerializers class:

    public byte[] SerProtoBuf(object thisObj)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            Serializer.Serialize(ms, thisObj);
            return ms.GetBuffer();
        }
    }
    
    public T DeserProtoBuf<T>(byte[] bytes)
    {
    
        using (MemoryStream ms = new MemoryStream())
        {
            ms.Read(bytes, 0, bytes.Count());
            return Serializer.Deserialize<T>(ms);
        }
    }
    

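1 Answer:

Answer 0: (score: 2)

First, your serialize and deserialize methods are both broken: you are over-reporting the result (GetBuffer() without Length), and you aren't writing anything into the stream before deserializing. Here is a correct implementation (although if you do return GetBuffer(), you could also use ArraySegment<byte>):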

public static byte[] SerProtoBuf(object thisObj)
{
    using (MemoryStream ms = new MemoryStream())
    {
        Serializer.NonGeneric.Serialize(ms, thisObj);
        return ms.ToArray();
    }
}

public static T DeserProtoBuf<T>(byte[] bytes)
{
    using (MemoryStream ms = new MemoryStream(bytes))
    {
        return Serializer.Deserialize<T>(ms);
    }
}
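
To make the two bugs concrete, here is a minimal sketch that uses only MemoryStream, with no serializer involved. The class and method names (StreamPitfalls, Lengths, ReadFromEmpty) are made up for illustration. It compares the length of GetBuffer() with ToArray() after writing a small payload, shows ArraySegment<byte> as a copy-free alternative, and shows that Read on a freshly created stream returns 0 bytes. The 256 and 512 byte sizes reported in the question are likely buffer capacities rather than payload sizes, though that is an inference from the numbers, not something measured here.

```csharp
using System;
using System.IO;

static class StreamPitfalls
{
    // Returns { bytes written, GetBuffer().Length, ToArray().Length, segment.Count }.
    public static int[] Lengths(byte[] payload)
    {
        using (var ms = new MemoryStream())
        {
            ms.Write(payload, 0, payload.Length);

            // GetBuffer() exposes the whole internal array, including unused capacity.
            byte[] raw = ms.GetBuffer();
            // ToArray() copies only the bytes that were actually written.
            byte[] exact = ms.ToArray();
            // ArraySegment pairs the raw buffer with the valid length, avoiding the copy.
            var segment = new ArraySegment<byte>(raw, 0, (int)ms.Length);

            return new[] { (int)ms.Length, raw.Length, exact.Length, segment.Count };
        }
    }

    // Mirrors the question's DeserProtoBuf: Read() on a new, empty stream copies nothing.
    public static int ReadFromEmpty(int count)
    {
        using (var empty = new MemoryStream())
        {
            var target = new byte[count];
            return empty.Read(target, 0, target.Length);
        }
    }

    static void Main()
    {
        int[] l = Lengths(new byte[] { 1, 2, 3, 4, 5 });
        Console.WriteLine("written={0}, GetBuffer={1}, ToArray={2}, segment={3}",
            l[0], l[1], l[2], l[3]);
        Console.WriteLine("bytes read from empty stream: {0}", ReadFromEmpty(5));
    }
}
```

Note that Read copies from the stream into the array, so the question's ms.Read(bytes, ...) call never put the serialized bytes into the stream; passing the array to the MemoryStream constructor, as in the corrected code above, does.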

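That is why you aren't getting any data. Second, you don't say how you are timing it, so here is something I wrote based on your code (it also includes code to show that all the values are coming through). Results first: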

Via BinaryFormatter:
1 Disney Street
Disneyland, CA
1
FunctionNameHere
Mickey Mouse
x=1,y=2,z=3
1.2

Via protobuf-net:
1 Disney Street
Disneyland, CA
1
FunctionNameHere
Mickey Mouse
x=1,y=2,z=3
1.2

Serialize BinaryFormatter: 112 ms, 434 bytes
Deserialize BinaryFormatter: 113 ms
Serialize protobuf-net: 14 ms, 85 bytes
Deserialize protobuf-net: 19 ms

Analysis:

Both serializers store the same data; protobuf-net is an order of magnitude faster, and its output is about 5 times smaller. I declare: winner.

Code:

static BinaryFormatter bf = new BinaryFormatter();
public static byte[] SerBinaryFormatter(object thisObj)
{
    using (MemoryStream ms = new MemoryStream())
    {
        bf.Serialize(ms, thisObj);
        return ms.ToArray();
    }
}

public static T DeserBinaryFormatter<T>(byte[] bytes)
{
    using (MemoryStream ms = new MemoryStream(bytes))
    {
        return (T)bf.Deserialize(ms);
    }
}
static void Main()
{
    SimpleEntity obj = new SimpleEntity(), clone;
    obj.FillDummyData();

    // test that we get non-zero bytes
    var data = SerBinaryFormatter(obj);
    clone = DeserBinaryFormatter<SimpleEntity>(data);
    Console.WriteLine("Via BinaryFormatter:");
    Console.WriteLine(clone.addressLine1);
    Console.WriteLine(clone.addressLine2);
    Console.WriteLine(clone.employeeId);
    Console.WriteLine(clone.functionCall);
    Console.WriteLine(clone.name);
    Console.WriteLine(clone.parameters);
    Console.WriteLine(clone.raiseRate);
    Console.WriteLine();

    data = SerProtoBuf(obj);
    clone = DeserProtoBuf<SimpleEntity>(data);
    Console.WriteLine("Via protobuf-net:");
    Console.WriteLine(clone.addressLine1);
    Console.WriteLine(clone.addressLine2);
    Console.WriteLine(clone.employeeId);
    Console.WriteLine(clone.functionCall);
    Console.WriteLine(clone.name);
    Console.WriteLine(clone.parameters);
    Console.WriteLine(clone.raiseRate);
    Console.WriteLine();

    Stopwatch watch = new Stopwatch();
    const int LOOP = 10000;

    watch.Reset();
    watch.Start();
    for (int i = 0; i < LOOP; i++)
    {
        data = SerBinaryFormatter(obj);
    }
    watch.Stop();
    Console.WriteLine("Serialize BinaryFormatter: {0} ms, {1} bytes", watch.ElapsedMilliseconds, data.Length);

    watch.Reset();
    watch.Start();
    for (int i = 0; i < LOOP; i++)
    {
        clone = DeserBinaryFormatter<SimpleEntity>(data);
    }
    watch.Stop();
    Console.WriteLine("Deserialize BinaryFormatter: {0} ms", watch.ElapsedMilliseconds);

    watch.Reset();
    watch.Start();
    for (int i = 0; i < LOOP; i++)
    {
        data = SerProtoBuf(obj);
    }
    watch.Stop();
    Console.WriteLine("Serialize protobuf-net: {0} ms, {1} bytes", watch.ElapsedMilliseconds, data.Length);

    watch.Reset();
    watch.Start();
    for (int i = 0; i < LOOP; i++)
    {
        clone = DeserProtoBuf<SimpleEntity>(data);
    }
    watch.Stop();
    Console.WriteLine("Deserialize protobuf-net: {0} ms", watch.ElapsedMilliseconds);
}
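
The loops above report totals over 10000 iterations, which hides the one-off cost mentioned in Update #2 of the question. A rough way to see both numbers is to time the first call separately from the rest. The sketch below is one way to do that; WarmupTiming and Measure are hypothetical names, and the workload in Main is only a stand-in. A plausible explanation for a slow first call is that protobuf-net prepares its serializer for a type on first use, but this sketch only measures the difference and does not prove the cause.

```csharp
using System;
using System.Diagnostics;

static class WarmupTiming
{
    // Times the first call on its own, then the average of the following calls.
    // Returns { first-call ms, average ms per later call }.
    public static double[] Measure(string label, Action action, int iterations)
    {
        var watch = Stopwatch.StartNew();
        action(); // first call: includes any one-off setup cost
        watch.Stop();
        double firstMs = watch.Elapsed.TotalMilliseconds;

        watch.Restart();
        for (int i = 0; i < iterations; i++)
        {
            action(); // steady-state calls
        }
        watch.Stop();
        double averageMs = watch.Elapsed.TotalMilliseconds / iterations;

        Console.WriteLine("{0}: first call {1:F4} ms, later calls {2:F4} ms each",
            label, firstMs, averageMs);
        return new[] { firstMs, averageMs };
    }

    static void Main()
    {
        // Stand-in workload; replace with a real serializer call.
        Measure("string.Format", () => string.Format("{0}-{1}", 1, 2), 10000);
    }
}
```

With the methods from this answer in scope, a call such as WarmupTiming.Measure("protobuf-net", () => SerProtoBuf(obj), 10000) would report both figures for protobuf-net, and the same call with SerBinaryFormatter would do so for BinaryFormatter.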

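Finally, [DataMember(...)] support isn't really the "newer 'uniformly' supported convention". It certainly isn't "newer"; I'm pretty sure both have been supported since something like commit #4 (possibly earlier). It is simply an option provided for convenience:

  • Not all target platforms have DataMemberAttribute
  • Some people prefer to limit their DTO layer to the built-in markers
  • Some types are largely outside your control but may already have those markers (for example, data generated from LINQ-to-SQL)
  • Also, note that 2.x allows you to define the model at runtime without adding attributes at all (although attributes remain the most convenient way)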
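
As an illustration of the runtime-model option in the last point, here is a minimal sketch, assuming protobuf-net 2.x is referenced (for example via NuGet). PlainEntity and RuntimeModelDemo are hypothetical names invented for this example; the field numbers are arbitrary, but must stay stable once data has been written with them.

```csharp
using System;
using System.IO;
using ProtoBuf;
using ProtoBuf.Meta;

// Hypothetical DTO with no serialization attributes at all.
public class PlainEntity
{
    public string name { get; set; }
    public int employeeId { get; set; }
}

static class RuntimeModelDemo
{
    // Registers the type once, without attribute-based defaults,
    // and maps field numbers to members by name.
    public static void Configure()
    {
        RuntimeTypeModel.Default.Add(typeof(PlainEntity), false)
            .Add(1, "name")
            .Add(2, "employeeId");
    }

    // Serializes and deserializes through a single MemoryStream.
    public static PlainEntity RoundTrip(PlainEntity original)
    {
        using (var ms = new MemoryStream())
        {
            Serializer.Serialize(ms, original);
            ms.Position = 0; // rewind before reading back
            return Serializer.Deserialize<PlainEntity>(ms);
        }
    }

    static void Main()
    {
        Configure();
        var clone = RoundTrip(new PlainEntity { name = "Mickey Mouse", employeeId = 1 });
        Console.WriteLine("{0} / {1}", clone.name, clone.employeeId);
    }
}
```

The model must be configured before the type is first serialized, so Configure() would normally run once at application start-up.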