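I wrote a program that serializes a 'Person' class using XmlSerializer, BinaryFormatter and protobuf-net. I expected protobuf-net to be faster than the other two. Protobuf serialization was faster than XML serialization, but much slower than binary serialization. Is my understanding incorrect? Please help me understand this. Thanks for your help.
Edit: I changed the code (updated below) to measure only the serialization time, not the creation of the streams, but I still see the difference. Can anyone tell me why?
Here is the output:
Person got created using protocol buffer in 347 milliseconds
Person got created using XML in 1462 milliseconds
Person got created using binary in 2 milliseconds
The code follows: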
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using ProtoBuf;
using System.IO;
using System.Diagnostics;
using System.Runtime.Serialization.Formatters.Binary;

namespace ProtocolBuffers
{
    class Program
    {
        static void Main(string[] args)
        {
            string folderPath = @"E:\Ashish\Research\VS Solutions\ProtocolBuffers\ProtocolBuffer1\bin\Debug";
            string XMLSerializedFileName = Path.Combine(folderPath,"PersonXMLSerialized.xml");
            string ProtocolBufferFileName = Path.Combine(folderPath,"PersonProtocalBuffer.bin");
            string BinarySerializedFileName = Path.Combine(folderPath,"PersonBinary.bin");

            if (File.Exists(XMLSerializedFileName))
            {
                File.Delete(XMLSerializedFileName);
                Console.WriteLine(XMLSerializedFileName + " deleted");
            }
            if (File.Exists(ProtocolBufferFileName))
            {
                File.Delete(ProtocolBufferFileName);
                Console.WriteLine(ProtocolBufferFileName + " deleted");
            }
            if (File.Exists(BinarySerializedFileName))
            {
                File.Delete(BinarySerializedFileName);
                Console.WriteLine(BinarySerializedFileName + " deleted");
            }

            var person = new Person
            {
                Id = 12345,
                Name = "Fred",
                Address = new Address
                {
                    Line1 = "Flat 1",
                    Line2 = "The Meadows"
                }
            };

            Stopwatch watch = Stopwatch.StartNew();
            using (var file = File.Create(ProtocolBufferFileName))
            {
                watch.Start();
                Serializer.Serialize(file, person);
                watch.Stop();
            }
            //Console.WriteLine(watch.ElapsedMilliseconds.ToString());
            Console.WriteLine("Person got created using protocol buffer in " + watch.ElapsedMilliseconds.ToString() + " milliseconds ");

            watch.Reset();
            System.Xml.Serialization.XmlSerializer x = new System.Xml.Serialization.XmlSerializer(person.GetType());
            using (TextWriter w = new StreamWriter(XMLSerializedFileName))
            {
                watch.Start();
                x.Serialize(w, person);
                watch.Stop();
            }
            //Console.WriteLine(watch.ElapsedMilliseconds.ToString());
            Console.WriteLine("Person got created using XML in " + watch.ElapsedMilliseconds.ToString() + " milliseconds");

            watch.Reset();
            using (Stream stream = File.Open(BinarySerializedFileName, FileMode.Create))
            {
                BinaryFormatter bformatter = new BinaryFormatter();
                //Console.WriteLine("Writing Employee Information");
                watch.Start();
                bformatter.Serialize(stream, person);
                watch.Stop();
            }
            //Console.WriteLine(watch.ElapsedMilliseconds.ToString());
            Console.WriteLine("Person got created using binary in " + watch.ElapsedMilliseconds.ToString() + " milliseconds");

            Console.ReadLine();
        }
    }

    [ProtoContract]
    [Serializable]
    public class Person
    {
        [ProtoMember(1)]
        public int Id { get; set; }
        [ProtoMember(2)]
        public string Name { get; set; }
        [ProtoMember(3)]
        public Address Address { get; set; }
    }

    [ProtoContract]
    [Serializable]
    public class Address
    {
        [ProtoMember(1)]
        public string Line1 { get; set; }
        [ProtoMember(2)]
        public string Line2 { get; set; }
    }
}
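Answer 0: (score: 25)
I've replied to your e-mail; I didn't realise you had also posted it here. My first question is: which version of protobuf-net? The reason I ask is that the development trunk of "v2" deliberately has auto-compilation disabled, so that I can use my unit tests to test both the runtime and pre-compiled versions. So if you are using "v2" (only available as source), you need to tell it to compile the model - otherwise it runs on 100% reflection.
In either "v1" or "v2" you can do this with: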
Serializer.PrepareSerializer<Person>();
Having done this, the numbers I get (from the code in your e-mail; I haven't checked whether the above is the same sample) are:
10
Person got created using protocol buffer in 10 milliseconds
197
Person got created using XML in 197 milliseconds
3
Person got created using binary in 3 milliseconds
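The other factor is repetition; 3-10 ms is frankly nothing, and you can't compare numbers at this level. Upping it to repeat 5000 times (re-using the XmlSerializer / BinaryFormatter instances, so no false costs are introduced) I get: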
110
Person got created using protocol buffer in 110 milliseconds
329
Person got created using XML in 329 milliseconds
133
Person got created using binary in 133 milliseconds
Taking this to a sillier extreme (100000):
1544
Person got created using protocol buffer in 1544 milliseconds
3009
Person got created using XML in 3009 milliseconds
3087
Person got created using binary in 3087 milliseconds
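So ultimately: in the numbers above, a difference of a few milliseconds on a single call is not meaningful, while over 100000 repetitions protobuf-net takes roughly half the time of XmlSerializer or BinaryFormatter.
Note also that in "v2" the compiled model can be fully static-compiled (to a dll that you can deploy), removing even the (already small) spin-up costs.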
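The repetition advice in this answer can be sketched as a small timing helper. This is only an illustration under stated assumptions: `RepeatTiming` and `Measure` are hypothetical names, the untimed warm-up call is an added assumption rather than something the answer's own test is known to do, and XmlSerializer is used as the workload only because it ships with the framework. The same helper could wrap a protobuf-net or BinaryFormatter call.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Xml.Serialization;

public class Address
{
    public string Line1 { get; set; }
    public string Line2 { get; set; }
}

public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
    public Address Address { get; set; }
}

public static class RepeatTiming
{
    // Hypothetical helper: one untimed warm-up call, then `iterations` timed calls.
    public static TimeSpan Measure(Action action, int iterations)
    {
        action(); // warm-up: JIT and serializer start-up costs fall outside the timed region

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        watch.Stop();
        return watch.Elapsed;
    }

    public static void Main()
    {
        var person = new Person
        {
            Id = 12345,
            Name = "Fred",
            Address = new Address { Line1 = "Flat 1", Line2 = "The Meadows" }
        };

        // Created once and re-used, so construction cost is not counted per iteration.
        var xml = new XmlSerializer(typeof(Person));

        foreach (int n in new[] { 1, 5000, 100000 })
        {
            TimeSpan elapsed = Measure(() =>
            {
                using (var stream = new MemoryStream())
                {
                    xml.Serialize(stream, person);
                }
            }, n);
            Console.WriteLine("XML x " + n + ": " + elapsed.TotalMilliseconds + " ms");
        }
    }
}
```

Timing each serializer through the same helper keeps the loop overhead identical across candidates, so the remaining difference is mostly the serializer itself.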
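Answer 1: (score: 5)
My opinion differs slightly from the marked answer. I think the numbers from these tests reflect the metadata overhead of the binary formatter. BinaryFormatter first writes metadata about the class before writing the data, whereas protobuf writes only the data.
For the very small object in your test (a single Person object), the metadata cost of the binary formatter weighs more than it would in real cases, because it writes more metadata than data. So when you increase the repeat count, the metadata cost is exaggerated, and in the extreme case it reaches the same level as XML serialization.
If you serialize an array of Person, and the array is large enough, the metadata cost becomes trivial relative to the total cost. The binary formatter should then perform similarly to protobuf in your extreme repeat test.
PS: I found this page because I am evaluating different serializers. I also found a blog post, http://blogs.msdn.com/b/youssefm/archive/2009/07/10/comparing-the-performance-of-net-serializers.aspx, which shows test results where DataContractSerializer + binary XmlDictionaryWriter performs several times better than the binary formatter. It also tested with very small data. When I ran the test myself with large data, I was surprised to find that the results were very different. So do test with the real data you will actually use.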
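For anyone who wants to try the DataContractSerializer + binary XmlDictionaryWriter combination mentioned in the PS on their own data, a minimal round-trip sketch might look like the following. The class name `BinaryXmlSketch` and the `[DataContract]` versions of `Person` and `Address` are assumptions made for illustration; this is not the code from the linked blog post, and it makes no claim about relative speed.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Xml;

[DataContract]
public class Address
{
    [DataMember] public string Line1 { get; set; }
    [DataMember] public string Line2 { get; set; }
}

[DataContract]
public class Person
{
    [DataMember] public int Id { get; set; }
    [DataMember] public string Name { get; set; }
    [DataMember] public Address Address { get; set; }
}

public static class BinaryXmlSketch
{
    // Re-used across calls so serializer construction is not paid per object.
    private static readonly DataContractSerializer Serializer =
        new DataContractSerializer(typeof(Person));

    public static byte[] Serialize(Person person)
    {
        using (var stream = new MemoryStream())
        using (XmlDictionaryWriter writer = XmlDictionaryWriter.CreateBinaryWriter(stream))
        {
            Serializer.WriteObject(writer, person);
            writer.Flush(); // push buffered bytes into the stream before copying them out
            return stream.ToArray();
        }
    }

    public static Person Deserialize(byte[] bytes)
    {
        using (XmlDictionaryReader reader =
            XmlDictionaryReader.CreateBinaryReader(bytes, XmlDictionaryReaderQuotas.Max))
        {
            return (Person)Serializer.ReadObject(reader);
        }
    }

    public static void Main()
    {
        var person = new Person
        {
            Id = 12345,
            Name = "Fred",
            Address = new Address { Line1 = "Flat 1", Line2 = "The Meadows" }
        };

        byte[] bytes = Serialize(person);
        Person copy = Deserialize(bytes);
        Console.WriteLine(bytes.Length + " bytes, round-tripped name: " + copy.Name);
    }
}
```

Swapping the single `Person` for a large array or your real object graph is the quickest way to check whether the small-data results hold for your case.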
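Answer 2: (score: 4)
We constantly serialize fairly large objects (about 50 properties), so I wrote a small test to compare BinaryFormatter and protobuf-net, just as you did, and here are my results (10000 objects):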
BinaryFormatter serialize: 316
BinaryFormatter deserialize: 279
protobuf serialize: 243
protobuf deserialize: 139
BinaryFormatter serialize: 315
BinaryFormatter deserialize: 281
protobuf serialize: 127
protobuf deserialize: 110
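That is obviously a very significant difference. protobuf-net is also much faster on the second run (the tests are exactly the same) than on the first.
Update. Doing RuntimeTypeModel.Add..Compile generates the following results: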
BinaryFormatter serialize: 303
BinaryFormatter deserialize: 282
protobuf serialize: 113
protobuf deserialize: 50
BinaryFormatter serialize: 317
BinaryFormatter deserialize: 266
protobuf serialize: 126
protobuf deserialize: 49
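Answer 3: (score: 0)
If we compare in memory, hard-coded serialization will be faster in some situations. If your class is simple, it may be better to write your own serializer...
Slightly modified code: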
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using ProtoBuf;
using System.IO;
using System.Diagnostics;
using System.Runtime.Serialization.Formatters.Binary;

namespace ProtocolBuffers
{
    class Program
    {
        static void Main(string[] args)
        {
            string folderPath = @"../Debug";
            string XMLSerializedFileName = Path.Combine(folderPath, "PersonXMLSerialized.xml");
            string ProtocolBufferFileName = Path.Combine(folderPath, "PersonProtocalBuffer.bin");
            string BinarySerializedFileName = Path.Combine(folderPath, "PersonBinary.bin");
            string BinarySerialized2FileName = Path.Combine(folderPath, "PersonBinary2.bin");

            if (File.Exists(XMLSerializedFileName))
            {
                File.Delete(XMLSerializedFileName);
                Console.WriteLine(XMLSerializedFileName + " deleted");
            }
            if (File.Exists(ProtocolBufferFileName))
            {
                File.Delete(ProtocolBufferFileName);
                Console.WriteLine(ProtocolBufferFileName + " deleted");
            }
            if (File.Exists(BinarySerializedFileName))
            {
                File.Delete(BinarySerializedFileName);
                Console.WriteLine(BinarySerializedFileName + " deleted");
            }
            if (File.Exists(BinarySerialized2FileName))
            {
                File.Delete(BinarySerialized2FileName);
                Console.WriteLine(BinarySerialized2FileName + " deleted");
            }

            var person = new Person
            {
                Id = 12345,
                Name = "Fred",
                Address = new Address
                {
                    Line1 = "Flat 1",
                    Line2 = "The Meadows"
                }
            };

            // Each test writes to a MemoryStream, so no disk I/O happens inside the timed loop.
            Stopwatch watch = Stopwatch.StartNew();
            using (var file = new MemoryStream())
            //using (var file = File.Create(ProtocolBufferFileName))
            {
                watch.Start();
                for (int i = 0; i < 100000; i++)
                    Serializer.Serialize(file, person);
                watch.Stop();
            }
            Console.WriteLine("Person got created using protocol buffer in " + watch.ElapsedMilliseconds.ToString() + " milliseconds ");

            watch.Reset();
            System.Xml.Serialization.XmlSerializer x = new System.Xml.Serialization.XmlSerializer(person.GetType());
            using (var w = new MemoryStream())
            //using (TextWriter w = new StreamWriter(XMLSerializedFileName))
            {
                watch.Start();
                for (int i = 0; i < 100000; i++)
                    x.Serialize(w, person);
                watch.Stop();
            }
            Console.WriteLine("Person got created using XML in " + watch.ElapsedMilliseconds.ToString() + " milliseconds");

            watch.Reset();
            using (var stream = new MemoryStream())
            //using (Stream stream = File.Open(BinarySerializedFileName, FileMode.Create))
            {
                BinaryFormatter bformatter = new BinaryFormatter();
                watch.Start();
                for (int i = 0; i < 100000; i++)
                    bformatter.Serialize(stream, person);
                watch.Stop();
            }
            Console.WriteLine("Person got created using binary in " + watch.ElapsedMilliseconds.ToString() + " milliseconds");

            // "binary2": the hand-written serializer defined on Person and Address below.
            watch.Reset();
            using (var stream = new MemoryStream())
            //using (Stream stream = File.Open(BinarySerialized2FileName, FileMode.Create))
            {
                BinaryWriter writer = new BinaryWriter(stream);
                watch.Start();
                for (int i = 0; i < 100000; i++)
                    writer.Write(person.GetBytes());
                watch.Stop();
            }
            Console.WriteLine("Person got created using binary2 in " + watch.ElapsedMilliseconds.ToString() + " milliseconds");

            Console.ReadLine();
        }
    }

    [ProtoContract]
    [Serializable]
    public class Person
    {
        [ProtoMember(1)]
        public int Id { get; set; }
        [ProtoMember(2)]
        public string Name { get; set; }
        [ProtoMember(3)]
        public Address Address { get; set; }

        // Hand-written serialization: Id, then Name, then the Address bytes.
        public byte[] GetBytes()
        {
            using (var stream = new MemoryStream())
            {
                BinaryWriter writer = new BinaryWriter(stream);
                writer.Write(this.Id);
                writer.Write(this.Name);
                writer.Write(this.Address.GetBytes());
                return stream.ToArray();
            }
        }

        public Person()
        {
        }

        // Hand-written deserialization: everything after Id and Name is treated as the Address.
        public Person(byte[] bytes)
        {
            using (var stream = new MemoryStream(bytes))
            {
                BinaryReader reader = new BinaryReader(stream);
                Id = reader.ReadInt32();
                Name = reader.ReadString();
                int bytesForAddressLength = (int)(stream.Length - stream.Position);
                byte[] bytesForAddress = new byte[bytesForAddressLength];
                Array.Copy(bytes, (int)stream.Position, bytesForAddress, 0, bytesForAddressLength);
                Address = new Address(bytesForAddress);
            }
        }
    }

    [ProtoContract]
    [Serializable]
    public class Address
    {
        [ProtoMember(1)]
        public string Line1 { get; set; }
        [ProtoMember(2)]
        public string Line2 { get; set; }

        public byte[] GetBytes()
        {
            using(var stream = new MemoryStream())
            {
                BinaryWriter writer = new BinaryWriter(stream);
                writer.Write(this.Line1);
                writer.Write(this.Line2);
                return stream.ToArray();
            }
        }

        public Address()
        {
        }

        public Address(byte[] bytes)
        {
            using(var stream = new MemoryStream(bytes))
            {
                BinaryReader reader = new BinaryReader(stream);
                Line1 = reader.ReadString();
                Line2 = reader.ReadString();
            }
        }
    }
}
And my results:
Person got created using protocol buffer in 141 milliseconds
Person got created using XML in 676 milliseconds
Person got created using binary in 525 milliseconds
Person got created using binary2 in 79 milliseconds
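The test above only exercises the writing side of the hand-written serializer. As a rough sketch of a full round trip, the variant below writes the fields straight to a shared BinaryWriter and reads them back in the same order. The `HandWrittenSerializer` class and its `Write`/`Read` methods are hypothetical names introduced here, not part of the answer's code, and like the original it assumes no string field is null (BinaryWriter.Write(string) throws on null).

```csharp
using System;
using System.IO;

public class Address
{
    public string Line1 { get; set; }
    public string Line2 { get; set; }
}

public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
    public Address Address { get; set; }
}

public static class HandWrittenSerializer
{
    // Field order is the wire format: Id, Name, Line1, Line2.
    public static void Write(BinaryWriter writer, Person person)
    {
        writer.Write(person.Id);
        writer.Write(person.Name);
        writer.Write(person.Address.Line1);
        writer.Write(person.Address.Line2);
    }

    // Must read the fields in exactly the order Write emits them.
    public static Person Read(BinaryReader reader)
    {
        int id = reader.ReadInt32();
        string name = reader.ReadString();
        string line1 = reader.ReadString();
        string line2 = reader.ReadString();
        return new Person
        {
            Id = id,
            Name = name,
            Address = new Address { Line1 = line1, Line2 = line2 }
        };
    }

    public static void Main()
    {
        var person = new Person
        {
            Id = 12345,
            Name = "Fred",
            Address = new Address { Line1 = "Flat 1", Line2 = "The Meadows" }
        };

        using (var stream = new MemoryStream())
        {
            var writer = new BinaryWriter(stream);
            for (int i = 0; i < 3; i++)
            {
                Write(writer, person); // several records back to back in one stream
            }
            writer.Flush();

            stream.Position = 0;
            var reader = new BinaryReader(stream);
            while (stream.Position < stream.Length)
            {
                Person copy = Read(reader);
                Console.WriteLine(copy.Id + " " + copy.Name + ", " + copy.Address.Line1);
            }
        }
    }
}
```

Because the strings are length-prefixed by BinaryWriter, records can be concatenated in one stream and read back one at a time, which the "rest of the buffer is the Address" approach in the constructors above would not allow. The trade-off of any hand-written format is that it has no versioning: adding or reordering a field breaks existing data.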