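I am trying to deserialize messages posted by a script on a particular website. I looked at the script and noticed that it uses protobufjs. The message structure is loaded from a JSON file on the server, which looks like this when stringified: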
{
  "nested": {
    "InteractionCollection": {
      "fields": {
        "interactions": {
          "rule": "repeated",
          "type": "Interaction",
          "id": 1
        },
        "mouseMovements": {
          "rule": "repeated",
          "type": "MouseMovement",
          "id": 2
        },
        "url": {
          "type": "string",
          "id": 3
        },
        "flags": {
          "rule": "repeated",
          "type": "Flag",
          "id": 4
        }
      },
      "nested": {
        "Interaction": {
          "fields": {
            "type": {
              "type": "string",
              "id": 1
            },
            "time": {
              "type": "int64",
              "id": 2
            },
            "elementId": {
              "type": "string",
              "id": 3
            },
            "elementType": {
              "type": "string",
              "id": 4
            },
            "additionalInfo": {
              "type": "string",
              "id": 5
            }
          }
        },
        "MouseMovement": {
          "fields": {
            "time": {
              "type": "int64",
              "id": 1
            },
            "x": {
              "type": "int64",
              "id": 2
            },
            "y": {
              "type": "int64",
              "id": 3
            },
            "wx": {
              "type": "int64",
              "id": 4
            },
            "wy": {
              "type": "int64",
              "id": 5
            }
          }
        },
        "Flag": {
          "fields": {
            "time": {
              "type": "int64",
              "id": 1
            },
            "name": {
              "type": "string",
              "id": 2
            }
          }
        }
      }
    }
  }
}
It then creates a new instance of the "InteractionCollection" message and pushes new interactions, mouseMovements, and flags to it.
var instance = message.create({
    url: "someurl",
    interactions: [],
    mouseMovements: []
});
var some_interaction = interaction_message.create({
    time: Date.now(),
    elementId: "idstring",
    elementType: "typestring",
    type: "anotherstring",
    additionalInfo: "infostring"
});
instance.interactions.push(some_interaction);
At the end of the script, it posts the data to the server in serialized form, like this:
navigator.sendBeacon("someserverpath", message.encode(instance).finish());
I am using C#, so I installed the official Google Protocol Buffers package (Google.Protobuf) via NuGet and created a proto file to replicate the JSON descriptor above:
syntax = "proto3";
option csharp_namespace = "Proto2"; // C# project name: Proto2

message InteractionCollection {
    repeated Interaction interactions = 1;
    repeated MouseMovement mouse_movements = 2;
    string url = 3;
    repeated Flag flags = 4;

    message Flag {
        int64 time = 1;
        string name = 2;
    }

    message Interaction {
        string type = 1;
        int64 time = 2;
        string element_id = 3;
        string element_type = 4;
        string additional_info = 5;
    }

    message MouseMovement {
        int64 time = 1;
        int64 x = 2;
        int64 y = 3;
        int64 wx = 4;
        int64 wy = 5;
    }
}
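I then compiled the proto file with the protoc.exe that ships with the NuGet package and included the resulting classes in my project. After that, I created a test InteractionCollection and serialized it: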
InteractionCollection collection = new InteractionCollection
{
    Url = "/",
    Interactions = {
        new Interaction { Time = 1508602241363, ElementId = "DOM", ElementType = "DOM", Type = "tohru", AdditionalInfo = "" },
        new Interaction { Time = 1508602243075, Type = "focus", AdditionalInfo = "" },
    },
};
using (var output = File.Create("csharp_out.dat"))
{
    collection.WriteTo(output);
}
On the website, the script serialized the same message:
{
"interactions":[
{"type":"tohru","time":"1508602241363","elementId":"DOM","elementType":"DOM","additionalInfo":""},
{"type":"focus","time":"1508602243075","additionalInfo":""}
],
"url":"/"
}
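However, the data I get from the C# project differs slightly from the data posted by the website. Is my proto file wrong, or is there another reason? As it stands, this means I cannot deserialize the data from the website.
C#: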
tohruÓÂÍýó+DOM"DOM
focusƒÐÍýó+/
or
\x0A\x18\x0A\x05\x74\x6F\x68\x72\x75\x10\xC3\x93\xC3\x82\xC3\x8D\xC3\xBD\xC3\xB3\x2B\x1A\x03\x44\x4F\x4D\x22\x03\x44\x4F\x4D\x0A\x0E\x0A\x05\x66\x6F\x63\x75\x73\x10\xC6\x92\xC3\x90\xC3\x8D\xC3\xBD\xC3\xB3\x2B\x1A\x01\x2F
JS:
tohruÓÂÍýó+DOM"DOM*
focusÐÍýó+*/
or
\x1A\x0A\x05\x74\x6F\x68\x72\x75\x10\xC3\x93\xC3\x82\xC3\x8D\xC3\xBD\xC3\xB3\x2B\x1A\x03\x44\x4F\x4D\x22\x03\x44\x4F\x4D\x2A\x0A\x10\x0A\x05\x66\x6F\x63\x75\x73\x10\xC2\x83\xC3\x90\xC3\x8D\xC3\xBD\xC3\xB3\x2B\x2A\x1A\x01\x2F
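To make the dumps easier to read, here is a minimal sketch of how the individual bytes can be interpreted under the protobuf wire format. It assumes the dumps were re-encoded from Latin-1 to UTF-8 at some point, so that a pair such as \xC3\x93 stands for the single raw byte 0xD3. The helper names decodeTag and decodeVarint are made up for this illustration and are not part of protobufjs or Google.Protobuf.

```javascript
// Split a protobuf tag byte into field number and wire type.
// Only valid for single-byte tags (field numbers 1-15), which covers this schema.
function decodeTag(byte) {
  return { field: byte >>> 3, wireType: byte & 0x07 };
}

// Decode a base-128 varint starting at `offset`.
// Returns the value as a BigInt plus the offset of the next unread byte.
function decodeVarint(bytes, offset = 0) {
  let value = 0n;
  let shift = 0n;
  let pos = offset;
  while (true) {
    if (pos >= bytes.length) throw new Error("truncated varint");
    const b = bytes[pos++];
    value |= BigInt(b & 0x7f) << shift; // low 7 bits carry the payload
    if ((b & 0x80) === 0) break;        // high bit clear: last byte
    shift += 7n;
  }
  return { value, next: pos };
}

// Assumed raw bytes of the first "time" field: tag 0x10, then the varint.
const timeField = [0x10, 0xd3, 0xc2, 0xcd, 0xfd, 0xf3, 0x2b];
const tag = decodeTag(timeField[0]);
const time = decodeVarint(timeField, 1);
console.log(tag.field, tag.wireType, time.value.toString());
```

Under that assumption the time value decodes to 1508602241363, which matches the first interaction. Read the same way, \x0A is field 1 (length-delimited) and \x2A is field 5 (length-delimited), i.e. additionalInfo, which only shows up in the JS dump.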
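I'm sorry for the long question, but I have been fiddling with this for over a week and I really can't figure out what the problem is. Thanks for your patience!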
Update
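After some testing with protobufjs running locally, I noticed that the C# version of Protobuf treats an empty string as if it were null and therefore leaves the whole field out (as when you omit an optional field), whereas protobufjs serializes it as: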
\x12\x00
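The two behaviors can be reproduced by hand with a small sketch. It is not how either library is implemented; the function names are hypothetical, and it only handles field numbers 1-15 and strings shorter than 128 bytes so that the tag and the length each fit in one byte.

```javascript
// Encode a length-delimited string field: tag byte, length byte, UTF-8 payload.
function encodeStringField(fieldNumber, text) {
  const payload = Array.from(new TextEncoder().encode(text));
  if (fieldNumber > 15 || payload.length > 127) {
    throw new Error("outside the limits of this sketch");
  }
  const tag = (fieldNumber << 3) | 2; // wire type 2 = length-delimited
  return [tag, payload.length, ...payload];
}

// Behavior observed with protobufjs: a field that was set is written, even if empty.
function encodeExplicit(fieldNumber, text) {
  return encodeStringField(fieldNumber, text);
}

// Behavior observed with Google.Protobuf: an empty string is skipped entirely.
function encodeSkippingDefault(fieldNumber, text) {
  return text === "" ? [] : encodeStringField(fieldNumber, text);
}

// Format bytes in the same \xNN style used in this question.
const toHex = (bytes) =>
  bytes.map((b) => "\\x" + b.toString(16).toUpperCase().padStart(2, "0")).join("");

console.log(toHex(encodeExplicit(2, "")));        // \x12\x00
console.log(toHex(encodeSkippingDefault(2, ""))); // empty line: nothing is written
console.log(toHex(encodeExplicit(5, "")));        // \x2A\x00
```

The last call uses field 5, the id of additionalInfo in the descriptor, which would account for the extra \x2A in the JS dump.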
I also ran the same test with protobuf-net instead of the Google version, and it serialized the empty string as:
\x12\x20
Is there a way to change this behavior to match what protobufjs does?