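Using some samples I found online, I wrote a few XML serialization methods.
I noticed that the XML string produced by Method1 (SerializeObjectToXmlString in the code below) contains a leading '?'. That seems to be fine when Method2 (DeserializeXmlStringToObject) rebuilds the object.
But during testing in our application, we sometimes get a leading '???' instead. That makes Method2 throw an exception when it tries to rebuild the object. In this case the "object" is just a simple int.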
System.InvalidOperationException was unhandled
  Message="There is an error in XML document (1, 1)."
  Source="System.Xml"
  StackTrace:
       at System.Xml.Serialization.XmlSerializer.Deserialize(XmlReader xmlReader, String encodingStyle, XmlDeserializationEvents events)
       at System.Xml.Serialization.XmlSerializer.Deserialize(XmlReader xmlReader, String encodingStyle)
       at System.Xml.Serialization.XmlSerializer.Deserialize(Stream stream)
       at XMLSerialization.Program.DeserializeXmlStringToObject(String xmlString, String objectType) in C:\Documents and Settings\...Projects\XMLSerialization\Program.cs:line 96
       at XMLSerialization.Program.Main(String[] args) in C:\Documents and Settings\...Projects\XMLSerialization\Program.cs:line 49
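Can anyone explain what might be causing this?
Below is the sample code from a mini test program I wrote, which runs as a VS console application. It displays the XML string. You can also uncomment the region that prepends an extra leading '??' to reproduce the exception.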
using System;
using System.Collections.Generic; // needed only if the "list" regions are uncommented
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

namespace XMLSerialization
{
    class Program
    {
        static void Main(string[] args)
        {
            // serialize to string
            #region int
            object inObj = 5;
            #endregion

            #region string
            //object inObj = "Testing123";
            #endregion

            #region list
            //List<string> inObj = new List<string>();
            //inObj.Add("0:25");
            //inObj.Add("1:26");
            #endregion

            string[] stringArray = SerializeObjectToXmlString(inObj);

            #region include leading ???
            //int indexOfBracket = stringArray[0].IndexOf('<');
            //stringArray[0] = "??" + stringArray[0];
            #endregion

            #region strip out leading ???
            //int indexOfBracket = stringArray[0].IndexOf('<');
            //string trimmedString = stringArray[0].Substring(indexOfBracket);
            //stringArray[0] = trimmedString;
            #endregion

            Console.WriteLine("Input");
            Console.WriteLine("-----");
            Console.WriteLine("Object Type: " + stringArray[1]);
            Console.WriteLine();
            Console.WriteLine("XML String: " + Environment.NewLine + stringArray[0]);
            Console.WriteLine(String.Empty);

            // deserialize back to object
            object outObj = DeserializeXmlStringToObject(stringArray[0], stringArray[1]);

            Console.WriteLine("Output");
            Console.WriteLine("------");

            #region int
            Console.WriteLine("Object: " + (int)outObj);
            #endregion

            #region string
            //Console.WriteLine("Object: " + (string)outObj);
            #endregion

            #region list
            //string[] tempArray;
            //List<string> list = (List<string>)outObj;
            //foreach (string pair in list)
            //{
            //    tempArray = pair.Split(':');
            //    Console.WriteLine(String.Format("Key:{0} Value:{1}", tempArray[0], tempArray[1]));
            //}
            #endregion

            Console.Read();
        }

        private static string[] SerializeObjectToXmlString(object obj)
        {
            XmlTextWriter writer = new XmlTextWriter(new MemoryStream(), Encoding.UTF8);
            writer.Formatting = Formatting.Indented;
            XmlSerializer serializer = new XmlSerializer(obj.GetType());
            serializer.Serialize(writer, obj);

            MemoryStream stream = (MemoryStream)writer.BaseStream;
            string xmlString = UTF8ByteArrayToString(stream.ToArray());

            string objectType = obj.GetType().FullName;

            return new string[]{xmlString, objectType};
        }

        private static object DeserializeXmlStringToObject(string xmlString, string objectType)
        {
            MemoryStream stream = new MemoryStream(StringToUTF8ByteArray(xmlString));
            XmlSerializer serializer = new XmlSerializer(Type.GetType(objectType));

            object obj = serializer.Deserialize(stream);

            return obj;
        }

        private static string UTF8ByteArrayToString(Byte[] characters)
        {
            UTF8Encoding encoding = new UTF8Encoding();
            return encoding.GetString(characters);
        }

        private static byte[] StringToUTF8ByteArray(String pXmlString)
        {
            UTF8Encoding encoding = new UTF8Encoding();
            return encoding.GetBytes(pXmlString);
        }
    }
}
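Answer 0 (score: 9)
When I have run into this before, it usually had to do with encoding. I would try specifying the encoding when you serialize the object. Try the following code. Also, is there any specific reason you need to return a string[] array? I have changed the methods to use generics so that you don't have to specify the type.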
private static string SerializeObjectToXmlString<T>(T obj)
{
    XmlSerializer xmls = new XmlSerializer(typeof(T));
    using (MemoryStream ms = new MemoryStream())
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Encoding = Encoding.UTF8;
        settings.Indent = true;
        settings.IndentChars = "\t";
        settings.NewLineChars = Environment.NewLine;
        settings.ConformanceLevel = ConformanceLevel.Document;

        // XmlWriter.Create is the factory method (XmlTextWriter only inherits it)
        using (XmlWriter writer = XmlWriter.Create(ms, settings))
        {
            xmls.Serialize(writer, obj);
        }

        string xml = Encoding.UTF8.GetString(ms.ToArray());
        return xml;
    }
}

private static T DeserializeXmlStringToObject<T>(string xmlString)
{
    XmlSerializer xmls = new XmlSerializer(typeof(T));

    using (MemoryStream ms = new MemoryStream(Encoding.UTF8.GetBytes(xmlString)))
    {
        return (T)xmls.Deserialize(ms);
    }
}
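If you are still having problems, try using Encoding.ASCII everywhere your code currently uses Encoding.UTF8, unless you have a specific reason to use UTF8. I'm not sure why, but in some serialization cases I have seen UTF8 encoding cause this exact problem.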
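One way to keep UTF-8 while avoiding the stray leading character is to give the writer an encoding that does not emit a byte order mark. The sketch below is an illustration, not the answerer's code: the class name BomFreeXml and its two methods are hypothetical, and it assumes the leading '?' in the question is the BOM that Encoding.UTF8 writes to the stream. UTF8Encoding(false) is the standard way to request UTF-8 without that preamble.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical helper class, named here only for illustration.
public static class BomFreeXml
{
    // Serializes obj to an XML string encoded as UTF-8 without a BOM.
    public static string Serialize<T>(T obj)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(T));

        XmlWriterSettings settings = new XmlWriterSettings();
        // false = do not write the EF BB BF preamble to the stream
        settings.Encoding = new UTF8Encoding(false);
        settings.Indent = true;

        using (MemoryStream ms = new MemoryStream())
        {
            using (XmlWriter writer = XmlWriter.Create(ms, settings))
            {
                serializer.Serialize(writer, obj);
            }
            // The writer is flushed on dispose, so the stream is complete here.
            return Encoding.UTF8.GetString(ms.ToArray());
        }
    }

    // Rebuilds the object directly from the string, with no byte round trip.
    public static T Deserialize<T>(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(T));
        using (StringReader reader = new StringReader(xml))
        {
            return (T)serializer.Deserialize(reader);
        }
    }
}
```

With this sketch, BomFreeXml.Serialize(5) should return a string whose first character is '<', so the XML survives being stored or passed through other layers as plain text.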
Answer 1 (score: 3)
This is the BOM (byte order mark). You can remove it:
if (xmlString.Length > 0 && xmlString[0] != '<')
{
    xmlString = xmlString.Substring(1, xmlString.Length - 1);
}
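The check above removes exactly one character, which would not cover the '???' case from the question. A slightly more defensive variant might look like the sketch below. The class and method names are hypothetical. StripLeadingBom assumes the BOM is still intact as U+FEFF, while StripBeforeFirstTag mirrors the question's own "strip out leading ???" region and assumes nothing meaningful ever precedes the first '<'.

```csharp
using System;

// Hypothetical helper class, named here only for illustration.
public static class BomStripper
{
    // Removes any leading U+FEFF characters (an intact BOM decoded into the string).
    public static string StripLeadingBom(string xml)
    {
        if (string.IsNullOrEmpty(xml))
        {
            return xml;
        }
        return xml.TrimStart('\uFEFF');
    }

    // Drops everything before the first '<', for the case where the BOM
    // has already been mangled into literal '?' characters.
    public static string StripBeforeFirstTag(string xml)
    {
        if (string.IsNullOrEmpty(xml))
        {
            return xml;
        }
        int indexOfBracket = xml.IndexOf('<');
        return indexOfBracket > 0 ? xml.Substring(indexOfBracket) : xml;
    }
}
```

Stripping treats the symptom only. If the string is later written back out with an encoding that emits a BOM, the character can reappear.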
Alternatively, serialize to a UTF-16 string with a StringWriter:
string result;
using (StringWriter writer = new StringWriter(CultureInfo.InvariantCulture))
{
    serializer.Serialize(writer, instance);
    result = writer.ToString();
}
And deserialize:
object result;
using (StringReader reader = new StringReader(instance))
{
    result = serializer.Deserialize(reader);
}
If you only use this code inside .NET applications, UTF-16 causes no problems, because it is the native encoding of .NET strings.
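Put together, the StringWriter and StringReader approach can be sketched as a pair of generic helpers. The class name StringXml is hypothetical. Because no byte stream is involved, no BOM is ever produced, and the XML declaration reports encoding="utf-16" since that is what StringWriter advertises.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Xml.Serialization;

// Hypothetical helper class, named here only for illustration.
public static class StringXml
{
    // Serializes straight into a string; no bytes, so no BOM.
    public static string Serialize<T>(T instance)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(T));
        using (StringWriter writer = new StringWriter(CultureInfo.InvariantCulture))
        {
            serializer.Serialize(writer, instance);
            return writer.ToString();
        }
    }

    // Reads the object back from the same kind of string.
    public static T Deserialize<T>(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(T));
        using (StringReader reader = new StringReader(xml))
        {
            return (T)serializer.Deserialize(reader);
        }
    }
}
```

One caveat: if the resulting string is later saved to a file as UTF-8, its declaration will still say utf-16, and a reader that works from the raw bytes may reject it.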