I am serializing an object that contains HTML data in a String property.
Dim Formatter As New Xml.Serialization.XmlSerializer(GetType(MyObject))
Dim fs As New FileStream(FilePath, FileMode.Create)
Formatter.Serialize(fs, Ob)
fs.Close()
But when I read the XML back into an object:
Dim Formatter As New Xml.Serialization.XmlSerializer(GetType(MyObject))
Dim fs As New FileStream(FilePath, FileMode.Open)
Dim Ob = CType(Formatter.Deserialize(fs), MyObject)
fs.Close()
I get this error:
"'', hexadecimal value 0x14, is an invalid character. Line 395, position 22."
Shouldn't .NET prevent this kind of error by escaping invalid characters?
What is happening here, and how can I fix it?
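Answer 0 (score: 6)
I set the CheckCharacters property of XmlReaderSettings to false. I only recommend this if you serialized the data yourself with XmlSerializer. If the XML comes from an unknown source, it is not a good idea.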
public static T Deserialize<T>(string xml)
{
    // Turn off character checking so that references such as &#x14; are accepted.
    var xmlReaderSettings = new XmlReaderSettings() { CheckCharacters = false };
    using (XmlReader xmlReader = XmlReader.Create(new StringReader(xml), xmlReaderSettings))
    {
        XmlSerializer xs = new XmlSerializer(typeof(T));
        return (T)xs.Deserialize(xmlReader);
    }
}
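To see the effect of the setting in isolation, here is a minimal sketch that reads the same XML twice, once with character checking off and once with it on. The `Note` type, the `CheckCharactersDemo` class and the sample XML string are made up for this illustration and are not from the question. As far as I know, `CheckCharacters = false` only relaxes the check on character references like `&#x14;`, so treat this as a sketch under that assumption rather than a general fix.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical type, used only for this illustration.
public class Note
{
    public string Body { get; set; }
}

public static class CheckCharactersDemo
{
    // Same idea as the method above, with the setting exposed as a parameter.
    public static T Deserialize<T>(string xml, bool checkCharacters)
    {
        var settings = new XmlReaderSettings { CheckCharacters = checkCharacters };
        using (var stringReader = new StringReader(xml))
        using (XmlReader reader = XmlReader.Create(stringReader, settings))
        {
            var serializer = new XmlSerializer(typeof(T));
            return (T)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        // &#x14; refers to U+0014, which XML 1.0 does not allow.
        string xml = "<Note><Body>a&#x14;b</Body></Note>";

        // With checking off, the control character comes back in the string.
        Note relaxed = Deserialize<Note>(xml, checkCharacters: false);
        Console.WriteLine(relaxed.Body == "a\u0014b");

        try
        {
            // With checking on (the default), the reader rejects the reference.
            Deserialize<Note>(xml, checkCharacters: true);
        }
        catch (InvalidOperationException ex)
        {
            // XmlSerializer wraps the underlying XmlException.
            Console.WriteLine(ex.InnerException?.Message);
        }
    }
}
```

The strict call is left in to show that the default reader settings reproduce the kind of error described in the question.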
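Answer 1 (score: 2)
It really should have failed at the serialization step, because 0x14 is an invalid value for XML. There is no way to escape it, not even with a numeric character reference such as &#x14;, because it is excluded from the XML model as a valid character. I am honestly surprised that the serializer lets it through, since that makes the serializer non-conformant.
Is it possible to remove the invalid characters from the string before serializing? And for what purpose do you have 0x14 in your HTML?
Or could it be that you are writing with one encoding and reading with another?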
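If dropping the offending characters is acceptable, a minimal sketch of the clean-up step could look like the following. `XmlTextCleaner` and `StripInvalidXmlChars` are hypothetical names that are not part of the question; the check relies on `XmlConvert.IsXmlChar` and `XmlConvert.IsXmlSurrogatePair`, which I believe are available from .NET Framework 4.0 onward. Keep in mind that this silently discards data, so it only fits cases where the control characters carry no meaning.

```csharp
using System;
using System.Text;
using System.Xml;

public static class XmlTextCleaner
{
    // Hypothetical helper: returns a copy of the input without characters
    // that XML 1.0 does not allow (for example U+0014).
    public static string StripInvalidXmlChars(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return text;
        }

        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (XmlConvert.IsXmlChar(c))
            {
                sb.Append(c);
            }
            else if (i + 1 < text.Length && XmlConvert.IsXmlSurrogatePair(text[i + 1], c))
            {
                // Keep valid surrogate pairs (characters above U+FFFF) intact.
                sb.Append(c).Append(text[i + 1]);
                i++;
            }
            // Anything else is dropped.
        }
        return sb.ToString();
    }

    public static void Main()
    {
        string dirty = "<p>Hello\u0014World</p>";
        Console.WriteLine(StripInvalidXmlChars(dirty));
    }
}
```

You would call the helper on the HTML string right before assigning it to the property that gets serialized.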
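Answer 2 (score: 1)
You should post the code of the class you are trying to serialize and deserialize. In the meantime, I'll guess.
Most likely, the invalid character is in a field or property of type string. Assuming you cannot avoid having that character present, you will need to serialize it as a byte array: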
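using System.Xml.Serialization;

[XmlRoot("root")]
public class HasBase64Content
{
    internal HasBase64Content()
    {
    }

    // The plain string is kept out of the XML.
    [XmlIgnore]
    public string Content { get; set; }

    // XmlSerializer writes byte arrays as Base64 text.
    [XmlElement]
    public byte[] Base64Content
    {
        get
        {
            // GetBytes throws on null, so pass null through unchanged.
            if (Content == null)
            {
                return null;
            }
            return System.Text.Encoding.UTF8.GetBytes(Content);
        }
        set
        {
            if (value == null)
            {
                Content = null;
                return;
            }
            Content = System.Text.Encoding.UTF8.GetString(value);
        }
    }
}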
This produces XML like the following:
<?xml version="1.0" encoding="utf-8"?>
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Base64Content>AAECAwQFFA==</Base64Content>
</root>
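As a quick check that the control character survives inside the Base64 text, here is a small sketch that decodes the value from the `<Base64Content>` element above. `Base64Demo` and `Decode` are names invented for this illustration.

```csharp
using System;
using System.Text;

public static class Base64Demo
{
    // Decodes Base64 text back into the original string, assuming UTF-8.
    public static string Decode(string base64)
    {
        byte[] bytes = Convert.FromBase64String(base64);
        return Encoding.UTF8.GetString(bytes);
    }

    public static void Main()
    {
        // Value copied from the <Base64Content> element above.
        byte[] bytes = Convert.FromBase64String("AAECAwQFFA==");
        Console.WriteLine(BitConverter.ToString(bytes)); // prints 00-01-02-03-04-05-14

        string content = Decode("AAECAwQFFA==");
        Console.WriteLine(content[content.Length - 1] == '\u0014');
    }
}
```

The last byte is 0x14, the same character the XML reader rejected when it appeared in a string element.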
I have a feeling you might prefer VB.NET:
Imports System.Xml.Serialization

<XmlRoot("root")> _
Public Class HasBase64Content
    Private _content As String

    ' The plain string is kept out of the XML.
    <XmlIgnore()> _
    Public Property Content() As String
        Get
            Return _content
        End Get
        Set(ByVal value As String)
            _content = value
        End Set
    End Property

    ' XmlSerializer writes byte arrays as Base64 text.
    <XmlElement()> _
    Public Property Base64Content() As Byte()
        Get
            ' GetBytes throws on Nothing, so pass Nothing through unchanged.
            If Content Is Nothing Then
                Return Nothing
            End If
            Return System.Text.Encoding.UTF8.GetBytes(Content)
        End Get
        Set(ByVal value As Byte())
            If value Is Nothing Then
                Content = Nothing
                Return
            End If
            Content = System.Text.Encoding.UTF8.GetString(value)
        End Set
    End Property
End Class
Answer 3 (score: 0)
I would expect .NET to handle this, but you could also look at the XmlSerializer class and XmlReaderSettings (see the sample generic method below):
public static T Deserialize<T>(string xml)
{
    var xmlReaderSettings = new XmlReaderSettings()
    {
        ConformanceLevel = ConformanceLevel.Fragment,
        ValidationType = ValidationType.None
    };
    XmlReader xmlReader = XmlReader.Create(new StringReader(xml), xmlReaderSettings);
    XmlSerializer xs = new XmlSerializer(typeof(T), "");
    return (T)xs.Deserialize(xmlReader);
}
I would also check the code for encoding (Unicode, UTF8, etc.) issues. The hex value 0x14 is not a char you would expect in XML :)