I have the following code:
MemoryStream ms = new MemoryStream();
XmlWriter w = XmlWriter.Create(ms);
w.WriteStartDocument(true);
w.WriteStartElement("data");
w.WriteElementString("child", "myvalue");
w.WriteEndElement();//data
w.Close();
ms.Close();
string test = UTF8Encoding.UTF8.GetString(ms.ToArray());
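The XML is generated correctly; however, the first character of the string 'test' is ï (char #239), which makes it invalid for some XML parsers. Where does this come from? What exactly am I doing wrong?
I know I could work around the problem by starting after the first character, but I would rather understand why it is there than just patch over it.
Thanks!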
Answer 0 (score: 13)
Found a solution here: http://www.timvw.be/generating-utf-8-with-systemxmlxmlwriter/
I was missing this at the top:
XmlWriterSettings xmlWriterSettings = new XmlWriterSettings();
xmlWriterSettings.Encoding = new UTF8Encoding(false);
MemoryStream ms = new MemoryStream();
XmlWriter w = XmlWriter.Create(ms, xmlWriterSettings);
Thanks for the help, everyone!
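For readers wondering why `new UTF8Encoding(false)` makes a difference: the constructor argument controls whether the encoding emits a byte order mark (BOM). The small sketch below only compares the preambles of the two encodings; it is an illustration, not part of the linked solution, and the class name `PreambleDemo` is made up for the example. The first BOM byte, 0xEF, is 239 in decimal, which matches the character code reported in the question.

```csharp
using System;
using System.Text;

class PreambleDemo
{
    static void Main()
    {
        // Encoding.UTF8 is configured to emit a byte order mark.
        byte[] withBom = Encoding.UTF8.GetPreamble();

        // UTF8Encoding(false) is configured to emit no byte order mark.
        byte[] withoutBom = new UTF8Encoding(false).GetPreamble();

        Console.WriteLine(BitConverter.ToString(withBom)); // EF-BB-BF
        Console.WriteLine(withoutBom.Length);              // 0
    }
}
```

Assuming the writer uses the first encoding by default, those three bytes end up at the start of the MemoryStream and are what shows up at the front of the decoded string.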
Answer 1 (score: 2)
The problem is that the XML generated by the writer is UTF-16, while you are converting it to a string using UTF-8. Try this:
StringBuilder sb = new StringBuilder();
using (StringWriter writer = new StringWriter(sb))
using (XmlWriter w = XmlWriter.Create(writer))
{
    w.WriteStartDocument(true);
    w.WriteStartElement("data");
    w.WriteElementString("child", "myvalue");
    w.WriteEndElement(); // data
}
string test = sb.ToString();
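One thing to be aware of with the StringWriter approach: the writer takes its encoding from the StringWriter, which reports UTF-16, so the XML declaration in the resulting string will say encoding="utf-16". If a UTF-8 declaration is needed, a commonly used workaround is a small subclass that overrides the Encoding property. The sketch below assumes that workaround; `Utf8StringWriter` is a hypothetical helper name, not a framework class.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper: a StringWriter that reports UTF-8 as its encoding,
// so that XmlWriter puts encoding="utf-8" in the XML declaration.
class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding
    {
        get { return Encoding.UTF8; }
    }
}

class Program
{
    static void Main()
    {
        string test;
        using (StringWriter writer = new Utf8StringWriter())
        {
            using (XmlWriter w = XmlWriter.Create(writer))
            {
                w.WriteStartDocument(true);
                w.WriteStartElement("data");
                w.WriteElementString("child", "myvalue");
                w.WriteEndElement(); // data
            }
            // Read the text only after the XmlWriter has been disposed and flushed.
            test = writer.ToString();
        }
        Console.WriteLine(test);
    }
}
```

Because the output goes to a TextWriter rather than a byte stream, no byte order mark is written into the string in either variant; only the declared encoding name changes.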
Answer 2 (score: 1)
Answer 3 (score: 0)
You can change the encoding like this:
w.Settings.Encoding = Encoding.UTF8;
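A caveat worth checking before relying on this: as far as I know, the settings object exposed by a writer created through XmlWriter.Create is read-only, so assigning to it after creation is rejected with an exception instead of changing the encoding. The sketch below is a minimal way to test that on your own runtime; it catches a general Exception and prints the type rather than assuming a specific one.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

class Program
{
    static void Main()
    {
        using (MemoryStream ms = new MemoryStream())
        using (XmlWriter w = XmlWriter.Create(ms))
        {
            try
            {
                // Attempt to change the encoding after the writer already exists.
                w.Settings.Encoding = new UTF8Encoding(false);
                Console.WriteLine("Encoding changed");
            }
            catch (Exception ex)
            {
                Console.WriteLine("Rejected: " + ex.GetType().Name);
            }

            // Write a root element so the document is well formed on close.
            w.WriteElementString("data", "x");
        }
    }
}
```

If the assignment is rejected, the encoding has to be set on an XmlWriterSettings instance that is passed to XmlWriter.Create, as in Answer 0.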
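Answer 4 (score: 0)
All of these are slightly off if you care about the byte order mark, which editors rely on (for example, Visual Studio uses it to detect UTF-8 encoded XML correctly and apply syntax highlighting).
Here is a solution: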
string result;
MemoryStream stream = new MemoryStream();
XmlWriterSettings settings = new XmlWriterSettings();
settings.Encoding = Encoding.UTF8;
settings.Indent = true;
settings.IndentChars = "\t";
using (XmlWriter writer = XmlWriter.Create(stream, settings))
{
    // ... write
    // Make sure you flush or you only get half the text
    writer.Flush();
    // Use a StreamReader (with BOM detection enabled) to decode the bytes correctly
    StreamReader reader = new StreamReader(stream, Encoding.UTF8, true);
    // Rewind the stream before reading it back
    stream.Seek(0, SeekOrigin.Begin);
    result = reader.ReadToEnd();
}
I have the 2 snippets in full here