XmlWriter encoding issue

Time: 2009-05-14 13:50:59

Tags: .net xml encoding xmlwriter

I have the following code:

    MemoryStream ms = new MemoryStream();
    XmlWriter w = XmlWriter.Create(ms);

    w.WriteStartDocument(true);
    w.WriteStartElement("data");

    w.WriteElementString("child", "myvalue");

    w.WriteEndElement();//data
    w.Close();
    ms.Close();

    string test = UTF8Encoding.UTF8.GetString(ms.ToArray());

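The XML is generated correctly; however, my problem is that the first character of the string 'test' is ï (char #239), which makes it invalid for some XML parsers. Where is this coming from? What am I doing wrong?

I know I could work around the problem by starting after the first character, but I would rather know why it is there than just patch over it.

Thanks!

5 answers:

Answer 0: (score: 13)

Found a solution here: http://www.timvw.be/generating-utf-8-with-systemxmlxmlwriter/

I was missing this at the top: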

    XmlWriterSettings xmlWriterSettings = new XmlWriterSettings();
    xmlWriterSettings.Encoding = new UTF8Encoding(false);
    MemoryStream ms = new MemoryStream();
    XmlWriter w = XmlWriter.Create(ms, xmlWriterSettings);

Thanks, everyone, for the help!
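For context on why `new UTF8Encoding(false)` helps: the bytes EF BB BF are the UTF-8 byte order mark (BOM), and 0xEF is 239, which matches the character described in the question. The following is a minimal sketch, not part of the original posts, that writes the same document twice and compares the leading bytes. `BomDemo` and `WriteXml` are illustrative names chosen here.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

static class BomDemo
{
    // Writes the small document from the question with the given encoding
    // and returns the raw bytes that ended up in the stream.
    public static byte[] WriteXml(Encoding encoding)
    {
        var settings = new XmlWriterSettings { Encoding = encoding };
        using (var ms = new MemoryStream())
        {
            using (XmlWriter w = XmlWriter.Create(ms, settings))
            {
                w.WriteStartDocument(true);
                w.WriteStartElement("data");
                w.WriteElementString("child", "myvalue");
                w.WriteEndElement(); // data
            } // disposing the writer flushes it into ms

            return ms.ToArray();
        }
    }

    static void Main()
    {
        // true = emit the UTF-8 identifier (BOM); false = do not emit it.
        byte[] withBom = WriteXml(new UTF8Encoding(true));
        byte[] withoutBom = WriteXml(new UTF8Encoding(false));

        // Expected to show EF-BB-BF for the first case.
        Console.WriteLine(BitConverter.ToString(withBom, 0, 3));
        // Expected to show '<', the start of the XML declaration.
        Console.WriteLine((char)withoutBom[0]);
    }
}
```

With the BOM suppressed, decoding the bytes with a UTF-8 decoder should give a string that starts directly with `<?xml`.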

Answer 1: (score: 2)

The problem is that the XML generated by the writer is UTF-16, while you are using UTF-8 to convert it to a string. Try this:

    StringBuilder sb = new StringBuilder();
    using (StringWriter writer = new StringWriter(sb))
    using (XmlWriter w = XmlWriter.Create(writer))
    {
        w.WriteStartDocument(true);
        w.WriteStartElement("data");

        w.WriteElementString("child", "myvalue");

        w.WriteEndElement();//data
    }

    string test = sb.ToString();
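One side effect of writing through a `StringWriter` is worth noting: `StringWriter.Encoding` reports UTF-16, so the XML declaration in the resulting string will say `encoding="utf-16"`. If the string is later stored as UTF-8, a common workaround is a `StringWriter` subclass that reports UTF-8 instead. The sketch below assumes that approach; `Utf8StringWriter` and `XmlToStringDemo` are hypothetical names, not framework types.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper: a StringWriter that reports UTF-8 as its encoding,
// so XmlWriter puts encoding="utf-8" in the XML declaration.
sealed class Utf8StringWriter : StringWriter
{
    private static readonly Encoding Utf8NoBom = new UTF8Encoding(false);

    public override Encoding Encoding
    {
        get { return Utf8NoBom; }
    }
}

static class XmlToStringDemo
{
    public static string Build()
    {
        using (var writer = new Utf8StringWriter())
        {
            using (XmlWriter w = XmlWriter.Create(writer))
            {
                w.WriteStartDocument(true);
                w.WriteStartElement("data");
                w.WriteElementString("child", "myvalue");
                w.WriteEndElement(); // data
            } // disposing the XmlWriter flushes it into the StringWriter

            return writer.ToString();
        }
    }

    static void Main()
    {
        // The declaration should now name utf-8 rather than utf-16.
        Console.WriteLine(Build());
    }
}
```

Because a string has no byte order mark of its own, the result starts directly with the XML declaration.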

Answer 2: (score: 1)

Answer 3: (score: 0)

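You can change the encoding like this. Note that the `Settings` property of an existing writer is read-only, so the encoding has to be set on the `XmlWriterSettings` passed to `XmlWriter.Create`:

    XmlWriterSettings settings = new XmlWriterSettings { Encoding = Encoding.UTF8 };
    XmlWriter w = XmlWriter.Create(ms, settings);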

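Answer 4: (score: 0)

All of these are slightly off if you care about the byte order mark, which editors rely on (for example, Visual Studio uses it to correctly detect UTF-8 encoded XML and apply syntax highlighting).

Here is a solution: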

    MemoryStream stream = new MemoryStream();
    string result;

    XmlWriterSettings settings = new XmlWriterSettings();
    settings.Encoding = Encoding.UTF8;
    settings.Indent = true;
    settings.IndentChars = "\t";

    using (XmlWriter writer = XmlWriter.Create(stream, settings))
    {
        // ... write

        // Make sure you flush or you only get half the text
        writer.Flush();

        // Rewind, then decode with a StreamReader, which detects the
        // byte order mark and leaves it out of the resulting string
        stream.Seek(0, SeekOrigin.Begin);
        StreamReader reader = new StreamReader(stream, Encoding.UTF8, true);
        result = reader.ReadToEnd();
    }

I have put the 2 complete snippets here
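To illustrate the difference this answer relies on, here is a small sketch, not taken from the original posts, that decodes the same BOM-prefixed bytes two ways. `Encoding.UTF8.GetString` keeps the BOM as a leading U+FEFF character, while a `StreamReader` with BOM detection enabled drops it. `BomDecodeDemo`, `DecodeWithReader` and the sample input are illustrative assumptions.

```csharp
using System;
using System.IO;
using System.Text;

static class BomDecodeDemo
{
    // Decodes via StreamReader with BOM detection turned on;
    // the BOM is consumed and does not appear in the returned string.
    public static string DecodeWithReader(byte[] bytes)
    {
        using (var reader = new StreamReader(new MemoryStream(bytes), Encoding.UTF8, true))
        {
            return reader.ReadToEnd();
        }
    }

    // Builds sample input: the UTF-8 BOM (EF BB BF) followed by "<data />".
    public static byte[] SampleBytes()
    {
        byte[] body = Encoding.UTF8.GetBytes("<data />");
        byte[] bytes = new byte[3 + body.Length];
        bytes[0] = 0xEF;
        bytes[1] = 0xBB;
        bytes[2] = 0xBF;
        Array.Copy(body, 0, bytes, 3, body.Length);
        return bytes;
    }

    static void Main()
    {
        byte[] bytes = SampleBytes();

        string raw = Encoding.UTF8.GetString(bytes); // BOM survives as U+FEFF
        string clean = DecodeWithReader(bytes);      // BOM is stripped

        Console.WriteLine((int)raw[0]); // 65279, i.e. U+FEFF
        Console.WriteLine(clean);       // <data />
    }
}
```

This lets the bytes keep their BOM for editors and other tools, while the in-memory string stays free of it.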