How to return XML as UTF-8 instead of UTF-16

Asked: 2014-09-08 18:30:56

Tags: c# xml utf-8 xml-serialization

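I am using a routine that serializes <T>. It works, but when the result is downloaded to the browser I see a blank page. If I view the page source or open the download in a text editor I can see the XML, but it is UTF-16. I suspect that is why the browser shows a blank page.

How can I modify the serializer routine to return UTF-8 instead of UTF-16?

The XML source returned: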

<?xml version="1.0" encoding="utf-16"?>
<ArrayOfString xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <string>January</string>
  <string>February</string>
  <string>March</string>
  <string>April</string>
  <string>May</string>
  <string>June</string>
  <string>July</string>
  <string>August</string>
  <string>September</string>
  <string>October</string>
  <string>November</string>
  <string>December</string>
  <string />
</ArrayOfString>

Sample call to the serializer:

DateTimeFormatInfo dateTimeFormatInfo = new DateTimeFormatInfo();
var months = dateTimeFormatInfo.MonthNames.ToList();

string SelectionId = "1234567890";

return new XmlResult<List<string>>(SelectionId)
{
    Data = months
};

The Serializer:

public class XmlResult<T> : ActionResult
{
    // MM = month, HH = 24-hour clock, mm = minutes (lowercase "mm" alone means minutes, not month)
    private string filename = DateTime.Now.ToString("ddMMyyyyHHmmss");

    public T Data { private get; set; }

    public XmlResult(string selectionId = "")
    {
        if (selectionId != "")
        {
            filename = selectionId;
        }
    }

    public override void ExecuteResult(ControllerContext context)
    {
        HttpContextBase httpContextBase = context.HttpContext;
        httpContextBase.Response.Buffer = true;
        httpContextBase.Response.Clear();

        httpContextBase.Response.AddHeader("content-disposition", "attachment; filename=" + filename + ".xml");
        httpContextBase.Response.ContentType = "text/xml";

        using (StringWriter writer = new StringWriter())
        {
            XmlSerializer xml = new XmlSerializer(typeof(T));
            xml.Serialize(writer, Data);
            httpContextBase.Response.Write(writer);
        }
    }
}

3 answers:

Answer 0 (score: 20)

You can use a StringWriter that forces UTF-8. Here is one way to do it:

public class Utf8StringWriter : StringWriter
{
    // Use UTF8 encoding but write no BOM to the wire
    public override Encoding Encoding
    {
         get { return new UTF8Encoding(false); } // in real code I'll cache this encoding.
    }
}

Then use the Utf8StringWriter as the writer in your code.

using (StringWriter writer = new Utf8StringWriter())
{
    XmlSerializer xml = new XmlSerializer(typeof(T));
    xml.Serialize(writer, Data);
    httpContextBase.Response.Write(writer);
}

This answer was inspired by "Serializing an object as UTF-8 XML in .NET".
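The code comment above mentions caching the encoding. A minimal sketch of that idea follows, assuming a console program; the names CachedUtf8StringWriter and CachedUtf8Demo are illustrative and not part of the original answer. Note that this only changes the encoding named in the XML declaration: the string held by the writer is still an ordinary .NET string, and the bytes sent to the client depend on how the response encodes it.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Serialization;

// Hypothetical variant of Utf8StringWriter that reuses one encoding instance.
public sealed class CachedUtf8StringWriter : StringWriter
{
    // UTF-8 without a byte order mark, created once and shared.
    private static readonly Encoding Utf8NoBom = new UTF8Encoding(false);

    public override Encoding Encoding
    {
        get { return Utf8NoBom; }
    }
}

public static class CachedUtf8Demo
{
    // Serializes any XmlSerializer-compatible value to a string
    // whose XML declaration names the writer's encoding.
    public static string Serialize<T>(T data)
    {
        using (StringWriter writer = new CachedUtf8StringWriter())
        {
            new XmlSerializer(typeof(T)).Serialize(writer, data);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(Serialize(new[] { "January", "February" }));
    }
}
```

The static readonly field avoids allocating a new UTF8Encoding every time the serializer reads the Encoding property.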

Answer 1 (score: 6)

Response encoding

I am not very familiar with this part of the framework, but according to MSDN you can set the content encoding of an HttpResponse like this:

httpContextBase.Response.ContentEncoding = Encoding.UTF8;

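The encoding the XmlSerializer sees

After reading your question again, I see that this is the tough part. The problem lies in the use of StringWriter. Because .NET strings are always stored as UTF-16 (citation needed ^^), StringWriter reports UTF-16 as its encoding. The XmlSerializer therefore writes the XML declaration as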

<?xml version="1.0" encoding="utf-16"?>
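The behavior described above can be observed in a few lines. This is a minimal sketch, assuming a console program; EncodingDemo and Declaration are illustrative names that do not appear in the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public static class EncodingDemo
{
    // Serializes a list through a plain StringWriter and returns
    // only the XML declaration at the start of the output.
    public static string Declaration(List<string> data)
    {
        using (StringWriter writer = new StringWriter())
        {
            new XmlSerializer(typeof(List<string>)).Serialize(writer, data);
            string xml = writer.ToString();
            int end = xml.IndexOf("?>", StringComparison.Ordinal);
            return xml.Substring(0, end + 2);
        }
    }

    public static void Main()
    {
        // The encoding a plain StringWriter advertises to its callers.
        Console.WriteLine(new StringWriter().Encoding.WebName);

        // The declaration the serializer derives from that encoding.
        Console.WriteLine(Declaration(new List<string> { "January" }));
    }
}
```

The serializer never inspects the response; it only asks the writer it was given which encoding to name in the declaration.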

To work around this, you can write to a MemoryStream like this:

using (MemoryStream stream = new MemoryStream())
// Encoding.UTF8 emits a BOM; pass new UTF8Encoding(false) to omit it
using (StreamWriter writer = new StreamWriter(stream, Encoding.UTF8))
{
    XmlSerializer xml = new XmlSerializer(typeof(T));
    xml.Serialize(writer, Data);
    writer.Flush(); // make sure buffered characters reach the MemoryStream

    // I am not 100% sure if this can be optimized
    httpContextBase.Response.BinaryWrite(stream.ToArray());
}

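Other approaches

Another edit: I just noticed this SO answer linked by jtm001. In short, that solution gives the XmlSerializer a custom XmlWriter that is configured to use UTF-8 as its encoding.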
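A minimal sketch of the custom XmlWriter approach, assuming a console program. The names Utf8XmlDemo and SerializeToUtf8 are illustrative, and this is not necessarily the exact code from the linked answer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

public static class Utf8XmlDemo
{
    // Serializes data to UTF-8 bytes through an XmlWriter whose
    // settings carry the desired encoding.
    public static byte[] SerializeToUtf8<T>(T data)
    {
        XmlWriterSettings settings = new XmlWriterSettings
        {
            Encoding = new UTF8Encoding(false), // UTF-8 without a byte order mark
            Indent = true
        };

        using (MemoryStream stream = new MemoryStream())
        {
            using (XmlWriter writer = XmlWriter.Create(stream, settings))
            {
                new XmlSerializer(typeof(T)).Serialize(writer, data);
            } // disposing the writer flushes its output into the stream

            return stream.ToArray();
        }
    }

    public static void Main()
    {
        byte[] bytes = SerializeToUtf8(new List<string> { "January", "February" });
        Console.WriteLine(Encoding.UTF8.GetString(bytes));
    }
}
```

Because the result is a byte array, it could be passed to Response.BinaryWrite as in the MemoryStream example, so the declaration and the bytes on the wire agree.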

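Athari proposes deriving from StringWriter and advertising the encoding as UTF-8.

As far as I can tell, both solutions should work. I think the takeaway here is that you need one kind of boilerplate code or another...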

Answer 2 (score: 1)

Serialize to a UTF-8 string:

    private string Serialize(MyData data)
    {
        XmlSerializer ser = new XmlSerializer(typeof(MyData));
        // Using a MemoryStream to store the serialized string as a byte array,
        // which is "encoding-agnostic"
        using (MemoryStream ms = new MemoryStream())
        // Few options here, but remember to use a signature that allows you to
        // specify the encoding. UTF8Encoding(false) omits the BOM, which would
        // otherwise show up as a leading U+FEFF character in the returned string
        using (XmlTextWriter tw = new XmlTextWriter(ms, new UTF8Encoding(false)))
        {
            tw.Formatting = Formatting.Indented;
            ser.Serialize(tw, data);
            tw.Flush(); // push any buffered output into the MemoryStream
            // Now we get the serialized data as a string in the desired encoding
            return Encoding.UTF8.GetString(ms.ToArray());
        }
    }

To return it as XML in a web response, don't forget to set the response encoding:

    string xml = Serialize(data);
    Response.ContentType = "application/xml";
    Response.ContentEncoding = System.Text.Encoding.UTF8;
    Response.Output.Write(xml);