Question

I am using the following code to download this XML file:

private async static Task<string> DownloadPageAsync(string url)
{
    try
    {
        HttpClientHandler handler = new HttpClientHandler();
        handler.UseDefaultCredentials = true;
        handler.AllowAutoRedirect = true;
        handler.UseCookies = true;
        HttpClient client = new HttpClient(handler);
        client.MaxResponseContentBufferSize = 10000000;
        HttpResponseMessage response = await client.GetAsync(url);
        response.EnsureSuccessStatusCode();

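        // ReadAsStringAsync decodes using the charset from the Content-Type header,
        // falling back to UTF-8 when none is declared.
        string responseBody = await response.Content.ReadAsStringAsync();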
        return responseBody;
    }
    catch (Exception ex)
    {
        return "error" + ex.Message;
    }
}

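But the file I receive seems to have encoding problems. The document is not well-formed, but I also suspect that the page I am downloading is not UTF-8. How can I return a UTF-8 string? Thanks.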

Answer 1

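I would suggest using the HTML Agility Pack to download and parse the document for you. It detects the encoding automatically (when possible), so this should not be a problem for you.

If that is not an option, you need to know which encoding the document uses, and then use the Encoding class to convert it from the original encoding to UTF-8.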
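
A minimal sketch of the second approach, assuming you already know the source encoding (iso-8859-1 is used here only as an example). The class and method names are hypothetical. The idea is to fetch the raw bytes instead of a string, decode them with the known encoding, and only then re-encode as UTF-8 if you need bytes again.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

// Hypothetical helper, not part of any library.
public static class PageDownloader
{
    // Decodes raw bytes with the named source encoding. The result is a normal
    // .NET string, which is encoding-neutral once it is in memory.
    public static string Decode(byte[] raw, string encodingName)
    {
        return Encoding.GetEncoding(encodingName).GetString(raw);
    }

    // Re-encodes a string as UTF-8 bytes, e.g. before writing it to a file.
    public static byte[] ToUtf8(string text)
    {
        return Encoding.UTF8.GetBytes(text);
    }

    // Downloads the body as bytes so that no decoding happens implicitly,
    // then decodes with the encoding the caller says the page uses.
    public static async Task<string> DownloadAsync(string url, string encodingName)
    {
        using (var client = new HttpClient())
        {
            byte[] raw = await client.GetByteArrayAsync(url);
            return Decode(raw, encodingName);
        }
    }
}
```

For example, `await PageDownloader.DownloadAsync(url, "iso-8859-1")` would decode the response as Latin-1 regardless of what the server's headers claim. If the encoding name is wrong, the call still succeeds but the text comes out garbled, so this only helps when the encoding is really known.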

Answer 2

Your link is encoded as iso-8859-1.

Use

XmlDocument.Load(uriString)

or

XDocument.Load(uriString)
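
As a rough illustration of why this works: the XML loader reads the encoding from the XML declaration, so you do not have to decode the bytes yourself. The sketch below uses an in-memory stream with a made-up sample document instead of the real URL, and the class and method names are hypothetical. Note that `XDocument.Load` throws an `XmlException` on input that is not well-formed, which may matter for the file in the question.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;

// Hypothetical demo class, not part of any library.
public static class XmlEncodingDemo
{
    // Builds a small sample document whose bytes are really iso-8859-1
    // and whose XML declaration says so.
    public static byte[] SampleLatin1Xml()
    {
        string xml = "<?xml version=\"1.0\" encoding=\"iso-8859-1\"?><name>caf\u00E9</name>";
        return Encoding.GetEncoding("iso-8859-1").GetBytes(xml);
    }

    // Loads the XML from raw bytes; the reader picks up the encoding from
    // the declaration, so no explicit Encoding is passed here.
    public static string ReadRootValue(byte[] xmlBytes)
    {
        using (var stream = new MemoryStream(xmlBytes))
        {
            XDocument doc = XDocument.Load(stream);
            return doc.Root.Value;
        }
    }
}
```

With a real address you would call `XDocument.Load(uriString)` directly, as suggested above; the stream overload is used here only so the example runs without network access.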
