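I need to read an XML file from the web that is encoded as ISO-8859-1. After creating an XmlDocument from it, I tried converting some of the InnerText values to UTF-8, but that didn't work. I then tried changing the encoding on the HttpClient side instead. The response string now comes out correctly formatted, but when I create the XmlDocument the application crashes with an exception: HRESULT: 0xC00CE55F, or an unexpected character in the XML string. How can I fix this?
Code: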
    private static async Task<string> GetResultsAsync(string uri)
    {
        var client = new HttpClient();
        var response = await client.GetByteArrayAsync(uri);
        var responseString = Encoding.GetEncoding("iso-8859-1").GetString(response, 0, response.Length - 1);
        return responseString;
    }
    public static async Task GetPodcasts(string url)
    {
        var programs = await GetGroupAsync("prog");
        HttpClient client = new HttpClient();
        //Task<string> pedido = client.GetStringAsync(url);
        //string res = await pedido; // Gets the string with the wrong chars, but LoadXml doesn't fail
        string res = await GetResultsAsync(url); // Gets the string properly formatted
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(res); // Crashes here
        XmlElement root = doc.DocumentElement;
        XmlNodeList nodes = root.SelectNodes("//item");
        // Title
        var node_titles = root.SelectNodes("//item/title");
        IEnumerable<string> query_titles = from nodess in node_titles select nodess.InnerText;
        List<string> list_titles = query_titles.ToList();
        //........
        for (int i = 0; i < list_titles.Count; i++)
        {
            PodcastItem podcast = new PodcastItem();
            string title = list_titles[i];
            // First attempt: convert a field from the XmlDocument that has the wrong chars.
            // It only replaces the badly encoded characters with a '?':
            //Encoding iso = Encoding.GetEncoding("ISO-8859-1");
            //Encoding utf8 = Encoding.UTF8;
            //byte[] utfBytes = utf8.GetBytes(title);
            //byte[] isoBytes = Encoding.Convert(utf8, iso, utfBytes);
            //string msg = iso.GetString(isoBytes, 0, isoBytes.Length - 1);
            PodcastItem dataItem = new PodcastItem(title + pubdate, title, link, description, "", pubdate);
            programs.Items.Add(dataItem);
        }
    }
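Answer 0 (score: 1)
I'm not sure why you're fiddling with the encoding yourself, but the most likely reason for your crash is that you forgot to take the last byte of the array. The third argument of GetString is a byte count, not an end index, so response.Length - 1 drops the final byte and can cut off the end of the closing tag. This code works for me: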
    static async Task<string> LoadDecoded()
    {
        var client = new HttpClient();
        var response = await client.GetByteArrayAsync("http://www.rtp.pt/play/podcast/469");
        var responseString = Encoding
            .GetEncoding("iso-8859-1")
            .GetString(response, 0, response.Length); // no -1 here, we want all bytes!
        return responseString;
    }
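To see why one dropped byte is enough to break LoadXml, here is a minimal offline sketch. It assumes the desktop System.Xml.XmlDocument rather than the WinRT class, and the class name TruncationDemo, the helper TryLoad, and the sample XML are made up for illustration. It decodes the same ISO-8859-1 byte array twice, once with the full length and once with Length - 1:

```csharp
using System;
using System.Text;
using System.Xml;

static class TruncationDemo
{
    // Hypothetical helper: returns true if the string parses as XML.
    public static bool TryLoad(string xml)
    {
        try
        {
            new XmlDocument().LoadXml(xml);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    public static void Main()
    {
        var latin1 = Encoding.GetEncoding("iso-8859-1");
        // Stand-in for the bytes returned by GetByteArrayAsync.
        byte[] bytes = latin1.GetBytes("<rss><title>Programação</title></rss>");

        // The third argument is a count, so Length decodes every byte.
        string full = latin1.GetString(bytes, 0, bytes.Length);
        // Length - 1 drops the final '>' and leaves the root element unclosed.
        string truncated = latin1.GetString(bytes, 0, bytes.Length - 1);

        Console.WriteLine(TryLoad(full));      // True
        Console.WriteLine(TryLoad(truncated)); // False
    }
}
```

The accented characters decode correctly in both strings; only the missing final character makes the second one fail to parse.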
If I let HttpClient figure out the encoding, your code works for me too:
    static async Task<string> Load()
    {
        var hc = new HttpClient();
        string s = await hc.GetStringAsync("http://www.rtp.pt/play/podcast/469");
        return s;
    }
    static void Main(string[] args)
    {
        var xd = new XmlDocument();
        string res = Load().Result;
        xd.LoadXml(res);
        var node_titles = xd.DocumentElement.SelectNodes("//item/title");
        Console.WriteLine(node_titles.Count);
    }
If you're not on a mobile/WinRT platform, the XmlDocument.Load overload that accepts a stream works just as well:
    static async Task<Stream> LoadStream()
    {
        var hc = new HttpClient();
        var stream = await hc.GetStreamAsync("http://www.rtp.pt/play/podcast/469");
        return stream;
    }
    static void Main(string[] args)
    {
        var xd2 = new XmlDocument();
        xd2.Load(LoadStream().Result);
        var node_titles2 = xd2.DocumentElement.SelectNodes("//item/title");
        Console.WriteLine(node_titles2.Count);
    }
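Both versions print the number of title nodes in my console without throwing.
Are you sure you aren't changing the encoding somewhere else?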
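As general advice: the framework classes can handle most common encoding schemes. Try to get things working without reaching for the Encoding classes yourself.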
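As a rough illustration of leaving the decoding to the framework, the sketch below feeds raw ISO-8859-1 bytes to XmlDocument.Load through a stream and lets the parser pick the encoding from the XML declaration. It assumes the desktop System.Xml.XmlDocument and a document that actually declares its encoding; the class name DeclaredEncodingDemo, the helper ReadTitle, and the sample feed are hypothetical, with a MemoryStream standing in for the HTTP response stream:

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

static class DeclaredEncodingDemo
{
    // Hypothetical helper: parses raw bytes and returns the first item title.
    public static string ReadTitle(byte[] rawXml)
    {
        var doc = new XmlDocument();
        using (var stream = new MemoryStream(rawXml))
        {
            // No Encoding class involved: the parser reads the
            // encoding attribute of the XML declaration.
            doc.Load(stream);
        }
        return doc.SelectSingleNode("//item/title").InnerText;
    }

    public static void Main()
    {
        string xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>" +
                     "<rss><channel><item><title>Programação</title></item></channel></rss>";
        // Stand-in for the bytes the server would send.
        byte[] raw = Encoding.GetEncoding("iso-8859-1").GetBytes(xml);

        Console.WriteLine(ReadTitle(raw) == "Programação"); // True
    }
}
```

Encoding is used here only to build the test bytes; the parsing path itself never touches it.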