Question

I get the HTML like this:

using (var wb = new WebClient())
{
    data = soeArray;
    var response = wb.UploadValues(url, "POST", data);
    string result = System.Text.Encoding.UTF8.GetString(response);
}

But the response contains Unicode codes like &#347; (displayed as ś). Is there any way to convert them to the corresponding characters?

Answer 1

I think what you're looking for is System.Web.HttpUtility.HtmlDecode or, if this is not a web application, System.Net.WebUtility.HtmlDecode.

For example:

string result = System.Net.WebUtility.HtmlDecode(System.Text.Encoding.UTF8.GetString(response));

Answer 2

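This is not as simple as you might think. The codes you are getting back are decimal Unicode code points. For these, you would simply convert the code to hexadecimal and prefix it with \u.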

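int decCode = int.Parse(rawCode.Substring(2));   // digits after the leading "&#"
string hexCode = decCode.ToString("X");
// "\u" escapes are resolved only in source literals, so convert the hex digits at run time.
char c = (char)Convert.ToInt32(hexCode, 16);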

Easy, right? Wrong. Unicode characters in HTML can also be written as hexadecimal codes when the code is prefixed with &#x, as in &#xCODE (for example, &#x2014 for \u2014).

Simple enough: if the code has an 'x' in front of it, we just add logic to parse it as hexadecimal, right?

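string hexCode;
rawCode = rawCode.Substring(2);          // drop the leading "&#"
if (rawCode[0] == 'x') {
    hexCode = rawCode.Substring(1);      // the digits are already hexadecimal
} else {
    int decCode = int.Parse(rawCode);
    hexCode = decCode.ToString("X");
}
// "\u" escapes are resolved only in source literals, so convert the hex digits at run time.
char c = (char)Convert.ToInt32(hexCode, 16);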

You don't want to touch this code.

Leave it to the HTML decoder; this is all you need to do:

string s =  System.Net.WebUtility.HtmlDecode("&copy;"); // returns ©
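
To illustrate why the decoder is the easier route, here is a minimal sketch that runs the decimal, hexadecimal, and named forms through the same call. The class name EntityDemo and the sample strings are made up for this example; only System.Net.WebUtility.HtmlDecode comes from the answers above.

```csharp
using System;
using System.Net;

public static class EntityDemo
{
    // Thin wrapper (hypothetical name) around the framework's HTML decoder.
    public static string Decode(string html)
    {
        return WebUtility.HtmlDecode(html);
    }

    public static void Main()
    {
        // Decimal, hexadecimal, and named references, plus one mixed with plain text.
        string[] samples = { "&#347;", "&#x2014;", "&copy;", "caf&#233; &amp; bar" };
        foreach (string sample in samples)
        {
            Console.WriteLine(sample + " -> " + Decode(sample));
        }
    }
}
```

All three notations go through one call, so no hand-written branching on 'x' or lookup table for names like &copy; is needed.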

How to change Unicode codes to char
