Encoding error when getting values from a web page with HtmlAgilityPack

Time: 2015-03-31 17:41:26

Tags: c# encoding html-agility-pack

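I am new to programming and would greatly appreciate any help. I want to do a phone number lookup in C#. The content of the web page is in Greek, and I cannot get any text other than @#@#$ @ ### @ ########; only Latin characters display correctly. I have tried many encoding approaches, but none has worked so far. Thanks in advance for any suggestions.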

Code:

    string url = "http://www.11888.gr/list-names?_wpType=number&_wpPhone=2107255555";

    System.Net.WebClient wc = new System.Net.WebClient();
    string xml = wc.DownloadString(url);

    byte[] byteArray = wc.DownloadData(new Uri(url));
    Stream stream = new MemoryStream(byteArray);

    HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
    htmlDoc.Load(stream);

    var SpanNodes = htmlDoc.DocumentNode.SelectNodes("//div[@class='details']");

    if (SpanNodes != null)
    {
        foreach (HtmlNode SN in SpanNodes)
        {
            string text = SN.InnerText.Trim();

            richTextBox1.Text = text;
        }
    }

1 answer:

Answer 0: (score: 0)

You have to use UTF-8 encoding.

    System.Text.Encoding utf8 = System.Text.Encoding.UTF8;
    // A Unicode string:
    string unicodeStr = "étext";
    // Convert the string to UTF-8 bytes for transport.
    byte[] byteBuffUtf8 = utf8.GetBytes(unicodeStr);
    // Convert the UTF-8 bytes back to a display-ready string.
    string myUnicode = utf8.GetString(byteBuffUtf8);
    MessageBox.Show(myUnicode);

Here is the MSDN example:

https://msdn.microsoft.com/en-us/library/system.text.utf8encoding%28v=vs.110%29.aspx
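
To connect this to the question's code, here is a minimal sketch of how the choice of encoding affects bytes like those returned by `wc.DownloadData`. It assumes the page is actually served as UTF-8, which the question does not confirm. The class name `GreekDecodingSketch`, the `Decode` helper, and the sample HTML are hypothetical, and the byte array is built locally as a stand-in for a real download.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper class, not part of HtmlAgilityPack or the original post.
public static class GreekDecodingSketch
{
    // Decodes raw response bytes with an explicitly chosen encoding
    // instead of relying on whatever default the reader would pick.
    public static string Decode(byte[] rawBytes, Encoding encoding)
    {
        using (var stream = new MemoryStream(rawBytes))
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Stand-in for wc.DownloadData(url): Greek text encoded as UTF-8
        // (assumption: the real server sends UTF-8).
        byte[] rawBytes = Encoding.UTF8.GetBytes("<div class='details'>Αθήνα</div>");

        // Decoding the same bytes as ISO-8859-1 garbles the Greek letters.
        string wrong = Decode(rawBytes, Encoding.GetEncoding("iso-8859-1"));
        // Decoding as UTF-8 restores them.
        string right = Decode(rawBytes, Encoding.UTF8);

        Console.WriteLine(wrong.Contains("Αθήνα")); // False
        Console.WriteLine(right.Contains("Αθήνα")); // True
    }
}
```

In the question's code, the same idea would probably mean passing the encoding when loading the stream, for example `htmlDoc.Load(stream, System.Text.Encoding.UTF8)`, or setting `wc.Encoding` before calling `DownloadString`. If the site turns out to use a different charset, such as a Greek code page, that encoding would have to be substituted.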