Question

OK, I'm trying to extract the text between tags like these:

<font color="#00006b">Aa Megami-sama (OAV 2011)</font>

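I have a list of names that all sit inside the same kind of tag, and I want to grab them and put them into a dynamic array list.

I tried to do this with HtmlAgilityPack, but this is what happens when I run the program: [screenshot]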

Answer 1

The LoadHtml() method takes HTML as input, not a URL. You need to fetch the HTML yourself.

For example:

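        // Requires: using System; using System.Net; using HtmlAgilityPack;
        using (var webclient = new WebClient())
        {
            // Download the page source first; LoadHtml() expects markup, not a URL.
            var html = webclient.DownloadString("http://www.animenewsnetwork.com/encyclopedia/anime.php?list=A");

            HtmlDocument doc = new HtmlDocument();
            doc.LoadHtml(html);
            // Select the first <font> element in the document.
            var node = doc.DocumentNode.SelectSingleNode("//font");
            Console.WriteLine(node.InnerText);
            Console.ReadKey();
        }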
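
The snippet above prints only the first match, while the question asks for every name in a list. A minimal sketch of that step, assuming all the names really are in `<font>` elements: `FontTextExtractor` and `ExtractFontTexts` are hypothetical names chosen for illustration, and the `//font` XPath may need narrowing (for example by the `color` attribute) if the page uses `<font>` for other text as well.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

static class FontTextExtractor
{
    // Hypothetical helper: returns the trimmed text of every <font> element.
    public static List<string> ExtractFontTexts(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var names = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//font");
        if (nodes == null)
        {
            return names;
        }

        foreach (HtmlNode node in nodes)
        {
            // DeEntitize decodes entities such as &amp; in the element text.
            names.Add(HtmlEntity.DeEntitize(node.InnerText).Trim());
        }
        return names;
    }
}
```

Calling `FontTextExtractor.ExtractFontTexts(html)` with the downloaded string would then give a `List<string>` that can be iterated or bound to a UI control.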

Answer 2

Your SelectSingleNode() call is returning null, so you need a null check before the last line. Like this:

if(node != null)
{
   MessageBox.Show(node.InnerText);
}

Answer 3

First, download the HTML and pass it to the LoadHtml method, like this:

var webClient = new WebClient();
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(webClient.DownloadString(@"http://www.animenewsnetwork.com/encyclopedia/anime.php?list=A"));

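This may not work as-is, because the page may declare an invalid charset in its metadata, as explained here. In that case you can use the answer given there, where the workaround is to read the response manually (HttpWebRequest and HttpWebResponse).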
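
The linked answer is not reproduced here, so the following is only a sketch of what reading the response manually could look like. `PageDownloader`, `DownloadHtml`, and `DecodeStream` are hypothetical names, and defaulting to UTF-8 is an assumption; pass whatever encoding the page actually uses. Note that `HttpWebRequest` is marked obsolete on recent .NET versions, where `HttpClient` is the usual replacement.

```csharp
using System.IO;
using System.Net;
using System.Text;

static class PageDownloader
{
    // Hypothetical helper: decodes a stream with an explicitly chosen encoding
    // instead of trusting the charset the page declares.
    public static string DecodeStream(Stream stream, Encoding encoding)
    {
        using (var reader = new StreamReader(stream, encoding ?? Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    // Hypothetical helper: fetches the page and decodes the body manually.
    public static string DownloadHtml(string url, Encoding encoding = null)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var stream = response.GetResponseStream())
        {
            return DecodeStream(stream, encoding);
        }
    }
}
```

The returned string can be passed to `doc.LoadHtml(...)` in place of the `DownloadString` result.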

Next, you may want to detect and handle any other parse errors (including the one above), if there are any, as explained here:

    // ParseErrors is an IEnumerable<HtmlParseError>, so Any() requires: using System.Linq;
    if (doc.ParseErrors != null && doc.ParseErrors.Any())
    {
        // Handle any parse errors as required
    }
    else
    {
        if (doc.DocumentNode != null)
        {
            HtmlNode fontNode = doc.DocumentNode.SelectSingleNode("//font");
            if (fontNode != null)
            {
                // Do something with fontNode
                MessageBox.Show(fontNode.InnerText);
            }
        }
    }

C#: grabbing text inside HTML tags
