Question

I used the code example from the following video: https://youtu.be/8e3Wklc1H_A

The code looks like this:

var webGet = new HtmlWeb();
var doc = webGet.Load("http://pastebin.com/raw.php?i=gF0DG08s");

HtmlNode OurNone = doc.DocumentNode.SelectSingleNode("//div[@id='footertext']");

if (OurNone != null)
    richTextBox1.Text = OurNone.InnerHtml;
else
    richTextBox1.Text = "nothing found";

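At first I thought the original website (www.fuchsonline.com) might have gone offline, so I quickly wrote an HTML page containing only a footer and pasted it on Pastebin (the link in the code above):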

<html>
<body>

<div id="footertext">
                 <p>
                     Copyright &copy; FUCHS Online Ltd, 2013. All Rights Reserved.
                 </p>
</div>

</body>
</html>

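When I use the Pastebin link in the code, the program always writes "nothing found" to the richTextBox. However, the website used in the video is still online, so I tried that site in webGet and, voilà, it works.

Now I would like to know what exactly is wrong with the code. Is the HTML missing something, or does the program only work with complete websites? If so, what makes a website complete?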

Answer 1

Here is a simpler approach:

WebClient webClient = new WebClient();
string htmlCode = webClient.DownloadString("http://pastebin.com/raw.php?i=gF0DG08s");

HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(htmlCode);

HtmlNode OurNone = doc.DocumentNode.SelectSingleNode("//div[@id='footertext']");

if (OurNone != null)
    richTextBox1.Text = OurNone.InnerHtml;
else
    richTextBox1.Text = "nothing found";

Answer 2

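In this case, the Pastebin page just holds your raw HTML as a plain string, which is why the lookup comes back empty. If you really want to parse it with HTML Agility Pack, you can download the page first, get the raw HTML, and then parse it into the Agility Pack's document model.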

// Download the page source manually instead of relying on HtmlWeb.Load.
WebRequest webRequest = WebRequest.Create("http://pastebin.com/raw.php?i=gF0DG08s");
webRequest.Method = "GET";
string pageSource;
using (StreamReader reader = new StreamReader(webRequest.GetResponse().GetResponseStream()))
{
    pageSource = reader.ReadToEnd();

    // Fully qualified to avoid a clash with System.Windows.Forms.HtmlDocument.
    HtmlAgilityPack.HtmlDocument html = new HtmlAgilityPack.HtmlDocument();
    html.LoadHtml(pageSource);

    HtmlNode OurNone = html.DocumentNode.SelectSingleNode("//div[@id='footertext']");
    if (OurNone != null)
    {
        richTextBox1.Text = OurNone.InnerHtml;
    }
    else
    {
        richTextBox1.Text = "nothing found";
    }
}
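
To check whether the HTML itself is missing anything, you could also skip the network entirely and feed the markup from the question straight into LoadHtml. The following is only a minimal sketch, assuming the HtmlAgilityPack package is referenced; the class name FooterDemo and the helper ExtractFooter are made up for illustration and are not part of the library or the original code.

```csharp
using System;
using HtmlAgilityPack;

public static class FooterDemo
{
    // Hypothetical helper (not from the original post): returns the inner HTML
    // of the div with id "footertext", or null when no such node exists.
    public static string ExtractFooter(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectSingleNode returns null when the XPath matches nothing.
        HtmlNode node = doc.DocumentNode.SelectSingleNode("//div[@id='footertext']");
        return node?.InnerHtml;
    }

    public static void Main()
    {
        // Same markup as the Pastebin paste in the question, inlined as a string.
        const string html =
            "<html><body><div id=\"footertext\"><p>Copyright &copy; FUCHS Online Ltd, 2013. " +
            "All Rights Reserved.</p></div></body></html>";

        Console.WriteLine(ExtractFooter(html) ?? "nothing found");
    }
}
```

If this finds the node, the markup is not the problem, and the difference lies in how the page is fetched rather than in how "complete" the HTML is.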

Cannot find node using HTMLAgilityPack
