Question

I'm trying to use HTML Agility Pack to get the description text out of the following:

<meta name="description" content="**this is the text i want to extract and store in a string**" />

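A while ago someone on Stack Overflow suggested I use HtmlAgilityPack. But I don't know how to use it, and the documentation I've found (including the docs bundled with the download) has broken links, so I can't read it.

Can anyone help me with this?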

Answer 1

Usage is very similar to XmlDocument; the MSDN documentation on XmlDocument gives a broad overview. You may also want to learn XPath syntax (also covered on MSDN).

Example:

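// requires: using HtmlAgilityPack;
HtmlDocument doc = new HtmlDocument();
doc.Load(path); // or .LoadHtml(html);
// XPath: any <meta> element whose name attribute equals "description"
HtmlNode node = doc.DocumentNode.SelectSingleNode("//meta[@name='description']");
if (node != null) { // SelectSingleNode returns null when nothing matches
    string desc = node.GetAttributeValue("content", "");
    // TODO: write desc somewhere
}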

The second argument to GetAttributeValue is the default value returned when the attribute is not found.
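To make the default-value behaviour concrete, here is a minimal sketch that parses an inline HTML string instead of a file. It assumes the HtmlAgilityPack NuGet package is referenced; the `MetaDemo` class, the `GetMetaContent` helper and the `"(none)"` fallback are hypothetical names made up for this illustration, not part of the library.

```csharp
using System;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack package is referenced

public static class MetaDemo
{
    // Hypothetical helper: returns the content attribute of <meta name="...">,
    // or the fallback when the element or the attribute is missing.
    public static string GetMetaContent(string html, string name, string fallback)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from a string rather than from a file

        // The name is spliced into the XPath as-is, so it must not contain a single quote.
        HtmlNode node = doc.DocumentNode.SelectSingleNode("//meta[@name='" + name + "']");

        // No matching element: SelectSingleNode returns null.
        // Element without the attribute: GetAttributeValue returns the fallback.
        return node == null ? fallback : node.GetAttributeValue("content", fallback);
    }

    public static void Main()
    {
        string html =
            "<html><head>" +
            "<meta name=\"description\" content=\"hello\" />" +
            "<meta name=\"keywords\" />" +
            "</head><body></body></html>";

        Console.WriteLine(GetMetaContent(html, "description", "(none)")); // hello
        Console.WriteLine(GetMetaContent(html, "keywords", "(none)"));    // (none)
        Console.WriteLine(GetMetaContent(html, "author", "(none)"));      // (none)
    }
}
```

Passing an empty string as the default, as in the example above, makes a missing attribute indistinguishable from an empty one; a distinct fallback value is one way to tell the two cases apart.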

Answer 2

public string HtmlAgi(string url, string key)
{
    // HtmlWeb downloads the page at the given URL and parses it into an HtmlDocument.
    var Webget = new HtmlWeb();
    var doc = Webget.Load(url);
    HtmlNode ourNode = doc.DocumentNode.SelectSingleNode(string.Format("//meta[@name='{0}']", key));

    if (ourNode != null)
    {
        return ourNode.GetAttributeValue("content", "");
    }
    else
    {
        return "not found";
    }
}
