Saving the tree to a variable with HtmlAgilityPack

Time: 2013-03-23 14:39:59

Tags: c# html-agility-pack

I am new to C#. In Python, the parsing library BeautifulSoup has something called contents. I am looking for a similar feature in HtmlAgilityPack. How can I do this with HtmlAgilityPack?

1 answer:

Answer 0: (score: 0)

OK, first get the document root node, which contains all of the content:

//create a new document
var _htmlDoc = new HtmlAgilityPack.HtmlDocument();

//fill it with html, either from a file...
_htmlDoc.Load(filePath);
//...or from a string: _htmlDoc.LoadHtml(string...);

//get the document root node - it has all the contents
var documentNode = _htmlDoc.DocumentNode;
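
For a closer match to BeautifulSoup's contents, which lists the direct children of a tag, the ChildNodes property of an HtmlNode is probably the nearest equivalent. The following is a minimal sketch, not a definitive implementation: the Contents helper, the class name, and the sample markup are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

static class ContentsDemo
{
    // Hypothetical helper: returns the direct children of the first node
    // matching the XPath expression, roughly like tag.contents in BeautifulSoup.
    public static IList<HtmlNode> Contents(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectSingleNode returns null when nothing matches.
        HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);
        return node == null ? new List<HtmlNode>() : node.ChildNodes.ToList();
    }

    static void Main()
    {
        // Sample markup, assumed for this example only.
        var html = "<div id=\"box\"><b>first</b>some text<i>second</i></div>";

        foreach (HtmlNode child in Contents(html, "//div[@id='box']"))
        {
            // NodeType tells elements apart from text and comment nodes.
            Console.WriteLine($"{child.NodeType}: {child.OuterHtml}");
        }
    }
}
```

Note that ChildNodes includes text nodes as well as elements, so filter on NodeType if only the element children are wanted.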

Then... query the nodes using LINQ or XPath:

string xpathExpressionString = "//p";
var contents = _htmlDoc.DocumentNode.SelectNodes(xpathExpressionString);
//this would get paragraph tag nodes
//note: SelectNodes returns null (not an empty collection) when nothing matches
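
The answer mentions LINQ as an alternative to XPath but only shows the XPath form. Below is a rough sketch of both routes side by side, assuming a small sample document. The ParagraphQueries class and its method names are hypothetical, chosen only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

static class ParagraphQueries
{
    // XPath route: SelectNodes returns null when nothing matches, so guard for it.
    public static List<string> ViaXPath(HtmlDocument doc)
    {
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//p");
        return nodes == null
            ? new List<string>()
            : nodes.Select(n => n.InnerText).ToList();
    }

    // LINQ route: Descendants yields an empty sequence when nothing matches.
    public static List<string> ViaLinq(HtmlDocument doc)
    {
        return doc.DocumentNode.Descendants("p")
                  .Select(n => n.InnerText)
                  .ToList();
    }
}

class Program
{
    static void Main()
    {
        var doc = new HtmlDocument();
        // Sample markup, assumed for this example only.
        doc.LoadHtml("<html><body><p>one</p><p>two</p></body></html>");

        Console.WriteLine(string.Join(", ", ParagraphQueries.ViaXPath(doc)));
        Console.WriteLine(string.Join(", ", ParagraphQueries.ViaLinq(doc)));
    }
}
```

The LINQ form avoids the null check, which can make it a little safer when the document may contain no matching tags.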