How can I get the content of one or more divs with class "in" using C#?
I have the following HTML code:
<!DOCTYPE html>
<html lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="utf-8" />
<title></title>
</head>
<body>
<div id="xxx">
<div class="in">
<a href="/a/show/7184569" class="mm">ВАЗ 2121</a> <span class="for">за</span>
<span class="price">2 700 $</span>
<br />
<span class="year">1990 г.</span><br />
<div style="margin: 3px 0 3px 0">contentxxx</div>
</div>
</div>
</body>
</html>
I want to get the div with class="in", including its contents. The expected result is:
<div class="in">
<a href="/a/show/7184569" class="mm">ВАЗ 2121</a> <span class="for">за</span>
<span class="price">2 700 $</span>
<br />
<span class="year">1990 г.</span><br />
<div style="margin: 3px 0 3px 0">contentxxx</div>
</div>
Answer 0 (score: 2)
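using HtmlAgilityPack;
static void Parse()
{
    // The markup is parsed from a string, so HtmlWeb (used for downloading pages) is not needed.
    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(getHTML());
    // SelectNodes returns null when nothing matches the XPath.
    HtmlNodeCollection nodeCol = doc.DocumentNode.SelectNodes("//div[@class=\"in\"]");
    if (nodeCol == null) return;
    // OuterHtml includes the <div class="in"> wrapper shown in the question;
    // InnerHtml would return only its children.
    string value = nodeCol[0].OuterHtml;
}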
static string getHTML()
{
    string retVal = @"<!DOCTYPE html>"
        + "<html lang=\"en\" xmlns=\"http://www.w3.org/1999/xhtml\">"
        + "<head>"
        + "<meta charset=\"utf-8\" />"
        + "<title></title>"
        + "</head>"
        + "<body>"
        + "<div id=\"xxx\">"
        + "<div class=\"in\">"
        + "<a href=\"/a/show/7184569\" class=\"mm\">ВАЗ 2121</a> <span class=\"for\">за</span>"
        + "<span class=\"price\">2 700 $</span>"
        + "<br />"
        + "<span class=\"year\">1990 г.</span><br />"
        + "<div style=\"margin: 3px 0 3px 0\">contentxxx</div>"
        + "</div>"
        + "</div>"
        + "</body>"
        + "</html>";
    return retVal;
}
Remember to add the HtmlAgilityPack namespace. Reference: http://htmlagilitypack.codeplex.com/releases/view/90925
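Since the question asks about one or more matching divs, here is a minimal sketch that collects every match instead of only the first. It assumes the HtmlAgilityPack package is referenced; the class `DivExtractor` and method `GetInDivs` are hypothetical names used only for illustration. As far as I know, `SelectNodes` returns null rather than an empty collection when nothing matches, so the sketch checks for that.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

static class DivExtractor
{
    // Hypothetical helper: returns the outer HTML of every <div class="in">
    // found in the given markup, or an empty list when there is none.
    public static List<string> GetInDivs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var results = new List<string>();
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//div[@class='in']");
        if (nodes == null)
        {
            // No match: SelectNodes gave us null, so return the empty list.
            return results;
        }

        foreach (HtmlNode node in nodes)
        {
            // OuterHtml keeps the wrapping <div class="in">...</div> element.
            results.Add(node.OuterHtml);
        }
        return results;
    }
}
```

Calling `DivExtractor.GetInDivs(getHTML())` with the markup from the question should give a list with a single entry, the whole `<div class="in">` block.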
Answer 1 (score: 0)
You can do this easily with the HTML Agility Pack:
using HtmlAgilityPack;
...
var doc = new HtmlDocument();
doc.Load(@"C:\file.htm"); // See the overloads. You can also use the `LoadHtml` method.
var node = doc.DocumentNode.SelectSingleNode("//div[@class='in']");
// This is the text you are looking for...
var result = node.OuterHtml;
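One caveat with `@class='in'`: it compares the whole attribute value, so a div written as `class="box in"` would not match. A common XPath 1.0 workaround is to pad the normalized class list with spaces and test for the padded token. The sketch below assumes HtmlAgilityPack is referenced and that the class name contains no quotes or whitespace; `ClassTokenMatcher` and `FindDivByClassToken` are hypothetical names.

```csharp
using HtmlAgilityPack;

static class ClassTokenMatcher
{
    // Hypothetical helper: returns the outer HTML of the first <div> whose class
    // attribute contains `className` as a whole token, or null if none matches.
    // Assumption: className contains no quotes or whitespace.
    public static string FindDivByClassToken(string html, string className)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Padding with spaces makes " in " match "box in" but not "inner".
        string xpath = "//div[contains(concat(' ', normalize-space(@class), ' '), ' "
                       + className + " ')]";

        HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);
        return node == null ? null : node.OuterHtml;
    }
}
```

For the HTML in the question, where the attribute is exactly `class="in"`, the simpler `@class='in'` test is enough; the token form only matters when elements carry several classes.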
Answer 2 (score: -2)
Use jQuery to get the div's content:
<script type="text/javascript">
// .html() returns the inner HTML of the first matched element.
var d = $('div.in').html();
</script>
The code above gets the content (inner HTML) of the div that has the class in.