I'm using HtmlAgilityPack to scrape a table from a website.
How can I modify this so that, for each row, it returns the value of the second column?
public static void SearchAnimal(string param)
{
    string prm = param;
    string url = "http://xxx/xxx.action?name=";
    //HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url+prm);
    //HttpWebResponse response = (HttpWebResponse)request.GetResponse();
    //StreamReader stream = new StreamReader(response.GetResponseStream());
    //string final_response = stream.ReadToEnd();
    var webGet = new HtmlWeb();
    var doc = webGet.Load(url + prm);
    // This XPath selects every <td> in every row, not just one column.
    HtmlNodeCollection tr = doc.DocumentNode.SelectNodes("//table[@id='animal']//tbody//tr//td");
    // Use < rather than <= so the index never runs past the last node.
    for (int i = 0; i < tr.Count; ++i)
    {
        // Index into the collection: Descendants() is defined on a single node.
        var link = tr[i]
            .Descendants("a")
            .First(x => x.Attributes["href"] != null);
        string hrefValue = link.Attributes["href"].Value;
        string name = link.InnerHtml;
        // The key is the run of digits at the end of the href.
        Match match = Regex.Match(hrefValue, @"(\d+)$");
        Console.ForegroundColor = ConsoleColor.DarkGray;
        Console.WriteLine("Result " + i + ":");
        Console.ForegroundColor = ConsoleColor.Gray;
        Console.WriteLine("Animal Name: " + name);
        Console.WriteLine("Animal Key: " + match.Value);
        Console.WriteLine("-------------------------");
        Console.WriteLine("");
    }
}
Answer 0 (score: 1)
You can use an XPath position filter to get only the second <td> child of each <tr>:
//table[@id='animal']//tbody//tr/td[2]
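This is actually equivalent to the CSS :nth-of-type() selector. It gives the same result as :nth-child() only when all the child nodes are of the same type (in this case, all the children are <td>).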
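
Here is a rough sketch of how that XPath might slot into the loop from the question. It is not a drop-in replacement: the sample HTML, the class name SecondColumnDemo and the method ExtractSecondColumn are all made up for illustration, and it assumes the real page has an explicit <tbody> in its markup (as far as I know, HtmlAgilityPack does not insert one the way browsers do, so //tbody matches nothing if the source lacks it).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using HtmlAgilityPack;

public static class SecondColumnDemo
{
    // Hypothetical markup standing in for the real page.
    public const string SampleHtml = @"
<table id='animal'>
  <tbody>
    <tr><td>1</td><td><a href='/animal/view?id=101'>Lion</a></td></tr>
    <tr><td>2</td><td><a href='/animal/view?id=202'>Zebra</a></td></tr>
  </tbody>
</table>";

    // Returns (name, key) for the link found in the second <td> of each row.
    public static List<(string Name, string Key)> ExtractSecondColumn(string html)
    {
        var results = new List<(string Name, string Key)>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // td[2] keeps only the second <td> child of each <tr>.
        var cells = doc.DocumentNode.SelectNodes("//table[@id='animal']//tbody//tr/td[2]");
        if (cells == null)
        {
            // SelectNodes returns null when nothing matches.
            return results;
        }

        foreach (var cell in cells)
        {
            var link = cell.Descendants("a")
                .FirstOrDefault(a => a.Attributes["href"] != null);
            if (link == null)
            {
                continue; // skip cells without a usable link
            }

            string href = link.Attributes["href"].Value;
            // The key is the run of digits at the end of the href.
            Match match = Regex.Match(href, @"(\d+)$");
            results.Add((link.InnerText, match.Value));
        }
        return results;
    }

    public static void Main()
    {
        foreach (var (name, key) in ExtractSecondColumn(SampleHtml))
        {
            Console.WriteLine("Animal Name: " + name);
            Console.WriteLine("Animal Key: " + key);
        }
    }
}
```

Because the XPath already narrows the selection to one cell per row, the loop body no longer has to cope with cells from other columns that contain no link. The FirstOrDefault plus null check is only there as a guard in case a row's second cell happens to have no anchor.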