Unable to select a table inside a div using Html Agility Pack

Time: 2017-06-15 10:39:19

Tags: html c#-4.0 html-agility-pack

I am trying to use HtmlAgilityPack to scrape data from a table on a web page. Below is the relevant part of the HTML:



<div id="table-matches" style="display: block;"><table class=" table-main"><colgroup><col width="50"><col width="*"><col width="50"><col width="50"><col width="50"><col width="50"><col width="50"></colgroup><tbody><tr class="dark center" xtid="28575"><th class="first2 tl" colspan="3"><a class="bfl" href="/hockey/usa/"><span class="ficon f-200">&nbsp;</span>USA</a><span class="bflp">»</span><a href="/hockey/usa/echl/">ECHL</a></th><th>1</th><th>X</th><th>2</th><th xparam="Number of available bookmakers odds~2">B's</th></tr><tr class="odd deactivate" xeid="pn36Jn1f"><td class="table-time datet t1496703900-1-1-0-0 ">04:35</td><td class="name table-participant"><a href="/hockey/usa/echl/south-carolina-stingrays-colorado-eagles-pn36Jn1f/">South Carolina Stingrays - <span class="bold">Colorado Eagles</span></a><span class="ico-event-info" onmouseover="toolTip('Colorado Eagles wins series 4-0. 4th leg.', this, event, '4');allowHideTootip(false);delayHideTip(200);return false;" onmouseout="allowHideTootip(true);delayHideTip(200);">&nbsp;</span></td><td class="center bold table-odds table-score">1:2</td><td class="odds-nowrp" xodd="1.91" xoid="E-2nrdfxv464x0x6av8v"><a href="" onclick="globals.ch.togle(this , 'E-2nrdfxv464x0x6av8v');return false;" xparam="odds_text">1.91</a></td><td class="odds-nowrp" xodd="4.74" xoid="E-2nrdfxv498x0x0"><a href="" onclick="globals.ch.togle(this , 'E-2nrdfxv498x0x0');return false;" xparam="odds_text">4.74</a></td><td class="odds-nowrp result-ok" xodd="2.79" xoid="E-2nrdfxv464x0x6av90"><a href="" onclick="globals.ch.togle(this , 'E-2nrdfxv464x0x6av90');return false;" xparam="odds_text">2.79</a></td><td class="center info-value">1</td></tr><tr class="dark center" xtid="28308"><th class="first2 tl" colspan="3"><a class="bfl" href="/hockey/usa/"><span class="ficon f-200">&nbsp;</span>USA</a><span class="bflp">»</span><a href="/hockey/usa/nhl/">NHL</a></th><th>1</th><th>X</th><th>2</th><th xparam="Number of available bookmakers 
odds~2">B's</th></tr><tr class="odd deactivate" xeid="EyxiHGE4"><td class="table-time datet t1496707200-1-1-0-0 ">05:30</td><td class="name table-participant"><a href="/hockey/usa/nhl/nashville-predators-pittsburgh-penguins-EyxiHGE4/"><span class="bold">Nashville Predators</span> - Pittsburgh Penguins</a><span class="ico-event-info" onmouseover="toolTip('Series tied 2-2. 4th leg.', this, event, '4');allowHideTootip(false);delayHideTip(200);return false;" onmouseout="allowHideTootip(true);delayHideTip(200);">&nbsp;</span></td><td class="center bold table-odds table-score">4:1</td><td class="odds-nowrp result-ok" xodd="2.15" xoid="E-2ns9hxv464x0x6b2jp"><a href="" onclick="globals.ch.togle(this , 'E-2ns9hxv464x0x6b2jp');return false;" xparam="odds_text">2.15</a></td><td class="odds-nowrp" xodd="3.86" xoid="E-2ns9hxv498x0x0"><a href="" onclick="globals.ch.togle(this , 'E-2ns9hxv498x0x0');return false;" xparam="odds_text">3.86</a></td><td class="odds-nowrp" xodd="2.91" xoid="E-2ns9hxv464x0x6b2jq"><a href="" onclick="globals.ch.togle(this , 'E-2ns9hxv464x0x6b2jq');return false;" xparam="odds_text">2.91</a></td><td class="center info-value">55</td></tr></tbody></table></div>

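I have tried various combinations of attributes to reach the data in the table, but all I ever get back is the initial node for the containing div. This is the code I am using: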

  var html = @urlOddsportal;
        HtmlWeb web = new HtmlWeb();
        var htmlDoc = web.Load(html);

        //var html = new HtmlAgilityPack.HtmlDocument();
        //html.LoadHtml(new WebClient().DownloadString(urlOddsportal)); // load a string

        var root = htmlDoc.DocumentNode;



       var node = root.SelectSingleNode("//div[@id='table-matches']");  //this returns non null 

        // all of the below functions return null value

       var rows = node.SelectNodes(".//tr[@class='odd deactivate']");
       var table = root.SelectSingleNode("//table[@class=' table-main']");
       var tablerows = node.SelectNodes(".//table/tbody/tr[1]");//   [@class='odd deactivate']");

       var tabletag = htmlDoc.DocumentNode.SelectNodes("//table[@class='table-main']");

Can someone tell me where I am going wrong? Thanks.

1 Answer:

Answer 0: (score: 0)

Does the following return a non-null value for table?

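// HasProperty splits the class attribute on spaces, so pass the bare token "table-main" (no leading space).
var table = htmlDoc.DocumentNode.Descendants("table").FirstOrDefault(_ => _.HasProperty("class", "table-main"));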

HasProperty is defined as:

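// Extension methods must live in a static class; needs using System.Linq and using HtmlAgilityPack.
public static class HtmlNodeExtensions
{
    // True when the attribute's space-separated values include every entry in valueArray.
    public static bool HasProperty(this HtmlNode node, string property, params string[] valueArray)
    {
        var propertyValue = node.GetAttributeValue(property, "");
        var propertyValues = propertyValue.Split(' ');
        return valueArray.All(c => propertyValues.Contains(c));
    }
}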
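To try the helper without hitting the network, here is a minimal sketch that runs it against an in-memory snippet. The sample markup is a hypothetical, trimmed-down copy of the HTML in the question, and the names HasPropertyDemo, FindTable and FindRows are made up for illustration. Because HasProperty splits the attribute on spaces, the values passed in should be bare class tokens such as "table-main", without the leading space that appears in the page source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class HtmlNodeExtensions
{
    // True when the attribute's space-separated values include every entry in valueArray.
    public static bool HasProperty(this HtmlNode node, string property, params string[] valueArray)
    {
        var propertyValues = node.GetAttributeValue(property, "").Split(' ');
        return valueArray.All(c => propertyValues.Contains(c));
    }
}

public static class HasPropertyDemo
{
    // Hypothetical sample: a cut-down version of the markup shown in the question.
    public const string Sample =
        "<div id=\"table-matches\"><table class=\" table-main\"><tbody>" +
        "<tr class=\"dark center\"><th>1</th><th>X</th><th>2</th></tr>" +
        "<tr class=\"odd deactivate\"><td class=\"table-time\">04:35</td>" +
        "<td class=\"center bold table-odds table-score\">1:2</td></tr>" +
        "</tbody></table></div>";

    // First <table> whose class list contains the token "table-main", or null.
    public static HtmlNode FindTable(HtmlDocument doc) =>
        doc.DocumentNode.Descendants("table")
           .FirstOrDefault(n => n.HasProperty("class", "table-main"));

    // All <tr> elements under the table that carry both "odd" and "deactivate".
    public static List<HtmlNode> FindRows(HtmlNode table) =>
        table.Descendants("tr")
             .Where(n => n.HasProperty("class", "odd", "deactivate"))
             .ToList();

    public static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(Sample);

        var table = FindTable(doc);
        if (table == null)
        {
            Console.WriteLine("table not found");
            return;
        }

        foreach (var row in FindRows(table))
        {
            // Print the text of each cell in the matched row.
            Console.WriteLine(string.Join(" | ", row.Descendants("td").Select(td => td.InnerText)));
        }
    }
}
```

With this sample the table should be found and a single data row matched. If the same calls return null against the live page, that suggests the table is not present in the HTML that HtmlWeb downloaded.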

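If that works, you can try the same approach on the other nodes that are returning null.

I prefer this approach because it is easier to read than XPath expressions.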
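For readers who would rather stay with XPath, the same token-based match can be written with standard XPath 1.0 functions. An exact comparison such as @class='table-main' does not match an attribute whose value is " table-main", and @class='odd deactivate' only matches when the attribute is exactly that string. The sketch below is one possible way to match a single class token instead. The names XPathClassDemo, ClassToken and SelectRows are hypothetical, and it assumes the table is actually present in the loaded document. If the div is filled in by script after the page loads, which the question does not confirm, no selector will find the rows.

```csharp
using System;
using HtmlAgilityPack;

public static class XPathClassDemo
{
    // Builds an XPath predicate that matches one whitespace-separated class token.
    public static string ClassToken(string token) =>
        "contains(concat(' ', normalize-space(@class), ' '), ' " + token + " ')";

    // Selects the data rows under the table-matches div.
    // SelectNodes returns null (not an empty collection) when nothing matches.
    public static HtmlNodeCollection SelectRows(HtmlDocument doc)
    {
        var xpath = "//div[@id='table-matches']//table[" + ClassToken("table-main") + "]" +
                    "//tr[" + ClassToken("deactivate") + "]";
        return doc.DocumentNode.SelectNodes(xpath);
    }

    public static void Main()
    {
        // Hypothetical sample: a cut-down version of the markup shown in the question.
        const string sample =
            "<div id=\"table-matches\"><table class=\" table-main\"><tbody>" +
            "<tr class=\"dark center\"><th>1</th></tr>" +
            "<tr class=\"odd deactivate\"><td class=\"table-time\">05:30</td></tr>" +
            "</tbody></table></div>";

        var doc = new HtmlDocument();
        doc.LoadHtml(sample);

        var rows = SelectRows(doc);
        if (rows == null)
        {
            Console.WriteLine("no rows matched");
            return;
        }

        foreach (var row in rows)
        {
            Console.WriteLine(row.InnerText);
        }
    }
}
```

Always check the result of SelectNodes for null before iterating, since that is how Html Agility Pack signals an empty match.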