C# HtmlAgilityPack XPath problem: cannot find the H4 InnerText

Asked: 2014-04-07 04:26:45

Tags: c# html dom xpath

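I have a method that finds everything I am looking for in one section of a web page, but I am having trouble finding the H4 inside each node. The XPath //div[@class='job '] correctly finds all 8 occurrences I am looking for. The problem appears once I loop over those 8 occurrences.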

Below is the HTML output for the node I am looking at.

<div class="job_art ">
<div style="background: #444      url('https://a.akamaihd.net/mwfb/mwfb/graphics/jobs/chicago/meet_with_the_south_gang_family_    760x225_01.jpg') 50% 0 no-repeat;">
</div>
</div>
<div class="job_details clearfix">
<h4>Meet With the South Gang Family</h4>
<div class="mastery_bar" title="Indicates how much of this Job you&#39;ve mastered.      Master Jobs to earn Skill Points."><div style="width: 0%" class="noHighlight"></div><p>100%     Mastered</p><div style="width: 0%"><p>100% Mastered</p></div></div><ul class="uses clearfix"     style="width:100px;"><li class="energy" base_value="2" current_value="2" title="Spend 2     Energy to do this Job once.">2</li></ul><ul class="pays clearfix" style="width:120px"     title="Earn XP, City Cash and Loot items while doing Jobs."><li class="experience" base_value="2" current_value="2">2</li><li class="cash_icon_jobs_8" base_value="2" current_value="2">2</li></ul><a id='btn_dojob_1' class='sexy_button_new sexy_energy_new medium orange impulse_buy' selector='#inner_page' requirements='{"energy":2}' precall='BrazilJobs.preDoJob' callback='BrazilJobs.doJob' href='remote/h.php?job=1&tab=1&clkdiv=btn_dojob_1'><span><span>Do Job</span></span></a></div><div class="job_additional_results"><div id="loot-bandit-1" class="lootContainer"></div><div class="previous_loot"></div></div><div id="bandit-contextual-1" class="contextual bandit-contextual"></div>

It always finds something like "Clams (Bank)" instead, and I don't know how. The problem starts at:

  string MissionName = node.SelectSingleNode("//h4").InnerText;

I have tried many XPaths, such as //div[h4[1]] and h4[1]. I only need the first occurrence, since it occurs only once. Where does the problem in my code begin?

I need the inner text "Meet With the South Gang Family".

public static List<string> GetMissions()
    {
        List<string> FoundMissions = new List<string>();

        HTML_CONTENT = HTML_CONTENT.Replace("\r", "");
        HTML_CONTENT = HTML_CONTENT.Replace("\t", "");
        HTML_CONTENT = HTML_CONTENT.Replace("\n", "");
        HTML_CONTENT = HTML_CONTENT.Replace("\\", "");

        HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
        doc.Load(new StringReader(HTML_CONTENT));

        if(doc.DocumentNode == null)
            return FoundMissions;
        var DivNodes = doc.DocumentNode.SelectNodes("//div[@class='job ']");
        if (DivNodes != null)
        {
            string Count = DivNodes.Count.ToString();

As I said, it finds all 8 occurrences just fine. I debugged it and got the HTML shown at the top of this post, so I think this part is fine.

            foreach (HtmlNode node in DivNodes)
            {

                string MissionName = node.SelectSingleNode("//h4").InnerText;
            }
        }

        return FoundMissions;
        }


    }

1 Answer:

Answer 0: (score: 1)

You need to tell the XPath query explicitly that it is relative to the current node by adding a single dot (.) at the beginning:

string MissionName = node.SelectSingleNode(".//h4").InnerText;

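Otherwise, the XPath searches from the root node of the document, so every iteration returns the first h4 on the whole page rather than the one inside the current node. That is probably why your attempts returned the wrong text.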
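
To make the difference concrete, here is a minimal sketch that runs both queries against a small made-up page. The sample markup, the class name RelativeXPathDemo, and the helper GetTitles are assumptions for illustration only, not part of the original code. It uses just LoadHtml, SelectNodes, SelectSingleNode, and InnerText from HtmlAgilityPack.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class RelativeXPathDemo
{
    // Hypothetical sample markup: two job blocks, each with its own <h4>.
    public const string SampleHtml =
        "<div class='job'><h4>Clams (Bank)</h4></div>" +
        "<div class='job'><h4>Meet With the South Gang Family</h4></div>";

    // Collects the <h4> text found from each job div using the given XPath.
    public static List<string> GetTitles(string html, string h4XPath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var titles = new List<string>();
        HtmlNodeCollection jobs = doc.DocumentNode.SelectNodes("//div[@class='job']");
        if (jobs == null)
            return titles; // SelectNodes returns null when nothing matches

        foreach (HtmlNode job in jobs)
        {
            // The query is evaluated with 'job' as the context node;
            // whether it stays inside 'job' depends on the leading dot.
            HtmlNode h4 = job.SelectSingleNode(h4XPath);
            if (h4 != null)
                titles.Add(h4.InnerText.Trim());
        }
        return titles;
    }

    public static void Main()
    {
        // "//h4" is absolute, so each iteration is expected to return
        // the first heading in the whole document.
        Console.WriteLine(string.Join(" | ", GetTitles(SampleHtml, "//h4")));

        // ".//h4" is relative, so each iteration should return the
        // heading inside the current job div.
        Console.WriteLine(string.Join(" | ", GetTitles(SampleHtml, ".//h4")));
    }
}
```

With this sample, the first call should repeat "Clams (Bank)" for every job div, which matches the symptom in the question, while the second should list each div's own heading. Checking the result of SelectSingleNode for null before reading InnerText also avoids an exception when a block has no h4.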