Getting the text to the right of a node with HTML Agility Pack

Date: 2012-08-02 11:38:52

Tags: c# html html-agility-pack

HTML:

<strong>Capture Date/Time:</strong> August 1, 2012 1:05:00 PM EST<br>
<strong>Instructor:</strong> Ash<br>
<strong>Instructor Email:</strong> email@email.com<br>
<strong>Course ID:</strong> Course321<br>

How can I get the text to the right of each strong node?

For example, to get the Course ID, I would end up with the string "Course321".

Code:

private string getCourseID()
{
    foreach (HtmlAgilityPack.HtmlNode strong in htmlDoc.DocumentNode.SelectNodes("//strong"))
    {
        string innerText = strong.InnerText;

        if (innerText.Contains("Course ID"))
        {
            //select the outer text
            //return outertext;
        }
    }

    return null; // placeholder so that every code path returns a value
}

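Current code:

private string getCourseID()
{
    HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
    // The HTML must be loaded into htmlDoc (for example with htmlDoc.LoadHtml)
    // before querying; that step is not shown in this snippet.

    string value = "Error";

    // SelectNodes returns null by default when nothing matches.
    HtmlAgilityPack.HtmlNodeCollection strongNodes = htmlDoc.DocumentNode.SelectNodes("//strong");
    if (strongNodes == null)
    {
        return value;
    }

    foreach (HtmlAgilityPack.HtmlNode strong in strongNodes)
    {
        string innerText = strong.InnerText;

        if (innerText.Contains("Course ID"))
        {
            // First text node that follows this <strong> at the same level.
            HtmlAgilityPack.HtmlNode sibling = strong.SelectSingleNode("following-sibling::text()");

            if (sibling != null)
            {
                value = sibling.InnerText.Trim();

                MessageBox.Show(value);
            }
        }
    }

    return value;
}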

2 answers:

Answer 0 (score: 1)

Use the following-sibling::* XPath axis:

HtmlNode sibling = strong.SelectSingleNode("following-sibling::text()");
Console.WriteLine("Course ID = " + sibling.InnerText.Trim());
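For completeness, here is a minimal, self-contained sketch of the same idea applied to the HTML from the question. The class name CourseInfo and the helper GetValueAfterLabel are hypothetical names made up for illustration, and the sketch assumes the HtmlAgilityPack NuGet package is referenced. It is one possible way to wrap the XPath call, not the only one.

```csharp
using System;
using HtmlAgilityPack; // assumes the HtmlAgilityPack NuGet package is installed

public static class CourseInfo
{
    // Hypothetical helper: returns the trimmed text that follows the first
    // <strong> whose text contains `label`, or null when nothing is found.
    public static string GetValueAfterLabel(string html, string label)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null by default when there are no matches.
        HtmlNodeCollection strongNodes = doc.DocumentNode.SelectNodes("//strong");
        if (strongNodes == null)
        {
            return null;
        }

        foreach (HtmlNode strong in strongNodes)
        {
            if (!strong.InnerText.Contains(label))
            {
                continue;
            }

            // First text node after this <strong> among its siblings.
            HtmlNode text = strong.SelectSingleNode("following-sibling::text()");
            return text == null ? null : text.InnerText.Trim();
        }

        return null;
    }

    public static void Main()
    {
        const string html =
            "<strong>Capture Date/Time:</strong> August 1, 2012 1:05:00 PM EST<br>\n" +
            "<strong>Instructor:</strong> Ash<br>\n" +
            "<strong>Instructor Email:</strong> email@email.com<br>\n" +
            "<strong>Course ID:</strong> Course321<br>\n";

        Console.WriteLine("Course ID = " + GetValueAfterLabel(html, "Course ID"));
    }
}
```

Note that Contains("Instructor") would match the "Instructor:" label before "Instructor Email:", so pass the most specific label text you can.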

Answer 1 (score: 0)

For those who share my XPath-ophobia, this gets the text of the sibling that follows each strong tag:

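var doc = new HtmlDocument();
doc.LoadHtml("blah blah blah"); // LoadHtml returns void, so it cannot be chained
var texts = doc.DocumentNode.DescendantsAndSelf().Where(dn => dn.Name == "strong" && dn.NextSibling != null).Select(dn => dn.NextSibling.InnerText);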
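If you want every label and value at once, the LINQ approach can be extended into a lookup table. The sketch below is only an illustration: StrongLabels and ReadPairs are hypothetical names, the HtmlAgilityPack NuGet package is assumed, and it also assumes that each value is the text node immediately after its strong tag and that the labels are unique.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // assumes the HtmlAgilityPack NuGet package is installed

public static class StrongLabels
{
    // Hypothetical helper: maps each <strong> label (without the trailing colon)
    // to the text node that immediately follows it.
    // ToDictionary throws if two labels are identical.
    public static Dictionary<string, string> ReadPairs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return doc.DocumentNode
            .Descendants("strong")
            // Skip <strong> tags that are not directly followed by text.
            .Where(s => s.NextSibling != null
                        && s.NextSibling.NodeType == HtmlNodeType.Text)
            .ToDictionary(
                s => s.InnerText.Trim().TrimEnd(':'),
                s => s.NextSibling.InnerText.Trim());
    }

    public static void Main()
    {
        const string html =
            "<strong>Instructor:</strong> Ash<br>\n" +
            "<strong>Instructor Email:</strong> email@email.com<br>\n" +
            "<strong>Course ID:</strong> Course321<br>\n";

        Dictionary<string, string> pairs = ReadPairs(html);
        Console.WriteLine(pairs["Course ID"]);
    }
}
```

Unlike the XPath version, NextSibling looks only at the very next node, so a strong tag followed directly by another element is skipped by the filter rather than matched to later text.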