XPath inside a foreach loop

Date: 2017-09-11 20:55:20

Tags: c# xpath foreach

I have a foreach loop that parses HTML, and I am using XPath. Given this markup, here is what I need from each iteration:

<p class="section">sectiontext1</p>
<p class="subsection">subtext1</p> ----need this in first loop
<p class="subsection">subtext2</p>  ---need this in first loop
<p class="section">sectiontext2</p>
<p class="subsection">subtext11</p> ---need this in second loop
<p class="subsection">subtext22</p>  ---- need this in second loop
<p class="section">sectiontext3</p>

foreach (HtmlNode sectionNode in htmldocObject.DocumentNode.SelectNodes("//p[@class='section']"))
{
    count = count + 2;
    string text1 = sectionNode.InnerText;

    foreach (HtmlNode subSectionNode in htmldocObject.DocumentNode.SelectNodes("//p[@class='subsection'][following-sibling::p[@class='section'][1] and preceding-sibling::p[@class='section'][2]]"))
    {
        string text = subSectionNode.InnerText;
    }
}

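What I want to do is iterate over the sections, find every subsection that belongs to the current section, do some processing, and then move on to the next section and find its subsections.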

1 answer:

Answer 0: (score: 0)

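I couldn't get this to work in XPath alone, because the expression cannot reference a variable (such as the current section node). You can, however, fix the query with LINQ: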

foreach (var section in html.DocumentNode.SelectNodes("//p[@class='section']"))
{
    Console.WriteLine(section.InnerText);
    foreach (var subSection in section?.SelectNodes("following-sibling::p")
                                      ?.TakeWhile(n => n?.Attributes["class"]?.Value != "section")
                                      ?? Enumerable.Empty<HtmlNode>())
        Console.WriteLine("\t" + subSection.InnerText);
}
/*
Output for the sample HTML in the question:
sectiontext1
        subtext1
        subtext2
sectiontext2
        subtext11
        subtext22
sectiontext3
*/
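
To try the same `TakeWhile` idea without the HtmlAgilityPack dependency, here is a minimal sketch that uses `System.Xml.Linq` instead. It assumes the markup is well-formed XML and that sections and subsections are direct children of one parent; real-world HTML often is not well-formed, which is presumably why HtmlAgilityPack is used above. The `SectionGrouper` class, its `Group` method, and the wrapping `<root>` element are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SectionGrouper
{
    // Hypothetical helper: pairs each section's text with the texts of the
    // <p> siblings that follow it, stopping at the next section.
    public static List<KeyValuePair<string, List<string>>> Group(string xml)
    {
        XElement root = XElement.Parse(xml);
        var result = new List<KeyValuePair<string, List<string>>>();

        foreach (XElement section in root.Elements("p")
                     .Where(p => (string)p.Attribute("class") == "section"))
        {
            // ElementsAfterSelf yields following siblings in document order;
            // TakeWhile cuts the sequence off at the next section.
            List<string> subs = section.ElementsAfterSelf("p")
                .TakeWhile(p => (string)p.Attribute("class") != "section")
                .Select(p => p.Value)
                .ToList();
            result.Add(new KeyValuePair<string, List<string>>(section.Value, subs));
        }
        return result;
    }

    public static void Main()
    {
        // Sample data copied from the question, wrapped in an assumed <root>.
        const string xml = @"<root>
  <p class='section'>sectiontext1</p>
  <p class='subsection'>subtext1</p>
  <p class='subsection'>subtext2</p>
  <p class='section'>sectiontext2</p>
  <p class='subsection'>subtext11</p>
  <p class='subsection'>subtext22</p>
  <p class='section'>sectiontext3</p>
</root>";

        foreach (var pair in Group(xml))
        {
            Console.WriteLine(pair.Key);
            foreach (string sub in pair.Value)
                Console.WriteLine("\t" + sub);
        }
    }
}
```

Unlike `SelectNodes`, `ElementsAfterSelf` returns an empty sequence rather than null when there are no following siblings, so no `??` fallback is needed here.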

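If you aren't using VS2015 or later (that is, C# 6, which introduced the null-conditional operator `?.`), the inner loop can be written like this instead: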

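// SelectNodes returns null when nothing matches, so fall back to an empty sequence.
foreach (var subSection in (section.SelectNodes("following-sibling::p") ?? Enumerable.Empty<HtmlNode>())
                                   .TakeWhile(n => n.GetAttributeValue("class", "") != "section"))
    Console.WriteLine("\t" + subSection.InnerText);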

And here is the same thing in XSLT:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" indent="yes"/>

  <xsl:template  match="/">
      <xsl:for-each select="//p[@class='section']">
        <xsl:variable name="start" select="." />
        <xsl:value-of select="text()"/><xsl:text>&#10;</xsl:text>
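        <!-- Compare node identity with generate-id(); a plain "=" would compare string values, so two sections with the same text would be confused. -->
        <xsl:for-each select="following-sibling::p[@class='subsection'][generate-id(preceding-sibling::p[@class='section'][1]) = generate-id($start)]">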
            <xsl:text>&#9;</xsl:text><xsl:value-of select="text()"/><xsl:text>&#10;</xsl:text>
        </xsl:for-each>
      </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
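
The stylesheet relies on the `$start` variable, which a plain XPath string built in C# does not have. One possible workaround, sketched below, is to bake the section's 1-based position into the expression and use `count()`: a subsection belongs to section N when exactly N sections precede it among its siblings. The sketch uses `System.Xml.XPath` on an `XDocument` and assumes well-formed markup with sections and subsections under one parent; the `XPathByIndex` class and `SubsectionsOf` method are hypothetical names. The expression is plain XPath 1.0, but I have not verified it against HtmlAgilityPack's `SelectNodes`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class XPathByIndex
{
    // Hypothetical helper: returns the subsection texts that belong to the
    // section at the given 1-based position.
    public static List<string> SubsectionsOf(XDocument doc, int sectionIndex)
    {
        // Select subsections that have exactly sectionIndex sections before them.
        string xpath = "//p[@class='subsection']" +
                       "[count(preceding-sibling::p[@class='section'])=" + sectionIndex + "]";
        return doc.XPathSelectElements(xpath).Select(p => p.Value).ToList();
    }

    public static void Main()
    {
        // Sample data copied from the question, wrapped in an assumed <root>.
        XDocument doc = XDocument.Parse(@"<root>
  <p class='section'>sectiontext1</p>
  <p class='subsection'>subtext1</p>
  <p class='subsection'>subtext2</p>
  <p class='section'>sectiontext2</p>
  <p class='subsection'>subtext11</p>
  <p class='subsection'>subtext22</p>
  <p class='section'>sectiontext3</p>
</root>");

        List<XElement> sections = doc.XPathSelectElements("//p[@class='section']").ToList();
        for (int i = 1; i <= sections.Count; i++)
        {
            Console.WriteLine(sections[i - 1].Value);
            foreach (string sub in SubsectionsOf(doc, i))
                Console.WriteLine("\t" + sub);
        }
    }
}
```

This re-scans the document once per section, so the `TakeWhile` approach is likely the better fit for large inputs; the count-based form is mainly useful when the selection has to stay in XPath.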