I have a foreach loop that parses HTML, and I am using XPath. Given this markup, here is what I need:
<p class="section">sectiontext1</p>
<p class="subsection">subtext1</p> ----need this in first loop
<p class="subsection">subtext2</p> ---need this in first loop
<p class="section">sectiontext2</p>
<p class="subsection">subtext11</p> ---need this in second loop
<p class="subsection">subtext22</p> ---- need this in second loop
<p class="section">sectiontext3</p>
foreach (HtmlNode sectionNode in htmldocObject.DocumentNode.SelectNodes("//p[@class='section']"))
{
    count=count+2;
    string text1 = sectionNode.InnerText;
    foreach (HtmlNode subSectionNode in htmldocObject.DocumentNode.SelectNodes("//p[@class='subsection'][following-sibling::p[@class='section'][1] and preceding-sibling::p[@class='section'][2]]"))
    {
        string text = subSectionNode.InnerText;
    }
}
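What I want to do is loop over the sections, find every subsection that belongs to the current section, do some processing, and then move on to the next section and find the subsections under that one.
Answer 0 (score: 0)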
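I couldn't get the XPath to work, because you can't reference a variable in it (the inner query has no way to say "the section I am currently on")... but you can fix the query with LINQ.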
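// Requires: using System; using System.Linq; using HtmlAgilityPack;
foreach (var section in html.DocumentNode.SelectNodes("//p[@class='section']"))
{
    Console.WriteLine(section.InnerText);
    // Walk the <p> siblings after this section and stop at the next section.
    // SelectNodes returns null when nothing matches, hence the ?. and ?? guards.
    foreach (var subSection in section?.SelectNodes("following-sibling::p")
                                   ?.TakeWhile(n => n?.Attributes["class"]?.Value != "section")
                               ?? Enumerable.Empty<HtmlNode>())
        Console.WriteLine("\t" + subSection.InnerText);
}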
/*
sectiontext1
	subtext1-1
	subtext1-2
sectiontext2
	subtext2-11
	subtext2-22
sectiontext3
*/
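...and if you are not using VS2015 or later (that is, C# 6 with the ?. operator), replace the inner foreach header with this...
foreach (var subSection in (section.SelectNodes("following-sibling::p") ?? Enumerable.Empty<HtmlNode>())
    // GetAttributeValue falls back to "" so a <p> without a class attribute does not throw
    .TakeWhile(n => n.GetAttributeValue("class", "") != "section"))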
...and in XSLT... the same thing...
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" indent="yes"/>
  <xsl:template match="/">
    <xsl:for-each select="//p[@class='section']">
      <xsl:variable name="start" select="." />
      <xsl:value-of select="text()"/><xsl:text>&#10;</xsl:text>
      <!-- generate-id compares node identity; a plain "=" would compare text, which breaks if two sections share the same text -->
      <xsl:for-each select="following-sibling::p[@class='subsection'][generate-id(preceding-sibling::p[@class='section'][1]) = generate-id($start)]">
        <xsl:text>&#9;</xsl:text><xsl:value-of select="text()"/><xsl:text>&#10;</xsl:text>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
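...and if you would rather stay in plain XPath from C#, one possible workaround for the missing variable is to bake the section's position into the query string and compare it with count(). Treat this as a rough sketch under a few assumptions rather than a tested drop-in: it uses System.Xml.XmlDocument instead of HtmlAgilityPack so that it runs without extra packages, it wraps the question's markup in a hypothetical <div> root so it parses as XML, and it assumes every <p> is a sibling under the same parent. The same XPath string should also work with HtmlNode.SelectNodes, but HtmlAgilityPack returns null rather than an empty list when nothing matches, so keep the null guard from the code above.

```csharp
using System;
using System.Xml;

class Program
{
    static void Main()
    {
        // Sample markup from the question, wrapped in a <div> root (an assumption)
        // so that XmlDocument can parse it.
        var doc = new XmlDocument();
        doc.LoadXml(
            "<div>" +
            "<p class=\"section\">sectiontext1</p>" +
            "<p class=\"subsection\">subtext1</p>" +
            "<p class=\"subsection\">subtext2</p>" +
            "<p class=\"section\">sectiontext2</p>" +
            "<p class=\"subsection\">subtext11</p>" +
            "<p class=\"subsection\">subtext22</p>" +
            "<p class=\"section\">sectiontext3</p>" +
            "</div>");

        int index = 0;
        foreach (XmlNode section in doc.SelectNodes("//p[@class='section']"))
        {
            index++; // 1-based position of this section among the sibling sections
            Console.WriteLine(section.InnerText);

            // Keep only the subsections that have exactly `index` sections before them,
            // i.e. the ones sitting between this section and the next one.
            string xpath =
                "following-sibling::p[@class='subsection']" +
                "[count(preceding-sibling::p[@class='section'])=" + index + "]";

            foreach (XmlNode subSection in section.SelectNodes(xpath))
                Console.WriteLine("\t" + subSection.InnerText);
        }
    }
}
/*
sectiontext1
	subtext1
	subtext2
sectiontext2
	subtext11
	subtext22
sectiontext3
*/
```

Unlike the TakeWhile version, this one never stops early: each inner query scans every following sibling and filters by count, so on a very long document the LINQ approach is probably the cheaper of the two.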