XSL transformation outputting multiple times, and other confusion

Date: 2015-06-22 15:26:32

Tags: xml xslt xpath

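I'm trying to transform parts of an XML document (mostly HTML) that match a particular pattern, using templated markup. I'm inexperienced with XSLT (I've really only used XPath) and the online documentation is sparse, so I'm struggling...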

Given the following XML document:

<?xml version="1.0" encoding="UTF-8"?>
<body>
    <content type="ontology/content/ImageSet" url="dff6df70-e454-11e4-0e5f-978e959e1c97" />
    <p>...</p>
    <p>...</p>
    <ul>
        <li>
            <content type="ontology/content/ImageSet" url="fee5c268-1675-11e5-1ef3-978e959e1689" />
            <h4>China urbanisation</h4>
            <br />
            <em>1.8m</em>
            <br />
            ...
        </li>
        <li>
            <content type="ontology/content/ImageSet" url="0023edbc-1676-11e5-1ef3-978e959e1689" />
            <h4>Ebola crisis</h4>
            <br />
            <em>$1bn</em>
            <br />
            ...
        </li>
        <li>
            <content type="ontology/content/ImageSet" url="015961e4-1676-11e5-1ef3-978e959e1689" />
            <h4>Fighting inequality</h4>
            <br />
            <em>$479m</em>
            <br />
            ...
        </li>
    </ul>
    <p>...</p>
    <p>...</p>
    <p>...</p>
</body>

I'm trying to apply this transformation:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

    <xsl:output indent="yes" encoding="UTF-8" omit-xml-declaration="yes" />

    <xsl:template match="//ul/li/content[../h4][../em]">
        <ul class="breakout o-grid-row">
            <xsl:for-each select="../../li">
                <li class="breakout__item" data-o-grid-colspan="12 M6 L3" role="group">
                    <xsl:copy-of select="content" />
                    <div class="breakout__item-content">
                        <header>
                            <h3 class="breakout__item-headline"><xsl:value-of select="h4" /></h3>
                            <p class="breakout__item-subheading"><xsl:value-of select="em" /></p>
                        </header>
                        <p class="breakout__item-description"><xsl:value-of select="text()[normalize-space()]" /></p>
                    </div>
                </li>
            </xsl:for-each>
        </ul>
    </xsl:template>

</xsl:stylesheet>

This is the result:

    ...
    ...


            <ul class="breakout o-grid-row">
    <li class="breakout__item" data-o-grid-colspan="12 M6 L3" role="group">
        <content type="ontology/content/ImageSet" url="fee5c268-1675-11e5-1ef3-978e959e1689"/>
        <div class="breakout__item-content">
            <header>
                <h3 class="breakout__item-headline">China urbanisation</h3>
                <p class="breakout__item-subheading">1.8m</p>
            </header>
            <p class="breakout__item-description">
            ...
        </p>
        </div>
    </li>
    <li class="breakout__item" data-o-grid-colspan="12 M6 L3" role="group">
        <content type="ontology/content/ImageSet" url="0023edbc-1676-11e5-1ef3-978e959e1689"/>
        <div class="breakout__item-content">
            <header>
                <h3 class="breakout__item-headline">Ebola crisis</h3>
                <p class="breakout__item-subheading">$1bn</p>
            </header>
            <p class="breakout__item-description">
            ...
        </p>
        </div>
    </li>
    <li class="breakout__item" data-o-grid-colspan="12 M6 L3" role="group">
        <content type="ontology/content/ImageSet" url="015961e4-1676-11e5-1ef3-978e959e1689"/>
        <div class="breakout__item-content">
            <header>
                <h3 class="breakout__item-headline">Fighting inequality</h3>
                <p class="breakout__item-subheading">$479m</p>
            </header>
            <p class="breakout__item-description">
            ...
        </p>
        </div>
    </li>
</ul>
            China urbanisation

            1.8m

            ...


            <ul class="breakout o-grid-row">
    <li class="breakout__item" data-o-grid-colspan="12 M6 L3" role="group">
        <content type="ontology/content/ImageSet" url="fee5c268-1675-11e5-1ef3-978e959e1689"/>
        <div class="breakout__item-content">
            <header>
                <h3 class="breakout__item-headline">China urbanisation</h3>
                <p class="breakout__item-subheading">1.8m</p>
            </header>
            <p class="breakout__item-description">
            ...
        </p>
        </div>
    </li>
    <li class="breakout__item" data-o-grid-colspan="12 M6 L3" role="group">
        <content type="ontology/content/ImageSet" url="0023edbc-1676-11e5-1ef3-978e959e1689"/>
        <div class="breakout__item-content">
            <header>
                <h3 class="breakout__item-headline">Ebola crisis</h3>
                <p class="breakout__item-subheading">$1bn</p>
            </header>
            <p class="breakout__item-description">
            ...
        </p>
        </div>
    </li>
    <li class="breakout__item" data-o-grid-colspan="12 M6 L3" role="group">
        <content type="ontology/content/ImageSet" url="015961e4-1676-11e5-1ef3-978e959e1689"/>
        <div class="breakout__item-content">
            <header>
                <h3 class="breakout__item-headline">Fighting inequality</h3>
                <p class="breakout__item-subheading">$479m</p>
            </header>
            <p class="breakout__item-description">
            ...
        </p>
        </div>
    </li>
</ul>
            Ebola crisis

            $1bn

            ...


            <ul class="breakout o-grid-row">
    <li class="breakout__item" data-o-grid-colspan="12 M6 L3" role="group">
        <content type="ontology/content/ImageSet" url="fee5c268-1675-11e5-1ef3-978e959e1689"/>
        <div class="breakout__item-content">
            <header>
                <h3 class="breakout__item-headline">China urbanisation</h3>
                <p class="breakout__item-subheading">1.8m</p>
            </header>
            <p class="breakout__item-description">
            ...
        </p>
        </div>
    </li>
    <li class="breakout__item" data-o-grid-colspan="12 M6 L3" role="group">
        <content type="ontology/content/ImageSet" url="0023edbc-1676-11e5-1ef3-978e959e1689"/>
        <div class="breakout__item-content">
            <header>
                <h3 class="breakout__item-headline">Ebola crisis</h3>
                <p class="breakout__item-subheading">$1bn</p>
            </header>
            <p class="breakout__item-description">
            ...
        </p>
        </div>
    </li>
    <li class="breakout__item" data-o-grid-colspan="12 M6 L3" role="group">
        <content type="ontology/content/ImageSet" url="015961e4-1676-11e5-1ef3-978e959e1689"/>
        <div class="breakout__item-content">
            <header>
                <h3 class="breakout__item-headline">Fighting inequality</h3>
                <p class="breakout__item-subheading">$479m</p>
            </header>
            <p class="breakout__item-description">
            ...
        </p>
        </div>
    </li>
</ul>
            Fighting inequality

            $479m

            ...


    ...
    ...
    ...

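I can't make sense of this, for two reasons:

  1. Why is the generated template output in full three times?
  2. Why is the text content copied to the result document instead of the existing markup structure?

Any help answering the above and pointing me in the right direction is appreciated.

Edit: this is the output I'm trying to achieve: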

    <?xml version="1.0" encoding="UTF-8"?>
    <body>
        <content type="ontology/content/ImageSet" url="dff6df70-e454-11e4-0e5f-978e959e1c97" />
        <p>...</p>
        <p>...</p>
        <ul class="breakout o-grid-row">
            <li class="breakout__item" data-o-grid-colspan="12 M6 L3" role="group">
                <content type="ontology/content/ImageSet" url="fee5c268-1675-11e5-1ef3-978e959e1689"/>
                <div class="breakout__item-content">
                    <header>
                        <h3 class="breakout__item-headline">China urbanisation</h3>
                        <p class="breakout__item-subheading">1.8m</p>
                    </header>
                    <p class="breakout__item-description">
                    ...
                </p>
                </div>
            </li>
            <li class="breakout__item" data-o-grid-colspan="12 M6 L3" role="group">
                <content type="ontology/content/ImageSet" url="0023edbc-1676-11e5-1ef3-978e959e1689"/>
                <div class="breakout__item-content">
                    <header>
                        <h3 class="breakout__item-headline">Ebola crisis</h3>
                        <p class="breakout__item-subheading">$1bn</p>
                    </header>
                    <p class="breakout__item-description">
                    ...
                </p>
                </div>
            </li>
            <li class="breakout__item" data-o-grid-colspan="12 M6 L3" role="group">
                <content type="ontology/content/ImageSet" url="015961e4-1676-11e5-1ef3-978e959e1689"/>
                <div class="breakout__item-content">
                    <header>
                        <h3 class="breakout__item-headline">Fighting inequality</h3>
                        <p class="breakout__item-subheading">$479m</p>
                    </header>
                    <p class="breakout__item-description">
                    ...
                </p>
                </div>
            </li>
        </ul>
        <p>...</p>
        <p>...</p>
        <p>...</p>
    </body>
    

1 answer:

Answer 0 (score: 0)

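1) Your template is applied to three content elements, and each time it loops over all the li children of the grandparent. The select="../../li" asks for every li child of the current content element's grandparent (the ul), which is three li elements each time. Three matches times three items gives you the whole list three times.

2) Because that is what the built-in template rules do with nodes you don't match. Processing starts at the document node and recurses through all elements. For unmatched elements the built-in rules emit no markup of their own, but unmatched text nodes are copied to the output tree. This goes on until a content element matches your explicit template rule, which is why the h4 and em text appears without its tags.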
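
To make point 2 more concrete, here is a sketch of the built-in rules written out as ordinary templates, following the way the XSLT 1.0 specification describes them. It is for illustration only: a processor applies these rules implicitly, so there is normally no reason to put them in a stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

   <!-- Elements and the document node: emit nothing for the node itself,
        just carry on processing its children. -->
   <xsl:template match="*|/">
      <xsl:apply-templates/>
   </xsl:template>

   <!-- Text and attribute nodes: copy the string value to the output. -->
   <xsl:template match="text()|@*">
      <xsl:value-of select="."/>
   </xsl:template>

   <!-- Comments and processing instructions: produce nothing. -->
   <xsl:template match="processing-instruction()|comment()"/>

</xsl:stylesheet>
```

In the question's stylesheet nothing matches body, p, ul, li, h4 or em, so the first rule drops their tags and the second prints the text inside them. That accounts for the bare "China urbanisation", "1.8m" and "..." lines following each generated list.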

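The following might solve both problems (by applying the template only to the selected content elements, and by having each one address only its own li parent):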

<xsl:template match="/body">
   <xsl:apply-templates select="ul/li[h4][em]/content"/>
</xsl:template>

<xsl:template match="content">
   <ul class="breakout o-grid-row">
      <xsl:for-each select="..">
      ...

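Edit: given your edit with the expected output, here is something more complete (see above for an explanation of how it differs from yours). The first template copies everything by default; what follows overrides that default where li elements contain a content element: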

<xsl:template match="node()">
   <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates select="node()"/>
   </xsl:copy>
</xsl:template>

<!-- Emit the wrapper once per list, then process each item. -->
<xsl:template match="ul[li/content]">
   <ul class="breakout o-grid-row">
      <xsl:apply-templates select="li"/>
   </ul>
</xsl:template>

<xsl:template match="li[content]">
   <li class="breakout__item" data-o-grid-colspan="12 M6 L3" role="group">
      <xsl:copy-of select="content"/>
      <div class="breakout__item-content">
         <header>
            <h3 class="breakout__item-headline">
               <xsl:value-of select="h4"/>
            </h3>
            <p class="breakout__item-subheading">
               <xsl:value-of select="em"/>
            </p>
         </header>
         <p class="breakout__item-description">
            <!-- not sure what you want here: this takes the first non-blank
                 text child of the li, as in your stylesheet;
                 normalize-space(.) would also pull in the h4 and em text -->
            <xsl:value-of select="text()[normalize-space()]"/>
         </p>
      </div>
   </li>
</xsl:template>

Applied to your sample input, it gives the sample output.
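
The templates above are fragments and need an enclosing xsl:stylesheet element to run. As a rough sketch, assuming the xsl:output settings from the question are still wanted, the skeleton could look like this. It uses the common @*|node() spelling of the copy-everything rule, which should behave the same as the node() version above for this input. The placeholder comment marks where the list templates would go.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

   <!-- Output settings copied from the question's stylesheet. -->
   <xsl:output indent="yes" encoding="UTF-8" omit-xml-declaration="yes"/>

   <!-- Identity rule: copies every node and attribute as is, unless a
        more specific template matches. -->
   <xsl:template match="@*|node()">
      <xsl:copy>
         <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
   </xsl:template>

   <!-- The list-specific templates from the answer would go here. -->

</xsl:stylesheet>
```

Run as is, the skeleton copies the input's elements, attributes and text through unchanged, which makes it a convenient starting point for adding one override at a time.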