Question

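I have what I think is a fairly simple problem, but I can't get around it.

I have a large XML file containing several text elements. I want to match a template on them, but doing so deletes the p elements along with their attributes, leaving only plain text. The file should be tokenized at "!" and keep ... . It also deletes half of my sentences, which I don't want either.

Input: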

<root>
 <text>
    <body>
        <div>
            <p facs="001">Hello Guys! This is my example! Thanks for your time!</p>
        </div>
        <div>
            <p facs="002">Some more text! And a little more!</p>
        </div>
        <div>
            <p facs="003">Here as well! See you later!</p>
        </div>
    </body>
 </text>
</root>

My XSLT:

<xsl:stylesheet version="2.0" exclude-result-prefixes="xs"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xsl:output method="xml" indent="yes"/>

    <xsl:template match="node() | @*">
        <xsl:copy>
            <xsl:apply-templates select="@* |node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="/root/text/body/div/p">
            <xsl:variable name="tokens" select="tokenize(text(),'!')" as="xs:string*"/>
            <xsl:variable name="words" select="remove($tokens, 1)" as="xs:string*"/>
            <xsl:for-each select="1 to xs:integer(floor(count($words) div 1))">
                <xsl:variable name="vIndex" select="(.)" as="xs:integer"/>
                <w><xsl:attribute name="n"
                    select="position()"/>
                    <xsl:value-of select="normalize-space($words[$vIndex])"/>
                </w>
            </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>

My output:

 <root>
       <text>
          <body>
            <div>
                   <w n="1">This is my example</w>
                   <w n="2">Thanks for your time</w>
                   <w n="3"/>
            </div>
            <div>
                   <w n="1">And a little more</w>
                   <w n="2"/>
            </div>
            <div>
                   <w n="1">See you later</w>
                   <w n="2"/>
            </div>
          </body>
       </text>
    </root>

The output I want:

 <root>
       <text>
          <body>
            <div>
                <p facs="001">
                   <w n="1">Hello Guys</w>
                   <w n="2">This is my example</w>
                   <w n="3">Thanks for your time</w>
                </p>
            </div>
            <div>
                <p facs="002">
                   <w n="1">Some more text</w>
                   <w n="2">And a little more</w>
                </p>
            </div>
            <div>
                <p facs="003">
                   <w n="1">Here as well</w>
                   <w n="2">See you later</w>
                </p>
            </div>
          </body>
       </text>
    </root>

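Also, although it isn't strictly necessary, I would like to know whether there is a way to keep the "!" characters I tokenize on. How can I preserve them?

In short: a) I don't want to lose my facs attribute, b) I don't want to lose the first sentence, c) how can I keep the character I tokenize on, in this example "!"?

Thanks a lot!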

Answer 1

To keep the p element, you just need to add xsl:copy to its template, e.g.

<xsl:template match="/root/text/body/div/p">
   <xsl:copy>
     <xsl:copy-of select="@*"/>
        <xsl:variable name="tokens" select="tokenize(text(),'!')" as="xs:string*"/>
        <xsl:variable name="words" select="remove($tokens, 1)" as="xs:string*"/>
        <xsl:for-each select="1 to xs:integer(floor(count($words) div 1))">
            <xsl:variable name="vIndex" select="(.)" as="xs:integer"/>
            <w><xsl:attribute name="n"
                select="position()"/>
                <xsl:value-of select="normalize-space($words[$vIndex])"/>
            </w>
        </xsl:for-each>
   </xsl:copy>
</xsl:template>

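I would then simplify it to the following, which also keeps the first sentence because it no longer calls remove($tokens, 1):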

<xsl:template match="/root/text/body/div/p">
   <xsl:copy>
      <xsl:copy-of select="@*"/>
        <xsl:for-each select="tokenize(., '!')">
            <w n="{position()}"><xsl:value-of select="."/></w>
        </xsl:for-each>
   </xsl:copy>
</xsl:template>

If you want to keep the exclamation marks, you may want to look at xsl:analyze-string instead of tokenize.
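
As a rough illustration of that suggestion, here is a minimal, untested sketch of the same template using xsl:analyze-string. The regex `[^!]+!?` and the variable name `sentences` are my own assumptions, not something from the question: each match is a run of non-"!" characters plus the "!" that ends it, if there is one. The matches are collected into a variable first so that position() numbers only the non-empty sentences.

```xml
<xsl:template match="/root/text/body/div/p">
   <xsl:copy>
      <xsl:copy-of select="@*"/>
      <!-- Assumed regex: one or more non-"!" characters, then an optional "!" -->
      <xsl:variable name="sentences" as="xs:string*">
         <xsl:analyze-string select="." regex="[^!]+!?">
            <xsl:matching-substring>
               <!-- Trim the whitespace that follows the previous "!" -->
               <xsl:sequence select="normalize-space(.)"/>
            </xsl:matching-substring>
         </xsl:analyze-string>
      </xsl:variable>
      <!-- [.] drops matches that were whitespace only -->
      <xsl:for-each select="$sentences[.]">
         <w n="{position()}"><xsl:value-of select="."/></w>
      </xsl:for-each>
   </xsl:copy>
</xsl:template>
```

It relies on the xs prefix being declared, as it is in the stylesheet from the question. For the first p of the sample input this should produce w elements containing "Hello Guys!", "This is my example!" and "Thanks for your time!".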

I have now had time to test, and it seems we need to exclude whitespace-only tokens; here is the complete code:

<xsl:stylesheet version="2.0" exclude-result-prefixes="xs"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xsl:output method="xml" indent="yes"/>

    <xsl:template match="node() | @*">
        <xsl:copy>
            <xsl:apply-templates select="@* |node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="/root/text/body/div/p">
       <xsl:copy>
          <xsl:copy-of select="@*"/>
            <xsl:for-each select="tokenize(., '!')[normalize-space()]">
                <w n="{position()}"><xsl:value-of select="."/></w>
            </xsl:for-each>
       </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

That way the input

<root>
 <text>
    <body>
        <div>
            <p facs="001">Hello Guys! This is my example! Thanks for your time!</p>
        </div>
        <div>
            <p facs="002">Some more text! And a little more!</p>
        </div>
        <div>
            <p facs="003">Here as well! See you later!</p>
        </div>
    </body>
 </text>
</root>

is transformed into the result

<root>
   <text>
      <body>
        <div>
            <p facs="001">
               <w n="1">Hello Guys</w>
               <w n="2"> This is my example</w>
               <w n="3"> Thanks for your time</w>
            </p>
        </div>
        <div>
            <p facs="002">
               <w n="1">Some more text</w>
               <w n="2"> And a little more</w>
            </p>
        </div>
        <div>
            <p facs="003">
               <w n="1">Here as well</w>
               <w n="2"> See you later</w>
            </p>
        </div>
      </body>
   </text>
</root>
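
Every w element after the first in that result still starts with the space that followed the "!" in the input. If that matters, one possible variation (a sketch, not something I ran against the original data) is to wrap the value in normalize-space(), as the template in the question already did:

```xml
<xsl:template match="/root/text/body/div/p">
   <xsl:copy>
      <xsl:copy-of select="@*"/>
      <!-- Same tokenization as above; only the output value is trimmed -->
      <xsl:for-each select="tokenize(., '!')[normalize-space()]">
         <w n="{position()}"><xsl:value-of select="normalize-space(.)"/></w>
      </xsl:for-each>
   </xsl:copy>
</xsl:template>
```

Note that normalize-space() also collapses runs of whitespace inside a sentence into single spaces, which may or may not be wanted.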
