Question

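Please suggest how to split an equation into two parts at the first dot. Earlier I got a suggestion from michael.hor257k on how to split the equation at a BREAK comment; now I need to split it at the period (dot) instead.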

XML:

<root>
    <body><sec><title>The sec 1</title><p>Text 1</p></sec></body>
    <inline-formula>
        <math display="inline">
            <mi>A</mi>
            <mn>4.651</mn>
            <mi>The next text</mi>
        </math>
    </inline-formula>
    <inline-formula>
        <math display="inline">
            <mrow>
                <mrow><mi>B</mi></mrow>
                <mrow><mn>4.651</mn></mrow>
            </mrow>
            <mi>The next text</mi>
        </math>
    </inline-formula>
</root>

XSLT:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:variable name="root" select="//inline-formula/*" />

    <xsl:template match="/">
        <xsl:for-each select="//inline-formula">
                <xsl:for-each select="text()">
                    <xsl:if test="contains(., '.')">
                        <xsl:apply-templates select="$root">
                            <xsl:with-param name="i" select="." tunnel="yes"/>
                        </xsl:apply-templates>
                    </xsl:if>
                </xsl:for-each >
        </xsl:for-each>
    </xsl:template>

    <xsl:template match="@*|node()">
        <xsl:param name="i" tunnel="yes"/>
            <xsl:if test="descendant-or-self::text()[contains(., '.')]">
                <xsl:copy>
                    <xsl:apply-templates select="@*|node()"/>
                </xsl:copy>
            </xsl:if>
    </xsl:template>

</xsl:stylesheet>

Required result:

<root>
    <body><sec><title>The sec 1</title><p>Text 1</p></sec></body>
    <inline-formula>
        <math display="inline">
            <mi>A</mi>
            <mn>4.</mn>
        </math>
    </inline-formula>
    <inline-formula>
        <math display="inline">
            <!--Text node, before dot is removed -->
            <mn>651</mn>
            <mi>The next text</mi>
        </math>
    </inline-formula>

    <inline-formula>
        <math display="inline">
            <mrow>
                <mrow><mi>B</mi></mrow>
                <mrow><mn>4.</mn></mrow>
            </mrow>
        </math>
    </inline-formula>
    <inline-formula>
        <math display="inline">
            <mrow>
                <!--Text node, before dot is removed -->
                <mrow><mn>651</mn></mrow>
            </mrow>
            <mi>The next text</mi>
        </math>
    </inline-formula>
</root>

Answer 1

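I also wonder why you need such a transformation, but here is a possible solution. The rules are not entirely clear to me, for example regarding mn and inline-formula:

Does the string value of mn always need to be split into separate elements?
You say the split should happen at the first . in the value of mn, but multiple dots in a single mn element make no sense in MathML.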

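All that aside, it is perhaps easier to solve the problem with two separate transformations. The first one simply splits the content of the mn elements: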

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="mn[contains(.,'.')]">
        <xsl:for-each select="tokenize(.,'\.')">
            <mn>
                <xsl:value-of select="."/>
                <xsl:if test="position() = 1">
                    <xsl:text>.</xsl:text>
                </xsl:if>
            </mn>
        </xsl:for-each>
    </xsl:template>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

The intermediate result is

<?xml version="1.0" encoding="UTF-8"?>
<root>
   <body>
      <sec>
         <title>The sec 1</title>
         <p>Text 1</p>
      </sec>
   </body>
   <inline-formula>
      <math display="inline">
         <mi>A</mi>
         <mn>4.</mn>
         <mn>651</mn>
         <mi>The next text</mi>
      </math>
   </inline-formula>
   <inline-formula>
      <math display="inline">
         <mrow>
            <mrow>
               <mi>B</mi>
            </mrow>
            <mrow>
               <mn>4.</mn>
               <mn>651</mn>
            </mrow>
         </mrow>
         <mi>The next text</mi>
      </math>
   </inline-formula>
</root>
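
Note that tokenize(., '\.') splits at every dot, so a value such as 1.2.3 would yield three mn elements. If only the first dot should split the value, a variant of the first transformation could use substring-before and substring-after instead. The following is a minimal sketch, not part of the original answer; dropping the second mn when nothing follows the dot is an assumption. For the sample input it should give the same intermediate result as above.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- Split at the first dot only; any later dots stay in the second mn -->
    <xsl:template match="mn[contains(., '.')]">
        <mn>
            <xsl:value-of select="concat(substring-before(., '.'), '.')"/>
        </mn>
        <!-- Assumption: emit no second mn when nothing follows the dot -->
        <xsl:if test="substring-after(., '.') != ''">
            <mn>
                <xsl:value-of select="substring-after(., '.')"/>
            </mn>
        </xsl:if>
    </xsl:template>

    <!-- Identity template: copy everything else unchanged -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```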

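Then apply a second transformation similar to the one below. Incidentally, this seems like a good opportunity to use the special mode keywords #all and #current.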

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="inline-formula[count(.//mn) gt 1]">
        <xsl:apply-templates select="." mode="first"/>
        <xsl:apply-templates select="." mode="second"/>
    </xsl:template>

    <xsl:template match="mn[position() = 2] | mi[. = 'The next text']" mode="first"/>
    <xsl:template match="mi[. != 'The next text']" mode="second"/>

    <xsl:template match="mn[position() = 1]" mode="second">
        <xsl:comment>Text node, before dot is removed</xsl:comment>
    </xsl:template>


    <xsl:template match="@*|node()" mode="#all">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()" mode="#current"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

and the final result is

<?xml version="1.0" encoding="UTF-8"?>
<root>
   <body>
      <sec>
         <title>The sec 1</title>
         <p>Text 1</p>
      </sec>
   </body>
   <inline-formula>
      <math display="inline">
         <mi>A</mi>
         <mn>4.</mn>
      </math>
   </inline-formula>
   <inline-formula>
      <math display="inline"><!--Text node, before dot is removed-->
         <mn>651</mn>
         <mi>The next text</mi>
      </math>
   </inline-formula>
   <inline-formula>
      <math display="inline">
         <mrow>
            <mrow>
               <mi>B</mi>
            </mrow>
            <mrow>
               <mn>4.</mn>
            </mrow>
         </mrow>
      </math>
   </inline-formula>
   <inline-formula>
      <math display="inline">
         <mrow>
            <mrow/>
            <mrow><!--Text node, before dot is removed-->
               <mn>651</mn>
            </mrow>
         </mrow>
         <mi>The next text</mi>
      </math>
   </inline-formula>
</root>

The result contains an empty mrow element. If that matters, you can add another template

<xsl:template match="mrow/mrow[not(mn)]" mode="second"/>

to the second transformation, but again, it is not clear how empty elements should be handled.
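
If running two separate stylesheets is inconvenient, XSLT 2.0 also allows both passes to be chained in one stylesheet by holding the result of the first pass in a variable. The following is only a sketch of that idea: the variable name pass1 and the mode name split are made up for illustration, and the templates are otherwise the ones from the two transformations above. The count here is taken relative to each inline-formula.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- Pass 1: build a temporary tree in which every dotted mn is split -->
    <xsl:variable name="pass1">
        <xsl:apply-templates select="/" mode="split"/>
    </xsl:variable>

    <!-- Entry point: run pass 2 over the result of pass 1 -->
    <xsl:template match="/">
        <xsl:apply-templates select="$pass1/node()"/>
    </xsl:template>

    <!-- Pass 1 template, same logic as the first transformation -->
    <xsl:template match="mn[contains(., '.')]" mode="split">
        <xsl:for-each select="tokenize(., '\.')">
            <mn>
                <xsl:value-of select="."/>
                <xsl:if test="position() = 1">
                    <xsl:text>.</xsl:text>
                </xsl:if>
            </mn>
        </xsl:for-each>
    </xsl:template>

    <!-- Pass 2 templates, same logic as the second transformation -->
    <xsl:template match="inline-formula[count(.//mn) gt 1]">
        <xsl:apply-templates select="." mode="first"/>
        <xsl:apply-templates select="." mode="second"/>
    </xsl:template>

    <xsl:template match="mn[position() = 2] | mi[. = 'The next text']" mode="first"/>
    <xsl:template match="mi[. != 'The next text']" mode="second"/>

    <xsl:template match="mn[position() = 1]" mode="second">
        <xsl:comment>Text node, before dot is removed</xsl:comment>
    </xsl:template>

    <!-- Identity template shared by the default mode and all named modes -->
    <xsl:template match="@*|node()" mode="#all">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()" mode="#current"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```

Because the identity template is declared with mode="#all" and continues in #current, it serves the split pass, the default mode and the first and second modes without being repeated.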

Answer 2

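Looking at the answer michael.hor257k gave to your previous question, the XSLT you use in this question differs from it in a few key ways. The earlier answer splits on comments, so it iterates based on the number of those comments: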

<xsl:for-each select="0 to count(//comment()[.='Break'])">

So in the new solution you need to iterate based on the number of text nodes that contain a dot:

<xsl:for-each select="0 to count(//text()[contains(., '.')])">

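Then, in the "identity" template, the previous answer counts the Break comments that precede the text nodes at or below the current node to decide whether the node gets copied: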

<xsl:if test="descendant-or-self::text()[count(preceding::comment()[.='Break'])=$i]">

This means that, in the new solution, you could start by writing this:

<xsl:if test="descendant-or-self::text()[count(preceding::text()[contains(., '.')])=$i]">

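However, this is not quite right: the node containing the dot would be copied into the first part of the split, but the second part would not contain that node at all.

The expression you actually need is: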

<xsl:if test="descendant-or-self::text()[(count(preceding::text()[contains(., '.')])=($i - 1) and contains(., '.')) or count(preceding::text()[contains(., '.')])=$i]">

This copies the node containing the dot into both parts of the split. You then need a brand-new template to actually split the text.

Try this XSLT

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:variable name="root" select="/*" />

<xsl:template match="/*">
    <xsl:copy>
        <xsl:copy-of select="*[not(self::inline-formula)]" />
        <xsl:for-each select="0 to count(//text()[contains(., '.')])">
            <xsl:apply-templates select="$root/inline-formula">
                <xsl:with-param name="i" select="." tunnel="yes"/>
            </xsl:apply-templates>
        </xsl:for-each >
    </xsl:copy>
</xsl:template>

<xsl:template match="@*">
    <xsl:copy />
</xsl:template>

<xsl:template match="node()">
    <xsl:param name="i" tunnel="yes"/>
    <xsl:if test="descendant-or-self::text()[(count(preceding::text()[contains(., '.')])=($i - 1) and contains(., '.')) or count(preceding::text()[contains(., '.')])=$i]">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:if>
</xsl:template>

<xsl:template match="text()[contains(., '.')]">
    <xsl:param name="i" tunnel="yes"/>
    <xsl:choose>
        <xsl:when test="count(preceding::text()[contains(., '.')]) = $i">
            <xsl:value-of select="substring-before(., '.')" /><xsl:text>.</xsl:text>
        </xsl:when>
        <xsl:otherwise>
            <xsl:value-of select="substring-after(., '.')" />
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>
</xsl:stylesheet>
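
To see which pass picks up which text node, it can help to print, for every text node inside an inline-formula, the number of dotted text nodes that precede it. The following diagnostic stylesheet is a sketch for debugging only and is not part of the solution above.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="text"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="/">
        <!-- One line per text node: preceding dotted-node count, then the text -->
        <xsl:for-each select="//inline-formula//text()">
            <xsl:value-of select="concat(count(preceding::text()[contains(., '.')]), ' ', ., '&#10;')"/>
        </xsl:for-each>
    </xsl:template>

</xsl:stylesheet>
```

For the sample input, A and the first 4.651 should be listed with a count of 0; the first "The next text", B and the second 4.651 with a count of 1; and the last "The next text" with a count of 2. A text node without a dot is copied in the pass whose $i equals its count, while a dotted text node is copied both in that pass and in the next one. That is why the three passes (0, 1 and 2) produce four inline-formula elements.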

Split the equation based on the first dot (.)
