Question

I have a template that replaces \n, \r, and real line breaks with <br/>:

<!-- convert \n and \r to <br /> and preserve real line breaks -->
    <xsl:template name="lineBreak">
        <xsl:param name="field" select="."/>
        <!-- current element if no value is specified -->
        <xsl:variable name="br">
            <br/>
        </xsl:variable>
        <xsl:variable name="nl">\\n</xsl:variable>
        <xsl:variable name="cr">\\r</xsl:variable>
        <xsl:variable name='newline'>
            <xsl:text>&#xa;</xsl:text>
        </xsl:variable>
        <xsl:analyze-string select="$field" regex="{$cr}|{$nl}|{$newline}">
            <xsl:matching-substring>
                <xsl:sequence select="$br"/>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <xsl:value-of select="."/>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>

I call the template like this:

<xsl:call-template name="lineBreak">
<xsl:with-param name="field" select="artist"/>
</xsl:call-template>

<xsl:call-template name="lineBreak">
<xsl:with-param name="field" select="artist/@initials"/>
</xsl:call-template>

I am using this input:

<?xml version="1.0" encoding="UTF-8"?>
<catalog>
    <cd>
        <title>Empire Burlesque</title>
        <artist initials="L1:BD \nL2:line2 
        L3:line 3">Bob Dylan \rIs cool
        LINE 3</artist>
        <country>USA</country>
        <company>Columbia</company>
        <price>10.90</price>
        <year>1985</year>
    </cd>
    <cd>
        <title>Hide your heart</title>
        <artist initials="BT">Bonnie Tyler</artist>
        <country>UK</country>
        <company>CBS Records</company>
        <price>9.90</price>
        <year>1988</year>
    </cd>
</catalog>

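As you can see from the output, everything is replaced correctly except the real line break inside the attribute. Every "L3" or "LINE 3" follows a real line break rather than a literal "\n" or "\r".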

HTML output:

<html>
    <body>
        <h2>My CD Collection</h2>
        <table border="1">
            <tr bgcolor="#9acd32">
                <th style="text-align:left">Title</th>
                <th style="text-align:left">Artist</th>
                <th style="text-align:left">Initials</th>
            </tr>
            <tr>
                <td>Empire Burlesque</td>
                <td>Bob Dylan <br>Is cool<br>       LINE 3</td>
                <td>L1:BD <br>L2:line2    L3:line 3</td>
            </tr>
            <tr>
                <td>Hide your heart</td>
                <td>Bonnie Tyler</td>
                <td>BT</td>
            </tr>
        </table>
    </body>
</html>

Answer 1

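This is most likely due to Attribute-Value Normalization. The XML specification says:

Before the value of an attribute is passed to the application or checked for validity, the XML processor must normalize it as follows:

a character reference is processed by appending the referenced character to the attribute value
an entity reference is processed by recursively processing the replacement text of the entity
a whitespace character (#x20, #xD, #xA, #x9) is processed by appending #x20 to the normalized value, except that only a single #x20 is appended for a "#xD#xA" sequence that is part of an external parsed entity or the literal entity value of an internal parsed entity
other characters are processed by appending them to the normalized value

If the declared value is not CDATA, then the XML processor must further process the normalized attribute value by discarding any leading and trailing space (#x20) characters, and by replacing sequences of space (#x20) characters by a single space (#x20) character.

All attributes for which no declaration has been read should be treated by a non-validating parser as if declared CDATA.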

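So the attribute value is "normalized" before your template ever processes it. This means the line break has already been converted to a #x20 space by the time it is matched against your regex. You can verify this by running nothing but an identity template.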
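As a minimal sketch of that check, assuming an XSLT 2.0 processor such as Saxon, a stylesheet containing only the identity template copies the input through without touching it, so whatever shows up in the attribute is what the parser delivered:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Identity template: copy every node and attribute unchanged -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```

Run it against the input from the question and look at the initials attribute of the first artist element. If normalization is the cause, the attribute value should come out on a single line with spaces where the line break used to be, while the text content of artist should keep its line break.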

I am not aware of any workaround.

Answer 2

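If you want a "real" line break in an attribute value, you need to mark it up as a character reference, e.g. att="Line 1.&#10;Line2", so that the attribute value contains a newline character. Not every parser may implement this, but since you tagged the question XSLT 2.0, I assume it is probably Saxon on the Java platform, in which case I think you usually have an underlying XML parser that handles character references in attribute values as the specification requires.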
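Applied to the input from the question, this might look like the following sketch. The artist element is taken from the question; only the literal line break inside the attribute has been swapped for a character reference, and it assumes the parser in use honors that reference as described above:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<catalog>
    <cd>
        <title>Empire Burlesque</title>
        <!-- &#10; is a character reference, so the newline it stands for
             is appended to the attribute value as-is instead of being
             normalized to a space -->
        <artist initials="L1:BD \nL2:line2&#10;L3:line 3">Bob Dylan \rIs cool
        LINE 3</artist>
    </cd>
</catalog>
```

With this input, the attribute value handed to the lineBreak template should contain an actual #xA character, which the {$newline} branch of the regex can then match.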

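However, this approach is somewhat fragile, because any parsing, serialization, or editing step applied to the XML may involve a parser, serializer, or editor that does not preserve the character reference.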

XSL template works on tags but not on attributes
