Question

I have the following XML code:

<p>
    <media id="pc300220-scpwr.gif" print-rights="no"
        rights="licensed" type="photo">
        <title>Louis Pasteur</title>
        <credit>Granger Collection</credit>
    </media>
    <b>Louis</b> 
    <pronunciation>
        <word-term>
            <b>Pasteur</b> 
        </word-term>
    </pronunciation> (1822&ndash;1895) was a French chemist. He made major contributions
    to chemistry, medicine, and industry. His work has greatly benefited people. For
    example, he discovered that diseases spread through
    <definition>
        <word-term>bacteria </word-term>
        <word-definition>tiny living things</word-definition>
    </definition>. This discovery has saved many millions of lives.
</p>

and the following XSLT fragment:

<xsl:template match="p|b|i">
    <xsl:copy>
        <xsl:apply-templates/>
    </xsl:copy>
</xsl:template>

which actually produces output like

(18221895)

but I want

(1822-1895)

So can anyone help explain why &ndash; is not copied to the generated XML?

Answer 1

The following XSLT fragment:

<xsl:template match="p|b|i">
    <xsl:copy>
        <xsl:apply-templates/>
    </xsl:copy>
</xsl:template>

actually produces output like

(18221895)

but I want

(1822-1895)

I cannot reproduce this problem.

With this XML document (corrected to be well-formed):

<!DOCTYPE p [
 <!ENTITY ndash   "&#8211;">
]>
<p>
    <media id="pc300220-scpwr.gif" print-rights="no"
    rights="licensed" type="photo">
        <title>Louis Pasteur</title>
        <credit>Granger Collection</credit>
    </media>
    <b>Louis</b>
    <pronunciation>
        <word-term>
            <b>Pasteur</b>
        </word-term>
    </pronunciation> (1822&ndash;1895) was a French chemist. He made major contributions         to chemistry, medicine, and industry. His work has greatly benefited people. For         example, he discovered that diseases spread through
    <definition>
        <word-term>bacteria </word-term>
        <word-definition>tiny living things</word-definition>
    </definition>. This discovery has saved many millions of lives.
</p>

and when this transformation is applied:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="p|b|i">
        <xsl:copy>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

the result is:

    Louis Pasteur
    Granger Collection

<b>Louis</b>


        <b>Pasteur</b>

 (1822–1895) was a French chemist. He made major contributions         to chemistry, medicine, and industry. His work has greatly benefited people. For         example, he discovered that diseases spread through

    bacteria 
    tiny living things
. This discovery has saved many millions of lives.

Answer 2

Dimitre is right that there is nothing wrong with your XSLT code. However, this is not an XSLT problem; it is a parsing problem.

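Your document contains the entity &ndash;, which is not a predefined XML entity. So if the parser does not know the definition of this entity, it cannot substitute its value. This means your XML is valid only if it has access to a DTD that defines the entity &ndash;. That DTD may be embedded internally in the XML document (as in Dimitre's example), or it may be an external DTD referenced from the XML document. Your code has no DTD definition or reference, but I believe you only copy-pasted a snippet of your code, so the DTD was accidentally left out.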
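To illustrate the point about declarations, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree` as a stand-in parser. That choice is an assumption; the question does not say which parser is in use. The two one-line documents and the `parse_text` helper are hypothetical, shortened from the XML in the question. With no DTD at all the parser rejects the reference, and with the internal subset from Dimitre's example it substitutes the character.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-line documents, shortened from the XML in the question.
WITHOUT_DTD = "<p>(1822&ndash;1895)</p>"
WITH_DTD = '<!DOCTYPE p [<!ENTITY ndash "&#8211;">]><p>(1822&ndash;1895)</p>'


def parse_text(xml_source):
    """Return the text of the root element, or None if parsing fails."""
    try:
        return ET.fromstring(xml_source).text
    except ET.ParseError as err:
        # No DTD at all and an undeclared entity: the parser refuses the input.
        print("parse error:", err)
        return None


undeclared = parse_text(WITHOUT_DTD)  # the reference cannot be resolved
declared = parse_text(WITH_DTD)       # the internal subset supplies U+2013
print(repr(undeclared), repr(declared))
```

Other parsers may react differently to the undeclared case, which is what the rest of this answer is about.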

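So what exactly is causing your problem

Even if the entity definition is available, that still does not mean the parser will necessarily substitute the entity's value.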

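The XML 1.0 Recommendation states: (ref: http://www.w3.org/TR/xml/#wf-entdeclared)

Note that non-validating processors are not obligated to read and process entity declarations occurring in parameter entities or in the external subset; for such documents, the rule that an entity must be declared is a well-formedness constraint only if standalone='yes'.

and: (ref: http://www.w3.org/TR/xml/#include-if-valid)

If the entity is external, and the processor is not attempting to validate the XML document, the processor may, but need not, include the entity's replacement text. If a non-validating processor does not include the replacement text, it must inform the application that it recognized, but did not read, the entity.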

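It is not clear whether your whole document is actually well-formed, but your parser did parse it, and it seems to have dropped the entity reference without including any replacement text. As a result, 1822&ndash;1895 was interpreted as 18221895. An XSLT processor works on the parsed data model, and if that model does not contain the dash character, the XSLT processor cannot copy it to the generated XML.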
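The dropped reference can be reproduced outside XSLT. The sketch below is only an illustration under stated assumptions: it uses Python's expat-based `xml.sax` reader rather than whatever parser the question used, and the document, the `entities.dtd` system identifier, the `Collector` class and the `parse_without_dtd` helper are all hypothetical. The DTD is referenced but never read, so the parser has no replacement text for the entity and reports it as skipped instead of failing.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges

# Hypothetical document: the external DTD is referenced but never loaded.
DOC = b'<!DOCTYPE p SYSTEM "entities.dtd"><p>(1822&ndash;1895)</p>'


class Collector(ContentHandler):
    """Collect character data and the names of skipped entities."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self.skipped = []

    def characters(self, content):
        self.chunks.append(content)

    def skippedEntity(self, name):
        # The parser recognized the reference but has no replacement text.
        self.skipped.append(name)


def parse_without_dtd(data):
    """Parse data with external entity loading switched off."""
    parser = xml.sax.make_parser()
    # Keep the external subset unread, as a non-validating parser may do.
    parser.setFeature(feature_external_ges, False)
    handler = Collector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(data))
    return "".join(handler.chunks), handler.skipped


text, skipped = parse_without_dtd(DOC)
print(text, skipped)
```

An application that ignores the skipped-entity notification only sees the surrounding character data, which matches the symptom in the question.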

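I suggest you make sure the parser has access to a DTD in which all the entities are defined, and possibly also set the parser to validating mode.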

XSLT does not copy the value of an HTML entity
