Question

Good morning,

I have a problem with an XML document that contains a CDATA section. Suppose we have this XML:

<?xml version="1.0" encoding="ISO-8859-1"?>
<character>
   <Body>
      <methodResult>
         <nodeOut>
            <![CDATA[  <film>Indiana Jones and the Kingdom of the Crystal Skull</film>]]>
         </nodeOut>
      </methodResult>
   </Body>
</character>

And we need this:

<film>Indiana Jones and the Kingdom of the Crystal Skull</film>

What should the XSLT look like? I want to extract only the CDATA content from the XML file and discard everything else. I am using XSLT 1.0.

Thanks!

Answer 1

This will produce the XML:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0">

    <!-- ignore these elements -->
    <xsl:template match="role|actor|part"/>

    <!-- get the remaining text and remove white-spaces -->
    <xsl:template match="text()">
        <xsl:value-of select="normalize-space(.)" disable-output-escaping="yes"/>
    </xsl:template>

</xsl:stylesheet>

Output:

<?xml version="1.0" encoding="UTF-8"?><film>Indiana Jones and the Kingdom of the Crystal Skull</film>
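
Since the question asks for XSLT 1.0, here is a minimal sketch of the same idea restricted to the element that holds the CDATA. It assumes the path character/Body/methodResult/nodeOut from the sample input in the question, and it relies on disable-output-escaping, which is an optional feature that a processor may ignore (it only takes effect when the processor serializes the result itself):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" encoding="UTF-8"/>

    <!-- start at the root and visit only the element that wraps the CDATA
         (path assumed from the sample input in the question) -->
    <xsl:template match="/">
        <xsl:apply-templates select="character/Body/methodResult/nodeOut"/>
    </xsl:template>

    <!-- write the element's text without escaping, so the markup is emitted as markup -->
    <xsl:template match="nodeOut">
        <xsl:value-of select="normalize-space(.)" disable-output-escaping="yes"/>
    </xsl:template>

</xsl:stylesheet>
```

Keep in mind that normalize-space() also collapses runs of whitespace inside the embedded markup, so this only fits when that whitespace is not significant.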

Answer 2

You can use a transformation with the output method set to text and simply extract the text node from the name element.

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="text" />

    <xsl:template match="node()|@*">
        <xsl:apply-templates select="node()|@*" />
    </xsl:template>

    <xsl:template match="name/text()">
        <xsl:value-of select="." />
    </xsl:template>

</xsl:stylesheet>

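Note that this will fail if the element contains more than one CDATA section, and if the input contains more than one name element you will need to create some kind of root element. Your CDATA section also has leading whitespace, so I suggest you trim the output. One way to do that in XSLT is the normalize-space() function, but it will also affect the content of the CDATA "XML". This approach also emits no XML prolog, so whether the output is treated as valid XML depends on what you feed it to.

But it is a good starting point.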
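
As a rough sketch of the root-element and trimming suggestions above, the stylesheet below wraps every extracted fragment in a wrapper element and lets the processor write the XML declaration. The wrapper name films is made up for illustration, nodeOut is taken from the sample input in the question rather than from this answer, and disable-output-escaping is assumed to be honored by the processor:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- method="xml" makes the processor write the XML declaration -->
    <xsl:output method="xml" encoding="UTF-8" />

    <xsl:template match="/">
        <!-- "films" is a hypothetical wrapper name; pick whatever fits your data -->
        <films>
            <xsl:for-each select="//nodeOut">
                <!-- normalize-space() trims the fragment but also collapses
                     whitespace inside it -->
                <xsl:value-of select="normalize-space(.)" disable-output-escaping="yes" />
            </xsl:for-each>
        </films>
    </xsl:template>

</xsl:stylesheet>
```

With a single wrapper element the result stays well-formed even when several elements carry CDATA, provided each fragment is itself well-formed.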

Answer 3

A clean solution is possible in XSLT 3.0 (supported by Saxon 9.7 or Exselt):

<xsl:template match="/">
  <xsl:copy-of select="parse-xml-fragment(character/name/text()[last()])"/>
</xsl:template>

See https://www.w3.org/TR/xpath-functions-30/#func-parse-xml-fragment.
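
For context, a complete stylesheet around that template might look like the sketch below. The path character/Body/methodResult/nodeOut is an assumption taken from the sample input in the question (the snippet above uses character/name), and it needs an XSLT 3.0 processor:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" />

    <xsl:template match="/">
        <!-- parse-xml-fragment() turns the escaped markup back into real nodes,
             so no disable-output-escaping is needed -->
        <xsl:copy-of select="parse-xml-fragment(string(character/Body/methodResult/nodeOut))" />
    </xsl:template>

</xsl:stylesheet>
```

Because the parsed result is a real node tree, the whitespace around the film element is kept as text nodes in the output.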

How to get CDATA from an XML node using XSL and convert it into new XML?
