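Why does my XSLT select child nodes of nodes I did not specify?

Asked: 2018-10-11 13:58:04

Tags: xml xslt saxon oxygenxml

I used XSLT a few years ago, and while I am no expert, I can write basic transformations. I have run into a problem I don't understand.

Here I am trying to extract a Dublin Core record from a foxml record. Dublin Core records are expressed in XML, and foxml is basically an XML standard for grouping a number of XML records together.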

Here is my XML:

<?xml version="1.0" encoding="UTF-8"?>
<foxml:digitalObject VERSION="1.1" PID="vital:26113"
xmlns:foxml="info:fedora/fedora-system:def/foxml#"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="info:fedora/fedora-system:def/foxml# http://www.fedora.info/definitions/1/0/foxml1-1.xsd">
<foxml:objectProperties>
    <foxml:property NAME="info:fedora/fedora-system:def/model#state" VALUE="Active"/>
    <foxml:property NAME="info:fedora/fedora-system:def/model#label" VALUE="DCity/DCCPC_DC.xml"/>
    <foxml:property NAME="info:fedora/fedora-system:def/model#ownerId" VALUE=""/>
    <foxml:property NAME="info:fedora/fedora-system:def/model#createdDate"
        VALUE="2016-09-06T19:49:51.257Z"/>
    <foxml:property NAME="info:fedora/fedora-system:def/view#lastModifiedDate"
        VALUE="2016-09-27T13:23:10.950Z"/>
    <foxml:extproperty NAME="http://www.w3.org/1999/02/22-rdf-syntax-ns#type" VALUE="FedoraObject"/>
    <foxml:extproperty NAME="info:fedora/fedora-system:def/model#contentModel" VALUE=""/>
</foxml:objectProperties>
<foxml:datastream ID="DC" STATE="A" CONTROL_GROUP="X" VERSIONABLE="true">
    <foxml:datastreamVersion ID="DC.0" LABEL="Dublin Core for this Record"
        CREATED="2016-09-06T19:49:51.290Z" MIMETYPE="text/xml"
        FORMAT_URI="http://www.openarchives.org/OAI/2.0/oai_dc/" SIZE="653">
        <foxml:xmlContent>
            <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                xmlns:dc="http://purl.org/dc/elements/1.1/"
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
                <dc:title>St. Patricks</dc:title>
                <dc:creator>Mary Mooney</dc:creator>
                <dc:publisher>Publisher</dc:publisher>
                <dc:format>Photograph</dc:format>
                <dc:identifier>123456</dc:identifier>
                <dc:identifier>100.jpg</dc:identifier>
                <dc:coverage>1984</dc:coverage>
                <dc:rights>Publisher</dc:rights>
            </oai_dc:dc>
        </foxml:xmlContent>
    </foxml:datastreamVersion>
    <foxml:datastreamVersion ID="DC.1" LABEL="Dublin Core for this Record"
        CREATED="2016-09-27T13:23:10.894Z" MIMETYPE="text/xml"
        FORMAT_URI="http://www.openarchives.org/OAI/2.0/oai_dc/" SIZE="653">
        <foxml:xmlContent>
            <oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/"
                xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
                <dc:title>St Audoen&apos;s</dc:title>
                <dc:creator>William Mooney</dc:creator>
                <dc:publisher>Publisher</dc:publisher>
                <dc:format>Photograph</dc:format>
                <dc:identifier>10987654</dc:identifier>
                <dc:identifier>200.jpg</dc:identifier>
                <dc:coverage>1984</dc:coverage>
                <dc:rights>Publisher</dc:rights>
            </oai_dc:dc>
        </foxml:xmlContent>
    </foxml:datastreamVersion>
</foxml:datastream>
</foxml:digitalObject>

Here is my XSLT:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:audit="info:fedora/fedora-system:def/audit#" xmlns:premis="http://www.loc.gov/standards/premis/v1"
exclude-result-prefixes="xs"
version="2"
xmlns:foxml="info:fedora/fedora-system:def/foxml#"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="info:fedora/fedora-system:def/foxml# http://www.fedora.info/definitions/1/0/foxml1-1.xsd">
    <xsl:output method="xml" indent="yes" name="xml"/>
   <xsl:template match="/foxml:digitalObject/foxml:datastreamVersion[@ID eq DC.1]/foxml:xmlContent">
       <metadata>
         <xsl:value-of select="oai_dc:dc"/>
               <xsl:copy-of select="."/>
            </metadata>
        </xsl:template>
    </xsl:stylesheet>

I expected this to return the DC portion of the foxml:datastreamVersion whose ID is DC.1. Instead, I get the following:

<?xml version="1.0" encoding="UTF-8"?>

                St. Patricks
                Mary Mooney
                Publisher
                Photograph
                123456
                100.jpg
                1984
                Publisher


                St Audoen's
                William Mooney
                Publisher
                Photograph
                10987654
                200.jpg
                1984
                Publisher

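So I have two obvious questions.

  1. Why is material being selected from nodes that don't match the attribute I specified?

  2. Why is only the text returned, and not the accompanying element tags and so on?

I am using oXygen 19.1 with the Saxon-EE 9.7.0.19 transformer.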

1 answer:

Answer 0 (score: 2)

First, there is a problem with your match expression. It should be this...

 /foxml:digitalObject/foxml:datastream/foxml:datastreamVersion[@ID eq 'DC.1']/foxml:xmlContent

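You missed foxml:datastream in the path. Also, 'DC.1' needs to be in apostrophes so that it is treated as a string rather than as an element name. As written, your pattern can never match, which is also why no <metadata> element appears in your output.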
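To see the effect of the missing apostrophes in isolation, here is a minimal sketch of a test stylesheet (not part of the fix itself) that counts how many foxml:datastreamVersion elements each form of the predicate selects. It assumes the XML from the question as input and an XSLT 2.0 processor such as Saxon. Unquoted, DC.1 is read as a path step looking for a child element named DC.1, so the comparison has nothing to compare against.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:foxml="info:fedora/fedora-system:def/foxml#">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Unquoted: DC.1 is a path step selecting child elements named DC.1.
         No such child exists, so the predicate is never true. -->
    <xsl:text>unquoted: </xsl:text>
    <xsl:value-of select="count(//foxml:datastreamVersion[@ID eq DC.1])"/>

    <!-- Quoted: 'DC.1' is a string literal compared with the ID attribute. -->
    <xsl:text>&#10;quoted: </xsl:text>
    <xsl:value-of select="count(//foxml:datastreamVersion[@ID eq 'DC.1'])"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Run against the document in the question, the unquoted count should be 0 and the quoted count 1.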

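However, the answer to your question "Why is material being selected from nodes that don't match the attribute?" is XSLT's Built-In Templates.

When XSLT starts processing, it looks for a template that matches the document node /. There is no such template in your XSLT, so the built-in templates kick in. In effect, this is equivalent to having these two templates in your XSLT: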

<xsl:template match="*|/">
  <xsl:apply-templates/>
</xsl:template>

<xsl:template match="text()|@*">
  <xsl:value-of select="."/>
</xsl:template>

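These templates skip over elements but output text wherever they find it, which is why all the other text ends up in the output. To stop this, add this template to your XSLT: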

<xsl:template match="node()">
  <xsl:apply-templates />
</xsl:template>

(In XSLT 3.0, use <xsl:mode on-no-match="shallow-skip" /> instead.)
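For the XSLT 3.0 route, a complete stylesheet might look like the sketch below. It assumes a processor with XSLT 3.0 support and reuses the corrected match pattern from above. The exclude-result-prefixes="#all" attribute is an addition made here to keep the stylesheet's own namespace declarations off the <metadata> element; drop it if you want them.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:foxml="info:fedora/fedora-system:def/foxml#"
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    exclude-result-prefixes="#all">

  <!-- Nodes with no matching template are skipped instead of having
       their text output, while processing still continues below them. -->
  <xsl:mode on-no-match="shallow-skip"/>

  <xsl:output method="xml" indent="yes"/>

  <!-- Only the xmlContent of the DC.1 version produces output. -->
  <xsl:template match="/foxml:digitalObject/foxml:datastream/foxml:datastreamVersion[@ID eq 'DC.1']/foxml:xmlContent">
    <metadata>
      <xsl:copy-of select="oai_dc:dc"/>
    </metadata>
  </xsl:template>
</xsl:stylesheet>
```

With this in place, no explicit node() template is needed.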

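As for your second question: when the template does match, you are calling <xsl:value-of select="oai_dc:dc"/>, which outputs only the string value of that element, that is, the text of all its descendant text nodes. You should use xsl:copy-of instead.

Try this XSLT: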

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:audit="info:fedora/fedora-system:def/audit#" xmlns:premis="http://www.loc.gov/standards/premis/v1"
    exclude-result-prefixes="xs dc oai_dc audit premis foxml xsi"
    version="2"
    xmlns:foxml="info:fedora/fedora-system:def/foxml#"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="info:fedora/fedora-system:def/foxml# http://www.fedora.info/definitions/1/0/foxml1-1.xsd">

  <xsl:output method="xml" indent="yes" name="xml"/>

  <xsl:template match="node()">
    <xsl:apply-templates />
  </xsl:template>

  <xsl:template match="/foxml:digitalObject/foxml:datastream/foxml:datastreamVersion[@ID eq 'DC.1']/foxml:xmlContent">
    <metadata>
      <xsl:copy-of select="oai_dc:dc"/>
    </metadata>
  </xsl:template>
</xsl:stylesheet>

Alternatively, just match the document node, and then use select to target the node you want to copy.

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:audit="info:fedora/fedora-system:def/audit#" xmlns:premis="http://www.loc.gov/standards/premis/v1"
    exclude-result-prefixes="xs dc oai_dc audit premis foxml xsi"
    version="2"
    xmlns:foxml="info:fedora/fedora-system:def/foxml#"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="info:fedora/fedora-system:def/foxml# http://www.fedora.info/definitions/1/0/foxml1-1.xsd">

  <xsl:output method="xml" indent="yes" name="xml"/>

  <xsl:template match="/">
    <xsl:apply-templates select="foxml:digitalObject/foxml:datastream/foxml:datastreamVersion[@ID eq 'DC.1']/foxml:xmlContent" />
  </xsl:template>

  <xsl:template match="foxml:xmlContent">
    <metadata>
      <xsl:copy-of select="oai_dc:dc"/>
    </metadata>
  </xsl:template>
</xsl:stylesheet>
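If you don't need a separate template for foxml:xmlContent, the same idea can be collapsed into a single template, because xsl:copy-of accepts the whole path. This is a sketch of a variant under a few assumptions: the namespace declarations are trimmed to the two prefixes actually used, and exclude-result-prefixes="#all" is assumed to be acceptable for your output.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:foxml="info:fedora/fedora-system:def/foxml#"
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    exclude-result-prefixes="#all">

  <xsl:output method="xml" indent="yes"/>

  <!-- Matching the document node means the built-in templates never run,
       so no stray text from other datastream versions is output. -->
  <xsl:template match="/">
    <metadata>
      <xsl:copy-of select="foxml:digitalObject/foxml:datastream/foxml:datastreamVersion[@ID eq 'DC.1']/foxml:xmlContent/oai_dc:dc"/>
    </metadata>
  </xsl:template>
</xsl:stylesheet>
```

Because this template never calls xsl:apply-templates, only the selected oai_dc:dc element is copied into the result.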