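I used XSLT a few years ago, and while I am no expert, I can write basic transformations. I have run into a problem I do not understand.
Here I am trying to extract a Dublin Core record from a FOXML record. Dublin Core records are expressed in XML, and FOXML is essentially an XML standard for grouping many XML records together.
This is my XML: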
<?xml version="1.0" encoding="UTF-8"?>
<foxml:digitalObject VERSION="1.1" PID="vital:26113"
xmlns:foxml="info:fedora/fedora-system:def/foxml#"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="info:fedora/fedora-system:def/foxml# http://www.fedora.info/definitions/1/0/foxml1-1.xsd">
<foxml:objectProperties>
<foxml:property NAME="info:fedora/fedora-system:def/model#state" VALUE="Active"/>
<foxml:property NAME="info:fedora/fedora-system:def/model#label" VALUE="DCity/DCCPC_DC.xml"/>
<foxml:property NAME="info:fedora/fedora-system:def/model#ownerId" VALUE=""/>
<foxml:property NAME="info:fedora/fedora-system:def/model#createdDate"
VALUE="2016-09-06T19:49:51.257Z"/>
<foxml:property NAME="info:fedora/fedora-system:def/view#lastModifiedDate"
VALUE="2016-09-27T13:23:10.950Z"/>
<foxml:extproperty NAME="http://www.w3.org/1999/02/22-rdf-syntax-ns#type" VALUE="FedoraObject"/>
<foxml:extproperty NAME="info:fedora/fedora-system:def/model#contentModel" VALUE=""/>
</foxml:objectProperties>
<foxml:datastream ID="DC" STATE="A" CONTROL_GROUP="X" VERSIONABLE="true">
<foxml:datastreamVersion ID="DC.0" LABEL="Dublin Core for this Record"
CREATED="2016-09-06T19:49:51.290Z" MIMETYPE="text/xml"
FORMAT_URI="http://www.openarchives.org/OAI/2.0/oai_dc/" SIZE="653">
<foxml:xmlContent>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>St. Patricks</dc:title>
<dc:creator>Mary Mooney</dc:creator>
<dc:publisher>Publisher</dc:publisher>
<dc:format>Photograph</dc:format>
<dc:identifier>123456</dc:identifier>
<dc:identifier>100.jpg</dc:identifier>
<dc:coverage>1984</dc:coverage>
<dc:rights>Publisher</dc:rights>
</oai_dc:dc>
</foxml:xmlContent>
</foxml:datastreamVersion>
<foxml:datastreamVersion ID="DC.1" LABEL="Dublin Core for this Record"
CREATED="2016-09-27T13:23:10.894Z" MIMETYPE="text/xml"
FORMAT_URI="http://www.openarchives.org/OAI/2.0/oai_dc/" SIZE="653">
<foxml:xmlContent>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>St Audoen's</dc:title>
<dc:creator>William Mooney</dc:creator>
<dc:publisher>Publisher</dc:publisher>
<dc:format>Photograph</dc:format>
<dc:identifier>10987654</dc:identifier>
<dc:identifier>200.jpg</dc:identifier>
<dc:coverage>1984</dc:coverage>
<dc:rights>Publisher</dc:rights>
</oai_dc:dc>
</foxml:xmlContent>
</foxml:datastreamVersion>
</foxml:datastream>
</foxml:digitalObject>
This is my XSLT:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:audit="info:fedora/fedora-system:def/audit#" xmlns:premis="http://www.loc.gov/standards/premis/v1"
exclude-result-prefixes="xs"
version="2"
xmlns:foxml="info:fedora/fedora-system:def/foxml#"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="info:fedora/fedora-system:def/foxml# http://www.fedora.info/definitions/1/0/foxml1-1.xsd">
<xsl:output method="xml" indent="yes" name="xml"/>
<xsl:template match="/foxml:digitalObject/foxml:datastreamVersion[@ID eq DC.1]/foxml:xmlContent">
<metadata>
<xsl:value-of select="oai_dc:dc"/>
<xsl:copy-of select="."/>
</metadata>
</xsl:template>
</xsl:stylesheet>
I expected to get back the DC portion of the foxml:datastreamVersion with ID = DC.1. Instead, I get the following:
<?xml version="1.0" encoding="UTF-8"?>
St. Patricks
Mary Mooney
Publisher
Photograph
123456
100.jpg
1984
Publisher
St Audoen's
William Mooney
Publisher
Photograph
10987654
200.jpg
1984
Publisher
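So I have two obvious questions.
Why is material being selected from nodes that do not match the attribute value I specified?
Why is only text returned, and not the accompanying element tags and so on?
I am using oXygen 19.1 with the Saxon-EE 9.7.0.19 transformer.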
Answer 0 (score: 2)
First, there is a problem with your match expression. It should be this:
/foxml:digitalObject/foxml:datastream/foxml:datastreamVersion[@ID eq 'DC.1']/foxml:xmlContent
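You missed foxml:datastream in the path, so the pattern never matches anything. Also, 'DC.1' needs to be enclosed in apostrophes to make it a string literal; without them it is read as an element name.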
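To make the quoting issue concrete, here is a minimal sketch that counts how many datastream versions each form of the predicate selects. The result, unquoted and quoted element names are hypothetical, chosen only for this illustration. It assumes an XSLT 2.0 processor and the sample FOXML document from the question as input.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:foxml="info:fedora/fedora-system:def/foxml#"
    exclude-result-prefixes="foxml">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <!-- "result", "unquoted" and "quoted" are illustrative names only -->
    <result>
      <!-- Unquoted: DC.1 is a name test for a child element called DC.1.
           No such child exists, so @ID is compared with an empty sequence
           and the predicate is never true. -->
      <unquoted>
        <xsl:value-of select="count(//foxml:datastreamVersion[@ID eq DC.1])"/>
      </unquoted>
      <!-- Quoted: 'DC.1' is a string literal compared with the attribute value. -->
      <quoted>
        <xsl:value-of select="count(//foxml:datastreamVersion[@ID eq 'DC.1'])"/>
      </quoted>
    </result>
  </xsl:template>
</xsl:stylesheet>
```

Against the sample input, the unquoted count should be 0 and the quoted count 1, which is why the quoting matters even once the path itself is corrected.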
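However, the answer to your question "why is material being selected from nodes that do not match the attribute" is: because of XSLT's built-in templates.
When XSLT starts processing, it looks for a template that matches the document node /. There is no such template in your XSLT, so the built-in templates take effect. In practice this is equivalent to having these two templates in your XSLT: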
<xsl:template match="*|/">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="text()|@*">
<xsl:value-of select="."/>
</xsl:template>
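These step through elements without copying them but output text wherever they find it, which is why all the other text is output. To stop this, add this template to your XSLT: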
<xsl:template match="node()">
<xsl:apply-templates />
</xsl:template>
(In XSLT 3.0, use <xsl:mode on-no-match="shallow-skip" /> instead.)
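As a rough sketch of the XSLT 3.0 variant, the stylesheet below replaces the catch-all node() template with an xsl:mode declaration. It assumes your processor supports XSLT 3.0 and that the input is the FOXML document from the question; it is one possible shape, not the only way to write it.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:foxml="info:fedora/fedora-system:def/foxml#"
    exclude-result-prefixes="oai_dc foxml">

  <xsl:output method="xml" indent="yes"/>

  <!-- Nodes with no matching template produce no output themselves,
       but their children are still processed. -->
  <xsl:mode on-no-match="shallow-skip"/>

  <!-- Only the content of the DC.1 datastream version is copied. -->
  <xsl:template match="foxml:datastreamVersion[@ID eq 'DC.1']/foxml:xmlContent">
    <metadata>
      <xsl:copy-of select="oai_dc:dc"/>
    </metadata>
  </xsl:template>
</xsl:stylesheet>
```

With shallow-skip the stray text from the DC.0 version is dropped without any extra template, so the stylesheet only has to describe what it wants to keep.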
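As for the second question: in your current output the text comes from the built-in templates, because your own template never matches. Even once it does match, <xsl:value-of select="oai_dc:dc"/> outputs only the string value of the element, that is, all of its descendant text nodes. You should use xsl:copy-of instead.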
Try this XSLT:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:audit="info:fedora/fedora-system:def/audit#" xmlns:premis="http://www.loc.gov/standards/premis/v1"
exclude-result-prefixes="xs dc oai_dc audit premis foxml xsi"
version="2.0"
xmlns:foxml="info:fedora/fedora-system:def/foxml#"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="info:fedora/fedora-system:def/foxml# http://www.fedora.info/definitions/1/0/foxml1-1.xsd">
<!-- Unnamed output definition, so it applies to the principal result -->
<xsl:output method="xml" indent="yes"/>
<xsl:template match="node()">
<xsl:apply-templates />
</xsl:template>
<xsl:template match="/foxml:digitalObject/foxml:datastream/foxml:datastreamVersion[@ID eq 'DC.1']/foxml:xmlContent">
<metadata>
<xsl:copy-of select="oai_dc:dc"/>
</metadata>
</xsl:template>
</xsl:stylesheet>
Alternatively, just match the document node and use select to target the node you want to copy:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:audit="info:fedora/fedora-system:def/audit#" xmlns:premis="http://www.loc.gov/standards/premis/v1"
exclude-result-prefixes="xs dc oai_dc audit premis foxml xsi"
version="2.0"
xmlns:foxml="info:fedora/fedora-system:def/foxml#"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="info:fedora/fedora-system:def/foxml# http://www.fedora.info/definitions/1/0/foxml1-1.xsd">
<!-- Unnamed output definition, so it applies to the principal result -->
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<xsl:apply-templates select="foxml:digitalObject/foxml:datastream/foxml:datastreamVersion[@ID eq 'DC.1']/foxml:xmlContent" />
</xsl:template>
<xsl:template match="foxml:xmlContent">
<metadata>
<xsl:copy-of select="oai_dc:dc"/>
</metadata>
</xsl:template>
</xsl:stylesheet>