Question

Using this online XPath tester, I am trying to extract all <subject> elements from the following XML document.

<recordData>
<srw_dc:dc xmlns:srw_dc="info:srw/schema/1/dc-schema" xmlns="http://purl.org/dc/elements/1.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="info:srw/schema/1/dc-schema http://www.loc.gov/standards/sru/resources/dc-schema.xsd">
<title>Classical philosophy : a contemporary introduction</title>
<creator>Shields, Christopher John (NO-TrBIB)x99040447</creator>
<type>text</type>
<publisher>London Routledge</publisher>
<date>2003</date>
<language>eng</language>
<subject>Gresk oldtid</subject>
<subject>Filosofi</subject>
<subject>Antikken</subject>
<subject>filosofi</subject>
<subject>antikken</subject>
<relation/>
<identifier>http://content.bibsys.no/content/?type=descr_publ_brief&amp;isbn=0415233976</identifier>
<identifier>http://content.bibsys.no/content/?type=descr_publ_full&amp;isbn=0415233976</identifier>
<identifier>http://content.bibsys.no/content/?type=toc&amp;isbn=0415233976</identifier>
<identifier>URN:ISBN:0415233976</identifier>
<identifier>URN:ISBN:0415233984</identifier>
</srw_dc:dc>
</recordData>

I can extract all of the inner elements, and all of their text content, with the following two XPath expressions: */*/* and */*/*/text().

But I cannot select only the elements of a specific type: */*/subject and */*/subject/text() both return nothing.

Why?

Answer 1

This is because the XML document declares a default namespace:

xmlns="http://purl.org/dc/elements/1.1/"

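An XPath expression like */*/subject looks for elements named "subject" that are in no namespace, whereas the <subject> elements in this document are in the default namespace declared above. Here are two ways to solve the problem: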

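Change the XPath expression to /*/*/*[local-name() = 'subject'] or, if the nesting level does not actually matter, even //*[local-name() = 'subject'].
If you are only using the online XPath tester for testing (which is the only way you should use such tools), look for an XPath library in your programming language that can handle namespaces. Then register the namespace URI together with a prefix, and replace subject with prefix:subject. This is the correct way to solve the problem.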
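The second approach can be sketched in Python as follows. This is only an illustration, assuming the standard-library xml.etree.ElementTree module, which supports a limited XPath subset (it has no local-name() function, so only the prefix approach is shown). The prefix "dc" and the variable names are arbitrary choices for this sketch, and the XML is a trimmed copy of the record from the question.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the record from the question; the namespace
# declarations are the same as in the original document.
XML = """<recordData>
<srw_dc:dc xmlns:srw_dc="info:srw/schema/1/dc-schema" xmlns="http://purl.org/dc/elements/1.1/">
<title>Classical philosophy : a contemporary introduction</title>
<subject>Gresk oldtid</subject>
<subject>Filosofi</subject>
<subject>Antikken</subject>
<subject>filosofi</subject>
<subject>antikken</subject>
</srw_dc:dc>
</recordData>"""

# fromstring() returns the <recordData> element, so the paths below
# are relative to it (one level shorter than */*/subject).
root = ET.fromstring(XML)

# The prefix "dc" is an assumption of this sketch; only the URI has to
# match the default namespace declared in the document.
NS = {"dc": "http://purl.org/dc/elements/1.1/"}

# Unprefixed name: matches "subject" in no namespace, so nothing is found.
unqualified = root.findall("./*/subject")

# Prefixed name: "dc:subject" is resolved through NS to the right namespace.
qualified = root.findall("./*/dc:subject", NS)
subjects = [el.text for el in qualified]

print(len(unqualified))  # 0
print(subjects)          # ['Gresk oldtid', 'Filosofi', 'Antikken', 'filosofi', 'antikken']
```

The prefix used in the expression does not have to match anything in the document. Only the namespace URI it is bound to matters, which is why the unprefixed <subject> elements can be selected as dc:subject.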

Unable to select elements of a specific type
