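As part of a Java 6 application, I want to find all namespace declarations in an XML document, including any duplicates.
EDIT: At Martin's request, here is the Java code I am using: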
XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xPath = xPathFactory.newXPath();
XPathExpression xPathExpression = xPath.compile("//namespace::*");
NodeList nodeList = (NodeList) xPathExpression.evaluate(xmlDomDocument, XPathConstants.NODESET);
Suppose I have this XML document:
<?xml version="1.0" encoding="UTF-8"?>
<root xmlns:ele="element.com" xmlns:att="attribute.com" xmlns:txt="textnode.com">
<ele:one>a</ele:one>
<two att:c="d">e</two>
<three>txt:f</three>
</root>
To find all namespace declarations, I applied this XPath statement to the XML document using XPath 1.0:
//namespace::*
It finds 4 namespace declarations, which is what I expect (and desire):
/root[1]/@xmlns:att - attribute.com
/root[1]/@xmlns:ele - element.com
/root[1]/@xmlns:txt - textnode.com
/root[1]/@xmlns:xml - http://www.w3.org/XML/1998/namespace
However, if I switch to XPath 2.0, I get 16 namespace declarations (each of the previous declarations 4 times), which is not what I expect (or desire):
/root[1]/@xmlns:xml - http://www.w3.org/XML/1998/namespace
/root[1]/@xmlns:att - attribute.com
/root[1]/@xmlns:ele - element.com
/root[1]/@xmlns:txt - textnode.com
/root[1]/@xmlns:xml - http://www.w3.org/XML/1998/namespace
/root[1]/@xmlns:att - attribute.com
/root[1]/@xmlns:ele - element.com
/root[1]/@xmlns:txt - textnode.com
/root[1]/@xmlns:xml - http://www.w3.org/XML/1998/namespace
/root[1]/@xmlns:att - attribute.com
/root[1]/@xmlns:ele - element.com
/root[1]/@xmlns:txt - textnode.com
/root[1]/@xmlns:xml - http://www.w3.org/XML/1998/namespace
/root[1]/@xmlns:att - attribute.com
/root[1]/@xmlns:ele - element.com
/root[1]/@xmlns:txt - textnode.com
I see the same difference even when I use the unabbreviated version of the XPath statement:
/descendant-or-self::node()/namespace::*
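This can be seen across a variety of XML parsers (LIBXML, MSXML.NET, Saxon) as tested in oXygen. (EDIT: As I mention later in the comments, this statement is not true. Although I thought I was testing a variety of XML parsers, I really wasn't.)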
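Question #1: Why is there a difference between XPath 1.0 and XPath 2.0?
Question #2: Is it possible/reasonable to get the desired results using XPath 2.0?
Hint: Using the distinct-values() function in XPath 2.0 will not return the desired results, because I want all namespace declarations, even if the same namespace is declared twice. For example, consider the following XML document: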
<?xml version="1.0" encoding="UTF-8"?>
<root>
<bar:one xmlns:bar="http://www.bar.com">alpha</bar:one>
<bar:two xmlns:bar="http://www.bar.com">bravo</bar:two>
</root>
The desired result is:
/root[1]/@xmlns:xml - http://www.w3.org/XML/1998/namespace
/root[1]/bar:one[1]/@xmlns:bar - http://www.bar.com
/root[1]/bar:two[1]/@xmlns:bar - http://www.bar.com
Answer 0 (score: 7)
I think this will get all the namespaces, without any duplicates:
for $i in 1 to count(//namespace::*) return
if (empty(index-of((//namespace::*)[position() = (1 to ($i - 1))][name() = name((//namespace::*)[$i])], (//namespace::*)[$i])))
then (//namespace::*)[$i]
else ()
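Answer 1 (score: 4)
> To find all namespace declarations, I applied this XPath statement to the XML document using XPath 1.0:
> //namespace::*
> It finds 4 namespace declarations, which is what I expect (and desire):
> /root[1]/@xmlns:att - attribute.com
> /root[1]/@xmlns:ele - element.com
> /root[1]/@xmlns:txt - textnode.com
> /root[1]/@xmlns:xml - http://www.w3.org/XML/1998/namespace
You are using a non-compliant (buggy) XPath 1.0 implementation.
I get a different, correct result with every XSLT 1.0 processor I have. This transformation (which just evaluates the XPath expression and prints one line for each selected namespace node):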
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:template match="/">
    <xsl:for-each select="//namespace::*">
      <xsl:value-of select="concat(name(), ': ', ., '&#xA;')"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
when applied to the provided XML document:
<root xmlns:ele="element.com" xmlns:att="attribute.com" xmlns:txt="textnode.com">
<ele:one>a</ele:one>
<two att:c="d">e</two>
<three>txt:f</three>
</root>
produces the correct result:
xml: http://www.w3.org/XML/1998/namespace
ele: element.com
att: attribute.com
txt: textnode.com
xml: http://www.w3.org/XML/1998/namespace
ele: element.com
att: attribute.com
txt: textnode.com
xml: http://www.w3.org/XML/1998/namespace
ele: element.com
att: attribute.com
txt: textnode.com
xml: http://www.w3.org/XML/1998/namespace
ele: element.com
att: attribute.com
txt: textnode.com
with all of these XSLT 1.0 and XSLT 2.0 processors:
MSXML3, MSXML4, MSXML6, .NET XslCompiledTransform, .NET XslTransform, Altova (XML SPY), Saxon 6.5.4, Saxon 9.1.07, XQSharp.
Here is a short C# program that confirms that the number of nodes selected in .NET is 16:
namespace TestNamespaces
{
    using System;
    using System.IO;
    using System.Xml.XPath;

    class Test
    {
        static void Main(string[] args)
        {
            string xml =
@"<root xmlns:ele='element.com' xmlns:att='attribute.com' xmlns:txt='textnode.com'>
  <ele:one>a</ele:one>
  <two att:c='d'>e</two>
  <three>txt:f</three>
</root>";

            XPathDocument doc = new XPathDocument(new StringReader(xml));

            double count =
                (double) doc.CreateNavigator().Evaluate("count(//namespace::*)");

            Console.WriteLine(count);
        }
    }
}
The result is:
16
UPDATE:
Here is an XPath 2.0 expression that finds only the "distinct" namespace nodes and produces one line with a name-value pair for each of them:
for $i in distinct-values(
for $ns in //namespace::*
return
index-of(
(for $x in //namespace::*
return
concat(name($x), ' ', string($x))
),
concat(name($ns), ' ', string($ns))
)
[1]
)
return
for $x in (//namespace::*)[$i]
return
concat(name($x), ' :', string($x), '
')
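The core of the expression above is "keep an item only if its index equals the first index of its name-value key". As a rough illustration of that idea outside XPath, here is a minimal Java sketch. The class name DistinctNamespacePairs and the method firstOccurrences are made up for this example, and the input is simply a list of "name value" strings like the ones the inner concat() builds; it is not tied to any XPath API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DistinctNamespacePairs {

    /**
     * Keeps an entry only when its position is the first position at which
     * that "name value" key occurs, mirroring index-of(...)[1] in the XPath.
     */
    static List<String> firstOccurrences(List<String> pairs) {
        List<String> result = new ArrayList<String>();
        for (int i = 0; i < pairs.size(); i++) {
            if (pairs.indexOf(pairs.get(i)) == i) {
                result.add(pairs.get(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The four name-value pairs of the sample document, repeated once
        // per element (4 elements), as //namespace::* selects them.
        List<String> perElement = Arrays.asList(
            "xml http://www.w3.org/XML/1998/namespace",
            "ele element.com",
            "att attribute.com",
            "txt textnode.com");
        List<String> all = new ArrayList<String>();
        for (int element = 0; element < 4; element++) {
            all.addAll(perElement);
        }
        for (String pair : firstOccurrences(all)) {
            System.out.println(pair);
        }
    }
}
```

Like the XPath expression, this collapses identical pairs, so it does not report the same namespace declared on two sibling elements twice.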
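Answer 2 (score: 3)
As the previous answers indicate, //namespace::* returns all the namespace nodes, of which there are 16, in both XPath 1.0 and XPath 2.0. It doesn't surprise me that you have found an implementation that does not implement the spec correctly.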
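The figure of 16 follows from the XPath data model, in which every element carries one namespace node for each namespace in scope on it, including the implicit xml prefix. The sample document has 4 elements with 4 in-scope namespaces each. The following Java sketch reproduces that count by walking a namespace-aware DOM rather than by calling an XPath engine; the class and method names (InScopeNamespaces, inScope, countNamespaceNodes) are invented for this illustration.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class InScopeNamespaces {

    // The sample document from the question.
    static final String SAMPLE =
        "<root xmlns:ele=\"element.com\" xmlns:att=\"attribute.com\" xmlns:txt=\"textnode.com\">"
        + "<ele:one>a</ele:one><two att:c=\"d\">e</two><three>txt:f</three></root>";

    /** Parses an XML string into a namespace-aware DOM document. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Prefix-to-URI bindings in scope on an element; the nearest declaration wins. */
    static Map<String, String> inScope(Element element) {
        Map<String, String> bindings = new LinkedHashMap<String, String>();
        // The xml prefix is bound on every element without being declared.
        bindings.put("xml", XMLConstants.XML_NS_URI);
        for (Node n = element; n instanceof Element; n = n.getParentNode()) {
            NamedNodeMap attributes = n.getAttributes();
            for (int i = 0; i < attributes.getLength(); i++) {
                Attr attr = (Attr) attributes.item(i);
                if (!XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(attr.getNamespaceURI())) {
                    continue; // an ordinary attribute, not a declaration
                }
                String prefix = "xmlns".equals(attr.getName()) ? "" : attr.getLocalName();
                if (!bindings.containsKey(prefix)) {
                    bindings.put(prefix, attr.getValue());
                }
            }
        }
        // xmlns="" undeclares the default namespace, so it yields no namespace node.
        bindings.values().removeAll(Collections.singleton(""));
        return bindings;
    }

    /** Sums the in-scope namespaces over all elements of the document. */
    static int countNamespaceNodes(Document doc) {
        NodeList elements = doc.getElementsByTagName("*");
        int total = 0;
        for (int i = 0; i < elements.getLength(); i++) {
            total += inScope((Element) elements.item(i)).size();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(countNamespaceNodes(parse(SAMPLE)));
    }
}
```

For the sample document this prints 16, matching count(//namespace::*) in the conforming processors listed above.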
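Finding all the namespace declarations (as distinct from namespace nodes) is not in general possible with either XPath 1.0 or XPath 2.0, because the following two documents are considered equivalent at the data model level:
Document A: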
<a xmlns="one">
<b/>
</a>
Document B:
<a xmlns="one">
<b xmlns="one"/>
</a>
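But if we define a "significant namespace declaration" to be a namespace that is present on a child element but not on its parent, then you could try this XPath 2.0 expression: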
for $e in //* return
  for $n in $e/namespace::* return
    if (not(some $p in $e/../namespace::*
            satisfies ($p/name() eq $n/name() and string($p) eq string($n))))
    then concat($e/name(), '->', $n/name(), '=', string($n))
    else ()
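Since the question is about a Java application, another option is to skip XPath and read the declarations directly from a namespace-aware DOM, where each xmlns or xmlns:prefix declaration survives as an attribute in the http://www.w3.org/2000/xmlns/ namespace. This is a different technique from the expression above, not a translation of it, and the class and method names below (NamespaceDeclarations, declarations) are invented for this sketch. It reports literal declarations, so Document A yields one entry and Document B yields two. The implicit xml binding is never declared in the document, so it does not appear in the output.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NamespaceDeclarations {

    /** Parses an XML string into a namespace-aware DOM document. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Returns one "path - uri" line per declaration, elements in document order. */
    static List<String> declarations(Document doc) {
        List<String> result = new ArrayList<String>();
        collect(doc.getDocumentElement(), "", result);
        return result;
    }

    private static void collect(Element element, String parentPath, List<String> result) {
        String path = parentPath + "/" + element.getNodeName() + "[" + position(element) + "]";
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Attr attr = (Attr) attributes.item(i);
            // Declarations are the attributes that live in the xmlns namespace.
            if (XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(attr.getNamespaceURI())) {
                result.add(path + "/@" + attr.getName() + " - " + attr.getValue());
            }
        }
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element) {
                collect((Element) child, path, result);
            }
        }
    }

    /** 1-based position among siblings that share the element's qualified name. */
    private static int position(Element element) {
        int position = 1;
        for (Node n = element.getPreviousSibling(); n != null; n = n.getPreviousSibling()) {
            if (n instanceof Element && n.getNodeName().equals(element.getNodeName())) {
                position++;
            }
        }
        return position;
    }

    public static void main(String[] args) {
        String xml = "<root>"
            + "<bar:one xmlns:bar=\"http://www.bar.com\">alpha</bar:one>"
            + "<bar:two xmlns:bar=\"http://www.bar.com\">bravo</bar:two>"
            + "</root>";
        for (String line : declarations(parse(xml))) {
            System.out.println(line);
        }
    }
}
```

The order of several declarations on the same element is whatever the DOM's attribute map returns, which is not guaranteed to match the source order.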
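Answer 3 (score: 0)
Here are the results of the XPath 1.0 implementations over .NET's XPathDocument (XSLT/XPath 1.0 data model), XmlDocument (DOM data model), and MSXML 6's DOM. The test code, run against your sample XML document, is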
Console.WriteLine("XPathDocument:");
XPathDocument xpathDoc = new XPathDocument("../../XMLFile4.xml");
foreach (XPathNavigator nav in xpathDoc.CreateNavigator().Select("//namespace::*"))
{
    Console.WriteLine("Node type: {0}; name: {1}; value: {2}.", nav.NodeType, nav.Name, nav.Value);
}
Console.WriteLine();

Console.WriteLine("DOM XmlDocument:");
XmlDocument doc = new XmlDocument();
doc.Load("../../XMLFile4.xml");
foreach (XmlNode node in doc.SelectNodes("//namespace::*"))
{
    Console.WriteLine("Node type: {0}; name: {1}; value: {2}.", node.NodeType, node.Name, node.Value);
}
Console.WriteLine();

Console.WriteLine("MSXML 6 DOM:");
dynamic msxmlDoc = Activator.CreateInstance(Type.GetTypeFromProgID("Msxml2.DOMDocument.6.0"));
msxmlDoc.load("../../XMLFile4.xml");
foreach (dynamic node in msxmlDoc.selectNodes("//namespace::*"))
{
    Console.WriteLine("Node type: {0}; name: {1}; value: {2}.", node.nodeType, node.name, node.nodeValue);
}
and its output is
XPathDocument:
Node type: Namespace; name: txt; value: textnode.com.
Node type: Namespace; name: att; value: attribute.com.
Node type: Namespace; name: ele; value: element.com.
Node type: Namespace; name: xml; value: http://www.w3.org/XML/1998/namespace.
Node type: Namespace; name: txt; value: textnode.com.
Node type: Namespace; name: att; value: attribute.com.
Node type: Namespace; name: ele; value: element.com.
Node type: Namespace; name: xml; value: http://www.w3.org/XML/1998/namespace.
Node type: Namespace; name: txt; value: textnode.com.
Node type: Namespace; name: att; value: attribute.com.
Node type: Namespace; name: ele; value: element.com.
Node type: Namespace; name: xml; value: http://www.w3.org/XML/1998/namespace.
Node type: Namespace; name: txt; value: textnode.com.
Node type: Namespace; name: att; value: attribute.com.
Node type: Namespace; name: ele; value: element.com.
Node type: Namespace; name: xml; value: http://www.w3.org/XML/1998/namespace.
DOM XmlDocument:
Node type: Attribute; name: xmlns:txt; value: textnode.com.
Node type: Attribute; name: xmlns:att; value: attribute.com.
Node type: Attribute; name: xmlns:ele; value: element.com.
Node type: Attribute; name: xmlns:xml; value: http://www.w3.org/XML/1998/namespace.
Node type: Attribute; name: xmlns:txt; value: textnode.com.
Node type: Attribute; name: xmlns:att; value: attribute.com.
Node type: Attribute; name: xmlns:ele; value: element.com.
Node type: Attribute; name: xmlns:xml; value: http://www.w3.org/XML/1998/namespace.
Node type: Attribute; name: xmlns:txt; value: textnode.com.
Node type: Attribute; name: xmlns:att; value: attribute.com.
Node type: Attribute; name: xmlns:ele; value: element.com.
Node type: Attribute; name: xmlns:xml; value: http://www.w3.org/XML/1998/namespace.
Node type: Attribute; name: xmlns:txt; value: textnode.com.
Node type: Attribute; name: xmlns:att; value: attribute.com.
Node type: Attribute; name: xmlns:ele; value: element.com.
Node type: Attribute; name: xmlns:xml; value: http://www.w3.org/XML/1998/namespace.
MSXML 6 DOM:
Node type: 2; name: xmlns:xml; value: http://www.w3.org/XML/1998/namespace.
Node type: 2; name: xmlns:ele; value: element.com.
Node type: 2; name: xmlns:att; value: attribute.com.
Node type: 2; name: xmlns:txt; value: textnode.com.
Node type: 2; name: xmlns:xml; value: http://www.w3.org/XML/1998/namespace.
Node type: 2; name: xmlns:ele; value: element.com.
Node type: 2; name: xmlns:att; value: attribute.com.
Node type: 2; name: xmlns:txt; value: textnode.com.
Node type: 2; name: xmlns:xml; value: http://www.w3.org/XML/1998/namespace.
Node type: 2; name: xmlns:ele; value: element.com.
Node type: 2; name: xmlns:att; value: attribute.com.
Node type: 2; name: xmlns:txt; value: textnode.com.
Node type: 2; name: xmlns:xml; value: http://www.w3.org/XML/1998/namespace.
Node type: 2; name: xmlns:ele; value: element.com.
Node type: 2; name: xmlns:att; value: attribute.com.
Node type: 2; name: xmlns:txt; value: textnode.com.
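So it is certainly not an XPath 1.0 versus XPath 2.0 problem. I think the problem you are seeing is a shortcoming of mapping the XPath data model, which has namespace nodes, onto the DOM model, which has attribute nodes. Someone more familiar with the Java XPath API will need to tell you whether the behavior you see is legitimately implementation dependent, because the API specification is not precise enough about mapping the XPath namespace axis onto the DOM model, or whether it is a bug.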
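To run the same kind of probe against the Java XPath API, a sketch along the following lines may help. It uses only the standard javax.xml.xpath and DOM calls already shown in the question; the class name NamespaceAxisProbe and the method describe are invented here. What it prints (how many nodes, and with which node type and name) depends on the XPath implementation bundled with your JDK, so no expected output is given.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespaceAxisProbe {

    // The sample document from the question.
    static final String SAMPLE =
        "<root xmlns:ele=\"element.com\" xmlns:att=\"attribute.com\" xmlns:txt=\"textnode.com\">"
        + "<ele:one>a</ele:one><two att:c=\"d\">e</two><three>txt:f</three></root>";

    /** Evaluates //namespace::* over a DOM and describes each selected node. */
    static List<String> describe(String xml) {
        List<String> lines = new ArrayList<String>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xPath.evaluate(
                "//namespace::*", doc, XPathConstants.NODESET);

            for (int i = 0; i < nodes.getLength(); i++) {
                Node node = nodes.item(i);
                lines.add("Node type: " + node.getNodeType()
                    + "; name: " + node.getNodeName()
                    + "; value: " + node.getNodeValue() + ".");
            }
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> lines = describe(SAMPLE);
        System.out.println("Selected nodes: " + lines.size());
        for (String line : lines) {
            System.out.println(line);
        }
    }
}
```

Comparing the reported count with the 16 nodes that the conforming processors return shows whether a given Java implementation follows the XPath data model or collapses the result onto DOM attribute nodes.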