Extract a node together with all of its content from namespaced XML

Date: 2015-01-07 11:39:26

Tags: java xml xpath

Given the following namespaced XML file:

<ptk:PrintTalk xmlns:ptk="http://linkToNameSpace" xmlns:xjdf="http://linkToNamespace">
 <ptk:Request>
  <ptk:PurchaseOrder Currency="EUR">
   <xjdf:XJDF name="someName" version="2.0">
     <xjdf:ProductList>
      <xjdf:Product>
       ...
      </xjdf:Product>
      <xjdf:OtherProduct>
       ...
      </xjdf:OtherProduct> 
      and many other products
     </xjdf:ProductList>
     <xjdf:ParameterSet>
      <xjdf:Parameter>
       ...
      </xjdf:Parameter> and so on until
   </xjdf:XJDF>
  </ptk:PurchaseOrder>
 </ptk:Request>
</ptk:PrintTalk>

How can I use XPath to extract the following:

<xjdf:XJDF name="someName" version="2.0">
 <xjdf:ProductList>
  <xjdf:Product>
   ...
  </xjdf:Product>
  <xjdf:OtherProduct>
   ...
  </xjdf:OtherProduct> 
   and many other products
  </xjdf:ProductList>
   <xjdf:ParameterSet>
    <xjdf:Parameter>
     ...
    </xjdf:Parameter> and so on until
</xjdf:XJDF>

I have tried expressions such as:

/ptk:PrintTalk/ptk:Request/ptk:PurchaseOrder/* 

//xjdf:XJDF

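But these expressions do not give me the result I want. I am using the XPath expression evaluator built into IntelliJ IDEA, and the programming language is Java. No XPath libraries are involved, just javax.xml.*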

Update

Using

//ptk:PurchaseOrder//*

I get every node as a separate node, without any of its child nodes. For example,

<xjdf:ProductList>
 <xjdf:Product>
  ...
 </xjdf:Product>
</xjdf:ProductList> (here the product tag is a child of product list tag)

results in

<xjdf:ProductList>
<xjdf:Product>

The Java code I use to perform the operation:

@Override
public XJDF readFrom(
    final Class<XJDF> type, final Type genericType, final Annotation[] annotations, final MediaType mediaType,
    final MultivaluedMap<String, String> multivaluedMap, final InputStream inputStream
) throws IOException {
    try {
        DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
        Document documentPtk = documentBuilder.parse(new InputSource(inputStream));
        XPathFactory xPathFactory = XPathFactory.newInstance();
        XPath xPath = xPathFactory.newXPath();
        XPathExpression xPathExpression = xPath.compile("//ptk:PurchaseOrder//*");
        Document documentXjdf = (Document) xPathExpression.evaluate(documentPtk, XPathConstants.NODE);
    } catch (Exception e) {
        throw new WebApplicationException("PrintTalk document could not be deserialized.", e);
    }
}

1 Answer:

Answer 0: (score: 2)

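There are three main points here:

  • DocumentBuilderFactory is not namespace-aware by default; you have to switch namespace awareness on explicitly before creating the DocumentBuilder
  • XPath does not use the namespace prefix mappings declared in the XML document; it uses its own NamespaceContext instead
  • The Node returned by this query is not a Document but an Element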
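
The first point can be checked with a few lines of plain JDK code. This is only a minimal sketch: the class name NamespaceAwarenessDemo and the shortened inline document are made up for illustration, and the URI is the placeholder from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original question or answer.
final class NamespaceAwarenessDemo {

    // Shortened stand-in for the PrintTalk document from the question.
    static final String XML =
        "<ptk:PrintTalk xmlns:ptk=\"http://linkToNameSpace\"><ptk:Request/></ptk:PrintTalk>";

    // Parses XML and returns the namespace URI the parser recorded for the root element.
    static String rootNamespace(boolean namespaceAware) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(namespaceAware);
        Element root = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)))
            .getDocumentElement();
        return root.getNamespaceURI();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("default factory is namespace-aware: "
            + DocumentBuilderFactory.newInstance().isNamespaceAware());
        System.out.println("root namespace URI, not aware: " + rootNamespace(false));
        System.out.println("root namespace URI, aware: " + rootNamespace(true));
    }
}
```

Without namespace awareness the parser records no namespace URI for the elements, so a prefixed XPath step such as p:PrintTalk has nothing to match against.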

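Annoyingly, the Java core class library ships no default implementation of NamespaceContext, so you have to either use a third-party one (I usually use SimpleNamespaceContext from Spring) or write your own implementation of the interface.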
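
For the "write your own" route, a map-backed implementation might look roughly like the sketch below. The names MapNamespaceContext and OwnNamespaceContextDemo are hypothetical, the inline document is a shortened version of the one in the question, and the prefix "p" is chosen freely, because only the URI has to match the document.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper: a minimal NamespaceContext backed by a prefix-to-URI map.
final class MapNamespaceContext implements NamespaceContext {
    private final Map<String, String> prefixToUri;

    MapNamespaceContext(Map<String, String> prefixToUri) {
        this.prefixToUri = prefixToUri;
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("prefix must not be null");
        }
        String uri = prefixToUri.get(prefix);
        // Unbound prefixes map to the empty namespace URI, as the interface contract asks.
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String namespaceURI) {
        for (Map.Entry<String, String> entry : prefixToUri.entrySet()) {
            if (entry.getValue().equals(namespaceURI)) {
                return entry.getKey();
            }
        }
        return null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        String prefix = getPrefix(namespaceURI);
        return prefix == null
            ? Collections.<String>emptyIterator()
            : Collections.singletonList(prefix).iterator();
    }
}

// Hypothetical demo class using the helper above.
final class OwnNamespaceContextDemo {

    // Shortened stand-in for the document in the question.
    static final String XML =
        "<ptk:PrintTalk xmlns:ptk=\"http://linkToNameSpace\" xmlns:xjdf=\"http://linkToNamespace\">"
        + "<ptk:Request><ptk:PurchaseOrder Currency=\"EUR\">"
        + "<xjdf:XJDF name=\"someName\" version=\"2.0\">"
        + "<xjdf:ProductList><xjdf:Product/></xjdf:ProductList>"
        + "</xjdf:XJDF>"
        + "</ptk:PurchaseOrder></ptk:Request></ptk:PrintTalk>";

    // Returns the first element child of ptk:PurchaseOrder, which is the XJDF element here.
    static Element extractXjdf(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        XPath xPath = XPathFactory.newInstance().newXPath();
        xPath.setNamespaceContext(
            new MapNamespaceContext(Collections.singletonMap("p", "http://linkToNameSpace")));
        return (Element) xPath.evaluate(
            "/p:PrintTalk/p:Request/p:PurchaseOrder/*", doc, XPathConstants.NODE);
    }

    public static void main(String[] args) throws Exception {
        Element xjdf = extractXjdf(XML);
        System.out.println(xjdf.getNodeName() + " contains "
            + xjdf.getElementsByTagNameNS("*", "*").getLength() + " descendant elements");
    }
}
```

The XPath engine only needs getNamespaceURI to resolve prefixes in the expression; the other two methods are filled in here so that the class satisfies the whole interface.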

Here is an example using SimpleNamespaceContext:

DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
// Must be set before the DocumentBuilder is created.
documentBuilderFactory.setNamespaceAware(true);
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document documentPtk = documentBuilder.parse(new InputSource(inputStream));
XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xPath = xPathFactory.newXPath();

// org.springframework.util.xml.SimpleNamespaceContext from Spring.
// The prefix "p" is arbitrary; only the URI has to match the document.
SimpleNamespaceContext nsCtx = new SimpleNamespaceContext();
nsCtx.bindNamespaceUri("p", "http://linkToNameSpace");
xPath.setNamespaceContext(nsCtx);

XPathExpression xPathExpression = xPath.compile("/p:PrintTalk/p:Request/p:PurchaseOrder/*");
// XPathConstants.NODE yields the first matching node, here an Element rather than a Document.
Element documentXjdf = (Element) xPathExpression.evaluate(documentPtk, XPathConstants.NODE);
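
The returned Element still carries its whole subtree. If the XJDF part is needed as text, one option is to run it through an identity Transformer, as in the rough sketch below. The class name SerializeNodeDemo and the shortened inline document are made up for illustration. The local-name() predicate is a shortcut swapped in here so that the sketch runs without Spring; it is not the approach from the answer above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original answer.
final class SerializeNodeDemo {

    // Shortened stand-in for the document in the question.
    static final String XML =
        "<ptk:PrintTalk xmlns:ptk=\"http://linkToNameSpace\" xmlns:xjdf=\"http://linkToNamespace\">"
        + "<ptk:Request><ptk:PurchaseOrder Currency=\"EUR\">"
        + "<xjdf:XJDF name=\"someName\" version=\"2.0\">"
        + "<xjdf:ProductList><xjdf:Product/></xjdf:ProductList>"
        + "</xjdf:XJDF>"
        + "</ptk:PurchaseOrder></ptk:Request></ptk:PrintTalk>";

    // Finds the first element whose local name is XJDF, ignoring its namespace.
    static Node findXjdf(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        return (Node) XPathFactory.newInstance().newXPath()
            .evaluate("//*[local-name()='XJDF']", doc, XPathConstants.NODE);
    }

    // Serializes the given node together with all of its descendants.
    static String toXml(Node node) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXml(findXjdf(XML)));
    }
}
```

Matching on local-name() alone would also pick up an XJDF element from any other namespace, so binding the prefixes through a NamespaceContext remains the stricter choice.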