Question

<subjectOf typeCode="SUBJ">
    <annotation classCode="ACT" moodCode="EVN">
        <realmCode code="QD" />
        <code code="SPECIALNOTE"></code>
        <text><![CDATA[<strong>** New York State approval pending. This test is not available for New York State patient testing **</br> ]]></text>
    </annotation>
</subjectOf>
<subjectOf typeCode="SUBJ">
    <annotation classCode="ACT" moodCode="EVN">
        <realmCode code="QD" />
        <code code="PREFERREDSPECIMEN"></code>
        <text><![CDATA[2 mL Second void urine <strong>or </strong>2-hour urine <strong>or </strong>&nbsp;2 mL Urine with no preservative]]></text>
    </annotation>
</subjectOf>

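With DOM parsing, how can I traverse the XML above and get the value of the <text> tag based on the <code> tag having a given attribute value? For example, I want to get the following text:

<strong>** New York State approval pending. This test is not available for New York State patient testing **</br>

...based on the <code> tag whose code attribute has the value "SPECIALNOTE".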

    public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException, XPathExpressionException {
        DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
        domFactory.setNamespaceAware(true);
        DocumentBuilder builder = domFactory.newDocumentBuilder();
        Document doc = builder.parse("xml.xml");
        XPath xpath = XPathFactory.newInstance().newXPath();

        // XPath query for showing all node values
        XPathExpression expr = xpath.compile("/testCodeIdentifier/subjectOf/subjectOf/annotation/code[@code='SPECIALNOTE']");
        Object result = expr.evaluate(doc, XPathConstants.NODESET);
        NodeList nodes = (NodeList) result;
        for (int i = 0; i < nodes.getLength(); i++) {
            System.out.println("........" + nodes.item(i).getNodeValue() + "........");
        }
    }
}

Thanks in advance for any help...

Answer 1

Fix your XPath expression:

/testCodeIdentifier/subjectOf/annotation[code/@code='SPECIALNOTE']/text

Then you can access the CDATA content using

Node.getTextContent();

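Update: the XPath above appeared to be correct at the time of posting. In the meantime, you have completely changed the XML, so the XPath would now read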

/testCodeIdentifier/subjectOf/code/subjectOf/annotation[code/@code='SPECIALNOTE']/text

Or, since I guess the question is messy enough that this is still wrong, just do this:

//annotation[code/@code='SPECIALNOTE']/text
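For reference, here is a minimal, self-contained sketch of that last expression in use. The class name SpecialNoteLookup, the helper findText, and the trimmed inline XML (with an assumed testCodeIdentifier root) are illustrative only and not taken from the question's real file.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class SpecialNoteLookup {
    // Assumed sample data: a shortened version of the question's XML.
    static final String XML =
        "<testCodeIdentifier>"
        + "<subjectOf typeCode=\"SUBJ\"><annotation classCode=\"ACT\" moodCode=\"EVN\">"
        + "<code code=\"SPECIALNOTE\"></code>"
        + "<text><![CDATA[<strong>** New York State approval pending **</br> ]]></text>"
        + "</annotation></subjectOf>"
        + "<subjectOf typeCode=\"SUBJ\"><annotation classCode=\"ACT\" moodCode=\"EVN\">"
        + "<code code=\"PREFERREDSPECIMEN\"></code>"
        + "<text><![CDATA[2 mL Second void urine]]></text>"
        + "</annotation></subjectOf>"
        + "</testCodeIdentifier>";

    // Returns the content of the <text> element whose sibling <code> has the
    // given code attribute, or null when no annotation matches.
    static String findText(String xml, String codeValue) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // codeValue is spliced into the expression as-is, so this sketch
            // is only suitable for fixed, trusted values without quotes.
            String expression = "//annotation[code/@code='" + codeValue + "']/text";
            Node text = (Node) xpath.evaluate(expression, doc, XPathConstants.NODE);
            return text == null ? null : text.getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("XML lookup failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(findText(XML, "SPECIALNOTE"));
        System.out.println(findText(XML, "PREFERREDSPECIMEN"));
    }
}
```

Because the expression starts with //, it does not depend on how deeply the annotation elements are nested, which is why it still works when the surrounding XML changes.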

Answer 2

First, your XPath expression has an error; subjectOf is repeated unnecessarily:

/subjectOf/subjectOf

Now, assuming you really do need a reference to the code element that precedes the target text element, use the following:

XPathExpression expr = xpath.compile(
    "/testCodeIdentifier/subjectOf/annotation/code[@code='SPECIALNOTE']");
Node node = (Node) expr.evaluate(doc, XPathConstants.NODE);
System.out.println(getNextElementSibling(node).getTextContent());

where getNextElementSibling is defined as follows:

public static Node getNextElementSibling(Node node) {
    Node next = node;
    do {
        next = next.getNextSibling();
    } while ((next != null) && (next.getNodeType() != Node.ELEMENT_NODE));
    return next;
}

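A few notes on this:

The reason getNextSibling did not work for you initially is (most likely) that the next sibling of the referenced code element is a text node, not an element node. (The whitespace between code and text is significant.) That is why we need getNextElementSibling.

We are selecting a single node, so we use XPathConstants.NODE instead of XPathConstants.NODESET.

Note that you should probably do as @Lukas suggests and modify your XPath expression to select the target text directly.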
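To make the first note concrete, here is a small sketch that checks what getNextSibling actually returns when the XML is indented. The class name SiblingDemo, the helper firstCode, and the inline sample are assumptions made for illustration; nextElementSibling is a null-safe variant of the helper above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class SiblingDemo {
    // Assumed sample: an indented annotation, so whitespace separates the elements.
    static final String XML =
        "<annotation>\n"
        + "    <code code=\"SPECIALNOTE\"></code>\n"
        + "    <text><![CDATA[note]]></text>\n"
        + "</annotation>";

    // Parses the XML and returns the first <code> element.
    static Node firstCode(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("code").item(0);
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
    }

    // Skips whitespace text nodes, comments, etc. until an element is found.
    static Node nextElementSibling(Node node) {
        Node next = node.getNextSibling();
        while (next != null && next.getNodeType() != Node.ELEMENT_NODE) {
            next = next.getNextSibling();
        }
        return next;
    }

    public static void main(String[] args) {
        Node code = firstCode(XML);
        Node raw = code.getNextSibling();
        // The immediate sibling is the whitespace between </code> and <text>.
        System.out.println("raw sibling is a text node: "
            + (raw.getNodeType() == Node.TEXT_NODE));
        System.out.println("next element: " + nextElementSibling(code).getNodeName());
    }
}
```

With a default (non-validating) parser and no DTD, the indentation is kept as text nodes, so the raw sibling here is whitespace and only the element-skipping helper reaches text.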

Here is how to get the text directly (as a string):

XPathExpression expr = xpath.compile(
    "/testCodeIdentifier/subjectOf/annotation[code/@code='SPECIALNOTE']/text/text()");
String text = (String) expr.evaluate(doc, XPathConstants.STRING);
System.out.println(text);

Here is how to first get a reference to the element and then retrieve the contents of its CDATA section:

XPathExpression expr = xpath.compile(
    "/testCodeIdentifier/subjectOf/annotation[code/@code='SPECIALNOTE']/text");
Node text = (Node) expr.evaluate(doc, XPathConstants.NODE);
System.out.println(text.getTextContent());

Answer 3

In the end, I found the answer to my question myself. The code below parses my XML:

XPath xpath = XPathFactory.newInstance().newXPath();
// XPath query for showing all node values
XPathExpression expr = xpath.compile("//testCodeIdentifier/subjectOf/order/subjectOf/annotation/code[@code='SPECIALNOTE']/following-sibling::text/text()");

Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
    System.out.println(nodes.item(i).getNodeValue());
}

Thanks to everyone who answered in this post; this is one possible solution. Mark it up.
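As a side note on why getNodeValue() works here when it printed null in my original code: for an element node, getNodeValue() is defined to return null, while for a text or CDATA node it returns the character data. A minimal sketch of the difference follows; the class name NodeValueDemo, the helper nodeValue, and the one-line sample XML are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class NodeValueDemo {
    // Assumed sample: one annotation whose <text> holds a single CDATA section.
    static final String XML =
        "<annotation><code code=\"SPECIALNOTE\"></code>"
        + "<text><![CDATA[note body]]></text></annotation>";

    // Evaluates the expression as a single node and returns its getNodeValue(),
    // or null when nothing matches or the node type has no value.
    static String nodeValue(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Node node = (Node) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODE);
            return node == null ? null : node.getNodeValue();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Selecting the <code> element itself: element nodes have no node value.
        System.out.println(nodeValue(XML, "//code[@code='SPECIALNOTE']"));
        // Selecting the character data under the sibling <text> element.
        System.out.println(nodeValue(XML,
            "//code[@code='SPECIALNOTE']/following-sibling::text/text()"));
    }
}
```

If you select the element rather than its text() child, getTextContent() is the call that returns the content.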

DOM parser query in Java
