XPath: parsing XML, need the children of a sibling and deeper descendants

Time: 2018-10-16 18:15:58

Tags: xml xpath

So I have this:

<?xml version="1.0" encoding="UTF-8"?>
<ClinicalDocument xmlns="urn:hl7-org:v3">
  <realmCode code="US" />
  <typeId extension="POCD_HD000040" root="2.16.840.1.113883.1.3" />
  <templateId root="1.2.840.114350.1.72.1.51693" />
  <templateId root="2.16.840.1.113883.10.20.22.1.1" />
  <templateId root="2.16.840.1.113883.10.20.22.1.1" extension="2015-08-01" />
  <templateId root="2.16.840.1.113883.10.20.22.1.2" />
  <templateId root="2.16.840.1.113883.10.20.22.1.2" extension="2015-08-01" />
  <id assigningAuthorityName="EPC" root="1.2.840.114350.1.13.535.2.7.8.688883.17473398" />
  <code code="34133-9" codeSystem="2.16.840.1.113883.6.1" codeSystemName="LOINC" displayName="Summarization of Episode Note" />
  <title>Clinical Summary</title>
  <effectiveTime value="20181016153816-0400" />
  <confidentialityCode code="N" codeSystem="2.16.840.1.113883.5.25" displayName="Normal" />
  <languageCode code="en-US" />
  <setId assigningAuthorityName="EPC" extension="d5ccd6e6-4b6b-11e7-90e8-f508dff85edf" root="1.2.840.114350.1.13.535.2.7.1.1" />
  <versionNumber value="31" />
  <recordTarget>

Further down is this part, which is where I need to extract the data from:

          <code code="10160-0" codeSystem="2.16.840.1.113883.6.1" codeSystemName="LOINC" displayName="History of Medication Usage" />
          <title>Current Medications</title>
          <text>
             <table>
                <colgroup>
                   <col width="25%" />
                   <col width="25%" />
                   <col width="13%" />
                   <col width="12%" />
                   <col width="8%" />
                   <col width="8%" />
                   <col width="9%" />
                </colgroup>
                <thead>
                   <tr>
                      <th>Prescription</th>
                      <th>Sig.</th>
                      <th>Disp.</th>
                      <th>Refills</th>
                      <th>Start Date</th>
                      <th>End Date</th>
                      <th>Status</th>
                   </tr>
                </thead>
                <tbody>
                   <tr ID="currx6">
                      <td>
                         <paragraph ID="med6">Misc. Devices (BATH/SHOWER SEAT) Misc</paragraph>
                         <content styleCode="allIndent">
                            Indications:
                            <content ID="indication7">Mild cognitive impairment</content>
                            ,
                            <content ID="indication8">MGD (meibomian gland disease)</content>
                            ,
                            <content ID="indication9">Glaucoma suspect</content>
                            ,
                            <content ID="indication10">Nuclear sclerosis</content>
                         </content>
                      </td>
                      <td ID="sig6">Pt needs shower/bath bar to assist with getting in and out of bath tub/shower.</td>
                      <td>
                         <paragraph>1 Units</paragraph>
                      </td>
                      <td>0</td>
                      <td>06/21/2013</td>
                      <td />
                      <td>Active</td>
                   </tr>
                   <tr ID="currx11">
                      <td>
                         <paragraph ID="med11">Misc. Devices (HUGO ROLLING WALKER) Misc</paragraph>

I'm pretty much trying to get only the paragraphs that have an ID. I am using this:

NodeList nodeList = (NodeList) xpath.evaluate(  "//*[local-name()='code'][@code='10160-0']/following-sibling::*[local-name()='text']/table/tbody/tr/td/paragraph", new InputSource(new StringReader(docString)), XPathConstants.NODESET);

But it keeps telling me I have 0 nodes. If I try to get the table instead, it tells me I have 1 node, but that node is null. What exactly am I doing wrong?

Solution: to get the paragraphs

//*[local-name()='code'][@code='10160-0']/following-sibling::*[local-name()='text']//*[local-name()='paragraph']

To get only the ones with an ID

//*[local-name()='code'][@code='10160-0']/following-sibling::*[local-name()='text']//*[local-name()='paragraph'][@ID]
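For anyone who wants to try this from Java, here is a minimal sketch of running the ID-filtered expression with the predicate written as [local-name()='paragraph'][@ID]. The class name, method name, and the trimmed-down sample XML are made up for illustration; only the XPath shape and the urn:hl7-org:v3 namespace come from the post. It parses with a namespace-aware DocumentBuilder first rather than passing an InputSource straight to evaluate, which is just one way to do it.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LocalNameParagraphs {
    // Hypothetical stand-in for the medication section; not the real document.
    static final String SAMPLE =
          "<section xmlns=\"urn:hl7-org:v3\">"
        + "<code code=\"10160-0\"/>"
        + "<title>Current Medications</title>"
        + "<text><table><tbody><tr>"
        + "<td><paragraph ID=\"med6\">Bath seat</paragraph></td>"
        + "<td><paragraph>1 Units</paragraph></td>"
        + "</tr></tbody></table></text>"
        + "</section>";

    // Every step matches on local-name(), so the default namespace is ignored.
    static final String EXPR =
          "//*[local-name()='code'][@code='10160-0']"
        + "/following-sibling::*[local-name()='text']"
        + "//*[local-name()='paragraph'][@ID]";

    // Returns the ID attribute of each matching paragraph element.
    static List<String> paragraphIds(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(EXPR, doc, XPathConstants.NODESET);
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                ids.add(((Element) nodes.item(i)).getAttribute("ID"));
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(paragraphIds(SAMPLE)); // prints [med6]
    }
}
```

The paragraph without an ID ("1 Units") is skipped because of the trailing [@ID] predicate.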

1 answer:

Answer 0 (score: 1)

  

I'm pretty much trying to get only the paragraphs that have an ID.

This XPath,

//*[@ID]

will select all elements that have an ID attribute, and this XPath,

//paragraph[@ID]

will select all paragraph elements that have an ID attribute.
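To see the difference between the two expressions, here is a small sketch in Java. It assumes a document with no default namespace; the sample XML, class name, and method name are invented for the example. Note that //*[@ID] works regardless of namespaces, whereas an unprefixed name test such as //paragraph only matches elements in no namespace, so it would not match in a namespace-aware parse of a document that declares xmlns="urn:hl7-org:v3".

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class IdAttributeDemo {
    // Hypothetical sample with NO namespace declaration.
    static final String SAMPLE =
          "<table><tbody><tr ID=\"currx6\">"
        + "<td><paragraph ID=\"med6\">Bath seat</paragraph></td>"
        + "<td ID=\"sig6\">As needed</td>"
        + "<td><paragraph>1 Units</paragraph></td>"
        + "</tr></tbody></table>";

    // Evaluates expr against xml and returns the ID attribute of each selected element.
    static List<String> selectIds(String xml, String expr) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                ids.add(((Element) nodes.item(i)).getAttribute("ID"));
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        // Any element carrying an ID attribute: the tr, one paragraph, one td.
        System.out.println(selectIds(SAMPLE, "//*[@ID]"));
        // Only paragraph elements carrying an ID attribute.
        System.out.println(selectIds(SAMPLE, "//paragraph[@ID]")); // prints [med6]
    }
}
```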

其他说明:

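  • When no namespaces are in play, don't use constructs such as //*[local-name()='code']; just use //code. (And if namespaces are in play, define namespace prefixes and reference them properly rather than defeating them. See How does XPath deal with XML namespaces?)
  • //*[local-name()='code'][@code='10160-0']/following-sibling::*[local-name()='text'] fails because text is not a sibling of node. Perhaps you meant to use following:: instead.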
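As a sketch of the "define namespace prefixes" route in Java: bind a prefix to urn:hl7-org:v3 through a NamespaceContext and use that prefix in every name test. The prefix hl7, the class names, and the cut-down sample XML are assumptions made for this example; only the namespace URI and element names come from the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PrefixedParagraphs {
    static final String HL7_NS = "urn:hl7-org:v3";

    // Hypothetical stand-in for the medication section; not the real document.
    static final String SAMPLE =
          "<section xmlns=\"urn:hl7-org:v3\">"
        + "<code code=\"10160-0\"/>"
        + "<text><table><tbody><tr>"
        + "<td><paragraph ID=\"med6\">Bath seat</paragraph></td>"
        + "<td><paragraph>1 Units</paragraph></td>"
        + "</tr></tbody></table></text>"
        + "</section>";

    // Maps the (arbitrary) prefix "hl7" to the HL7 v3 namespace URI.
    static final class Hl7Context implements NamespaceContext {
        @Override public String getNamespaceURI(String prefix) {
            if (prefix == null) throw new IllegalArgumentException("prefix is null");
            return "hl7".equals(prefix) ? HL7_NS : XMLConstants.NULL_NS_URI;
        }
        @Override public String getPrefix(String uri) {
            return HL7_NS.equals(uri) ? "hl7" : null;
        }
        @Override public Iterator<String> getPrefixes(String uri) {
            String prefix = getPrefix(uri);
            return prefix == null
                    ? Collections.<String>emptyIterator()
                    : Collections.singletonList(prefix).iterator();
        }
    }

    // Evaluates expr with the hl7 prefix bound; returns the ID of each selected element.
    static List<String> selectIds(String xml, String expr) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for prefixed name tests to match
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new Hl7Context());
            NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                ids.add(((Element) nodes.item(i)).getAttribute("ID"));
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String prefixed = "//hl7:code[@code='10160-0']"
                + "/following-sibling::hl7:text//hl7:paragraph[@ID]";
        System.out.println(selectIds(SAMPLE, prefixed));           // prints [med6]
        System.out.println(selectIds(SAMPLE, "//paragraph[@ID]")); // prints []
    }
}
```

The second call returns nothing because, in XPath 1.0, an unprefixed name test never matches an element that is in a namespace, which is also why steps like /table/tbody/tr/td/paragraph come back empty for this kind of document.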