Question

I have XML like this:

<doc>
    <p><c type="changeStart"/><style type="underline">text</style><c type="changeEnd"/><t/>In addition 
        to voting Finance Company and Business Company, Inc.: (i) the name 
        of the <c type="changeStart"/>new public entity<c type="changeEnd"/> will be “Finance Company.” 
        as Finance Company’s corporate existence is perpetual as opposed to Business Company, Inc.’s corprate 
        existence terminating <c type="changeStart"/>   <c type="changeEnd"/>
    </p>
</doc>

I need to select the nodes that lie between <c type="changeStart"/> and <c type="changeEnd"/>. So in the XML above, the following nodes should be selected:

<style type="underline">text</style>
new public entity
the whitespace-only text node between the last pair of markers (two spaces, '  ')

I wrote the following XPath for this:

//*[preceding-sibling::c[@type = 'changeStart'] and following-sibling::c[@type = 'changeEnd']][not(c [@type="changeStart"])]

But it does not select the correct nodes. Any suggestions on how to modify my XPath so that it selects what I need?

Answer 1

Here is one possible XPath:

//node()[
    preceding-sibling::*[1][self::c/@type='changeStart']
        and
    following-sibling::*[1][self::c/@type='changeEnd']
]

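The XPath above selects nodes that are:

directly preceded by <c type="changeStart"/> (their nearest preceding element sibling), and
directly followed by <c type="changeEnd"/> (their nearest following element sibling)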
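To make the two sibling tests concrete, here is a minimal Python sketch that mimics them with the standard library's xml.dom.minidom instead of an XPath engine. The helper names (nearest_element_sibling, between_adjacent_markers) and the shortened SAMPLE document are illustrative assumptions, not part of the original answer.

```python
from xml.dom import minidom

# Shortened, ASCII-only version of the question's document (an assumption
# made for brevity; the marker layout is the same as in the original).
SAMPLE = (
    '<doc><p><c type="changeStart"/><style type="underline">text</style>'
    '<c type="changeEnd"/><t/>In addition the name of the '
    '<c type="changeStart"/>new public entity<c type="changeEnd"/> will be '
    'existence terminating <c type="changeStart"/>  <c type="changeEnd"/>\n'
    '</p></doc>'
)

def nearest_element_sibling(node, direction):
    """Mimic preceding-sibling::*[1] / following-sibling::*[1]:
    skip text and comment siblings until an element is found."""
    sib = getattr(node, direction)
    while sib is not None and sib.nodeType != sib.ELEMENT_NODE:
        sib = getattr(sib, direction)
    return sib

def is_marker(node, kind):
    """True if node is <c type="kind"/>."""
    return (node is not None and node.tagName == "c"
            and node.getAttribute("type") == kind)

def between_adjacent_markers(xml_text):
    """Collect every node whose nearest preceding element sibling is a
    changeStart marker and whose nearest following one is a changeEnd."""
    doc = minidom.parseString(xml_text)
    hits = []

    def walk(parent):
        for child in parent.childNodes:
            before = nearest_element_sibling(child, "previousSibling")
            after = nearest_element_sibling(child, "nextSibling")
            if is_marker(before, "changeStart") and is_marker(after, "changeEnd"):
                hits.append(child)
            walk(child)

    walk(doc)
    return hits

for node in between_adjacent_markers(SAMPLE):
    print(repr(node.toxml()))
```

On this sample the sketch reports the <style> element, the text "new public entity", and the whitespace-only text node. Like the XPath, it only works when each pair of markers are siblings.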

Answer 2

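Use this XPath 1.0 expression. It is similar to @Flynn1179's approach, but it uses a strict > comparison, adds a restriction that excludes nodes whose parent is also among the changed nodes, and excludes the <c> elements themselves from the result. In addition, I use the preceding:: axis rather than the preceding-sibling:: axis. This allows the start and end <c> elements to sit at different levels of the document (rather than being siblings):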

 //node()
          [count(preceding::c[@type='changeStart']) 
           > count(preceding::c[@type='changeEnd'])
           and not(parent::*
                 [count(preceding::c[@type='changeStart']) 
                 > count(preceding::c[@type='changeEnd'])]
                 )
           and not(self::c[@type[.='changeStart' or .='changeEnd']])
          ]

XSLT-based verification:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

  <xsl:template match="/">
    <xsl:for-each select=
    "//node()
              [count(preceding::c[@type='changeStart']) 
               > count(preceding::c[@type='changeEnd'])
               and not(parent::*
                     [count(preceding::c[@type='changeStart']) 
                     > count(preceding::c[@type='changeEnd'])]
                     )
               and not(self::c[@type[.='changeStart' or .='changeEnd']])
              ]">
   <xsl:value-of select="concat('&#xA;',position(), '. ')"/>
   <xsl:copy-of select="."/>
  </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document:

<doc>
    <p><c type="changeStart"/><style type="underline">text</style><c type="changeEnd"/><t/>In addition 
        to voting Finance Company and Business Company, Inc.: (i) the name 
        of the <c type="changeStart"/>new public entity<c type="changeEnd"/> will be “Finance Company.” 
        as Finance Company’s corporate existence is perpetual as opposed to Business Company, Inc.’s corprate 
        existence terminating <c type="changeStart"/>   <c type="changeEnd"/>
    </p>
</doc>

the correct result is produced:

1. <style type="underline">text</style>
2. new public entity

To also get the text node containing the two spaces, remove the <xsl:strip-space elements="*"/> declaration from the transformation above.

Here is a more challenging case. Applying the transformation to this XML document:

<t>
  <x>
   <c type="changeStart"/>
    <y> content
      <z>
        <p>

        </p>
      </z>
    </y>
   <c type="changeStart"/>
    <v>

    </v>
   <c type="changeEnd"/>
   <r/>
   <c type="changeEnd"/>
   <s/>
  </x>
</t>

produces the correct result:

1. <y> content
      <z>
      <p/>
   </z>
</y>
2. <v/>
3. <r/>
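The counting logic of this expression can also be sketched outside XPath. The Python below is only an approximation under stated assumptions: it walks the tree in document order with xml.dom.minidom, keeps running totals of the markers seen so far (which matches count(preceding::c[...]) as long as the <c> markers are empty elements), and emulates xsl:strip-space by skipping whitespace-only text nodes. The names changed_nodes, SAMPLE and NESTED are hypothetical.

```python
from xml.dom import minidom

MARKER_TYPES = ("changeStart", "changeEnd")

# Shortened version of the first document and a compact version of the
# nested one (both abbreviated here as an assumption, for brevity).
SAMPLE = (
    '<doc><p><c type="changeStart"/><style type="underline">text</style>'
    '<c type="changeEnd"/><t/>In addition the name of the '
    '<c type="changeStart"/>new public entity<c type="changeEnd"/> will be '
    'existence terminating <c type="changeStart"/>  <c type="changeEnd"/>\n'
    '</p></doc>'
)
NESTED = (
    '<t><x><c type="changeStart"/><y> content <z><p> </p></z></y>'
    '<c type="changeStart"/><v> </v><c type="changeEnd"/><r/>'
    '<c type="changeEnd"/><s/></x></t>'
)

def is_marker(node):
    """True for <c type="changeStart"/> or <c type="changeEnd"/>."""
    return (node.nodeType == node.ELEMENT_NODE and node.tagName == "c"
            and node.getAttribute("type") in MARKER_TYPES)

def changed_nodes(xml_text, strip_space=True):
    """Select nodes that start while more changeStart than changeEnd
    markers have been seen, whose parent is not itself such a node,
    and that are not markers themselves."""
    doc = minidom.parseString(xml_text)
    seen = {"changeStart": 0, "changeEnd": 0}
    selected = []

    def visit(node, parent_inside):
        if (strip_space and node.nodeType == node.TEXT_NODE
                and not node.data.strip()):
            return  # emulate <xsl:strip-space elements="*"/>
        inside = seen["changeStart"] > seen["changeEnd"]
        marker = is_marker(node)
        if inside and not parent_inside and not marker:
            selected.append(node)
        if marker:
            # Markers are empty, so they count as "preceding" for
            # every node visited after them.
            seen[node.getAttribute("type")] += 1
        for child in node.childNodes:
            visit(child, inside)

    visit(doc.documentElement, False)
    return selected

print([n.nodeName for n in changed_nodes(NESTED)])
print([n.toxml() for n in changed_nodes(SAMPLE, strip_space=False)])
```

Passing strip_space=False plays the role of removing the xsl:strip-space declaration, so the whitespace-only text node between the last pair of markers is reported as well.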

Answer 3

Here is another approach:

//node()[
    count(preceding-sibling::c[@type='changeStart']) !=
    count((. | preceding-sibling::c)[@type='changeEnd'])
  ]

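However, this relies heavily on your 'changeStart' and 'changeEnd' markers being well formed, with matching start and end pairs. If you can guarantee that this is always the case, it should come close to what you want.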
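As a rough illustration of the balance check this expression performs, the following Python sketch counts markers among the siblings of each node using xml.dom.minidom. The function name unbalanced_nodes and the shortened SAMPLE document are assumptions made for this example.

```python
from xml.dom import minidom

# Shortened, ASCII-only version of the question's document (assumed
# here for brevity; the marker layout matches the original).
SAMPLE = (
    '<doc><p><c type="changeStart"/><style type="underline">text</style>'
    '<c type="changeEnd"/><t/>In addition the name of the '
    '<c type="changeStart"/>new public entity<c type="changeEnd"/> will be '
    'existence terminating <c type="changeStart"/>  <c type="changeEnd"/>\n'
    '</p></doc>'
)

def has_type(node, kind):
    """Mimic the [@type='...'] filter applied to an arbitrary node."""
    return (node.nodeType == node.ELEMENT_NODE
            and node.getAttribute("type") == kind)

def is_c(node, kind):
    """True if node is <c type="kind"/>."""
    return has_type(node, kind) and node.tagName == "c"

def unbalanced_nodes(xml_text):
    """Select nodes for which the number of preceding-sibling changeStart
    markers differs from the number of changeEnd markers among the node
    itself and its preceding siblings."""
    doc = minidom.parseString(xml_text)
    hits = []

    def walk(parent):
        starts = ends = 0  # markers among the siblings seen so far
        for child in parent.childNodes:
            self_end = 1 if has_type(child, "changeEnd") else 0
            if starts != ends + self_end:
                hits.append(child)
            if is_c(child, "changeStart"):
                starts += 1
            elif is_c(child, "changeEnd"):
                ends += 1
            walk(child)

    walk(doc)
    return hits

for node in unbalanced_nodes(SAMPLE):
    print(repr(node.toxml()))
```

Counting the node itself on the changeEnd side is what keeps a closing marker out of the result. With an unmatched changeStart, every later sibling would be reported, which is the dependence on well-formed pairs noted above.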

XPath - selecting nodes based on preceding siblings and following siblings