Get the elements between two elements that have a specific attribute

Time: 2016-02-15 16:55:21

Tags: html xpath

Here is something similar to the HTML I am working with:

<body>

  <tr class="heading">
    <td colspan="2"> Heading 1 </td>
  </tr>

  <tr>
    <td>L 1</td>
    <td>R 1</td>
  </tr>

  <tr>
    <td>L 2</td>
    <td>R 2</td>
  </tr>

  <tr class="heading">
    <td colspan="2"> Heading 2</td>
  </tr>

  <tr>
    <td>L 3</td>
    <td>R 3</td>
  </tr>

</body>

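I want to get the td[1]s of all the trs after "Heading 1", but nothing that occurs after "Heading 2" (or that includes "Heading 2").

Ideally, I need to be able to do this with only "Heading 1" as input: I want all elements under the heading I supply, but nothing under a new heading.

Is this possible in XPath?

3 answers:

Answer 0 (score: 1)

I took the code from your deleted answer and made it work... and ugly: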

(//tr[preceding-sibling::tr[@class='heading' and td=' Heading 1 '] and following-sibling::tr[@class='heading']]/td[1] ) | (//tr[preceding-sibling::tr[@class='heading' and td=' Heading 1 '] and following-sibling::tr[@class='heading']]/td[2] )

If you are using a programming language, it is better to do this in code.
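
As a rough illustration of the "do it in code" route, here is a minimal Python sketch. It assumes the markup can be parsed as XML with the standard library's `xml.etree.ElementTree` and that the rows appear in document order as in the question. The helper name `first_cells_under` and the compacted `SAMPLE` string are made up for this example.

```python
import xml.etree.ElementTree as ET

# Compacted copy of the sample markup from the question (assumed XML-parseable).
SAMPLE = """
<body>
  <tr class="heading"><td colspan="2"> Heading 1 </td></tr>
  <tr><td>L 1</td><td>R 1</td></tr>
  <tr><td>L 2</td><td>R 2</td></tr>
  <tr class="heading"><td colspan="2"> Heading 2</td></tr>
  <tr><td>L 3</td><td>R 3</td></tr>
</body>
"""


def first_cells_under(root, heading):
    """Return the first <td> text of each row under `heading` (hypothetical helper)."""
    cells = []
    collecting = False
    for tr in root.iter("tr"):  # rows in document order
        if tr.get("class") == "heading":
            if collecting:
                break  # the next heading ends the group
            # Collapse whitespace in the heading row's text before comparing.
            title = " ".join("".join(tr.itertext()).split())
            collecting = title == heading
            continue
        if collecting:
            td = tr.find("td")  # first <td> child of the row
            if td is not None:
                cells.append((td.text or "").strip())
    return cells


root = ET.fromstring(SAMPLE.strip())
print(first_cells_under(root, "Heading 1"))  # ['L 1', 'L 2']
print(first_cells_under(root, "Heading 2"))  # ['L 3']
```

Only the heading text is needed as input, and the last group on the page works too, because the loop simply runs out of rows when no further heading follows.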

Answer 1 (score: 1)

I disagree with every answer so far. An XPath expression that does exactly what you asked for is

//tr[@class = 'heading' and normalize-space(td) = 'Heading 1']/following::td[following::tr[@class = 'heading' and normalize-space(td) = 'Heading 2']]

which translates to

//tr                                     select all `tr` elements anywhere in the document
[@class = 'heading'                      but only if they have a `class` attribute whose
                                         value is equal to "heading"                
and normalize-space(td) = 'Heading 1']   and only if they contain a `td` element which has
                                         a string value of "Heading 1".
/following::td                           select all `td` elements that follow them
[following::tr                           but only if they are followed by a `tr` element
[@class = 'heading'                      which again has a `class` attribute with "heading"
                                         as its value
and normalize-space(td) = 'Heading 2']]  and only if this `tr` element has a `td` child
                                         element with "Heading 2" as its string value

and will return the following (individual results separated by ------):

<td>L 1</td>
-----------------------
<td>R 1</td>
-----------------------
<td>L 2</td>
-----------------------
<td>R 2</td>

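The normalize-space() function strips leading and trailing whitespace from a string and collapses internal runs of whitespace into a single space.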
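
To see why this matters here: in the sample markup the first heading cell contains " Heading 1 " (with a trailing space) while the second contains " Heading 2" (without one), so a literal string comparison is fragile. The following Python sketch approximates what normalize-space() does. The function name `normalize_space` is chosen for this example, and Python's `str.split()` treats a few more characters as whitespace than XPath does (XPath only considers space, tab, carriage return and line feed).

```python
def normalize_space(text):
    # split() with no arguments drops leading/trailing whitespace and splits
    # on runs of whitespace; joining with one space collapses those runs.
    return " ".join(text.split())


print(normalize_space(" Heading 1 "))     # Heading 1
print(normalize_space(" Heading   2\n"))  # Heading 2
```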

EDIT: If you meant to select only the first of several td elements in each row, use

//tr[@class = 'heading' and normalize-space(td) = 'Heading 1']/following::tr/td[position() = 1 and following::tr[@class = 'heading' and normalize-space(td) = 'Heading 2']]

and the result will be

<td>L 1</td>
-----------------------
<td>L 2</td>

To be even more thorough, consider the following case:

<body>

  <tr class="heading">
    <td colspan="2"> Heading 1 </td>
  </tr>

  <tr>
    <td>L 1</td>
    <td>R 1</td>
    <td>third</td>
  </tr>

  <tr>
    <td>L 2</td>
    <td>R 2</td>
  </tr>

  <tr class="heading">
    <td colspan="2"> Heading other</td>
  </tr>

  <tr>
    <td>L 3</td>
    <td>R 3</td>
  </tr>

  <tr class="heading">
    <td colspan="2"> Heading 2</td>
  </tr>

</body>

Here there are unrelated headings between "Heading 1" and "Heading 2". If the td children of those heading rows should not turn up in the result, use

//tr[@class = 'heading' and normalize-space(td) = 'Heading 1']/following::tr[not(@class)]/td[position() = 1 and following::tr[@class = 'heading' and normalize-space(td) = 'Heading 2']]

Edit

  

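"Currently, your XPath finds the elements between 2 headings, but for the last group on the page there is no 2nd heading to reference."

Until now, you had not explained that this is the case in your actual data. Use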

//tr[@class = 'heading' and normalize-space(td) = 'Heading 1']/following::tr[not(@class)]/td[position() = 1 and not(preceding::tr[@class = 'heading' and normalize-space(td) = 'Heading 2'])]

Edit 2

  

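"I did, but I also added the note 'Ideally, I need to be able to do this with only "Heading 1" as input: I want all elements under the heading I supply, but nothing under a new heading.'"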

//tr[@class = 'heading' and normalize-space(td) = 'Heading 1']/following::tr[not(@class)]/td[position() = 1 and not(preceding::tr[@class = 'heading' and normalize-space(td) != 'Heading 1'])]

Answer 2 (score: -1)

You can do it like this:

<body>

  <tr class="heading">
    <td colspan="2"> Heading 1 </td>
  </tr>

  <tr>
    <td class="left">L 1</td>
    <td class="right">R 1</td>
  </tr>

  <tr>
    <td class="left">L 2</td>
    <td class="right">R 2</td>
  </tr>

  <tr class="heading">
    <td colspan="2"> Heading 2</td>
  </tr>

  <tr>
    <td class="left">L 3</td>
    <td class="right">R 3</td>
  </tr>

</body>

Then specify the colors you want in CSS.