Question

Here is something similar to the HTML I'm working with:

<body>

  <tr class="heading">
    <td colspan="2"> Heading 1 </td>
  </tr>

  <tr>
    <td>L 1</td>
    <td>R 1</td>
  </tr>

  <tr>
    <td>L 2</td>
    <td>R 2</td>
  </tr>

  <tr class="heading">
    <td colspan="2"> Heading 2</td>
  </tr>

  <tr>
    <td>L 3</td>
    <td>R 3</td>
  </tr>

</body>

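I'd like to get the td[1] of every tr after "Heading 1", but nothing that occurs after "Heading 2" (and not "Heading 2" itself).

Ideally, I need to be able to do this with only "Heading 1" as input - I want all elements under the heading I supply, but nothing under any new heading.

Is this possible in XPath?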

Answer 1

I took the code from your deleted answer and got it working... ugly as it is:

(//tr[preceding-sibling::tr[@class='heading' and td=' Heading 1 '] and following-sibling::tr[@class='heading']]/td[1] ) | (//tr[preceding-sibling::tr[@class='heading' and td=' Heading 1 '] and following-sibling::tr[@class='heading']]/td[2] )

If you are using a programming language, it would be better to do this in code.
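As a minimal sketch of the "do it in code" route, the snippet below scans the rows in document order and collects the first cell of each row under the requested heading, stopping at the next heading row. It assumes the markup is well-formed enough to parse as XML with Python's standard library; `SAMPLE` and `first_cells_under` are hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET

# Condensed copy of the markup from the question (assumed to be well-formed XML).
SAMPLE = """<body>
  <tr class="heading"><td colspan="2"> Heading 1 </td></tr>
  <tr><td>L 1</td><td>R 1</td></tr>
  <tr><td>L 2</td><td>R 2</td></tr>
  <tr class="heading"><td colspan="2"> Heading 2</td></tr>
  <tr><td>L 3</td><td>R 3</td></tr>
</body>"""


def first_cells_under(markup, heading):
    """Return the text of the first td of each row under `heading`."""
    root = ET.fromstring(markup)
    collecting = False
    cells = []
    for tr in root.iter("tr"):  # iter() walks the tree in document order
        tds = tr.findall("td")
        if tr.get("class") == "heading":
            # Collapse whitespace so " Heading 1 " compares equal to "Heading 1".
            title = " ".join("".join(tds[0].itertext()).split()) if tds else ""
            # Start collecting at the requested heading, stop at any other one.
            collecting = (title == heading)
            continue
        if collecting and tds:
            cells.append((tds[0].text or "").strip())
    return cells


print(first_cells_under(SAMPLE, "Heading 1"))  # ['L 1', 'L 2']
```

Because collection simply stops at the next heading row, this also covers the last group on a page, where no closing heading exists.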

Answer 2

I disagree with all the answers so far. An XPath expression that does exactly what you asked for is

//tr[@class = 'heading' and normalize-space(td) = 'Heading 1']/following::td[following::tr[@class = 'heading' and normalize-space(td) = 'Heading 2']]

which translates to

//tr                                     select all `tr` elements anywhere in the document
[@class = 'heading'                      but only if they have a `class` attribute whose
                                         value is equal to "heading"                
and normalize-space(td) = 'Heading 1']   and only if they contain a `td` element which has
                                         a string value of "Heading 1".
/following::td                           select all `td` elements that follow them
[following::tr                           but only if they are followed by a `tr` element
[@class = 'heading'                      which again has a `class` attribute with "heading"
                                         as its value
and normalize-space(td) = 'Heading 2']]  and only if this `tr` element has a `td` child
                                         element with "Heading 2" as its string value

and will return the following (individual results separated by ------):

<td>L 1</td>
-----------------------
<td>R 1</td>
-----------------------
<td>L 2</td>
-----------------------
<td>R 2</td>

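The normalize-space() function is used to strip leading and trailing whitespace from the strings (it also collapses internal runs of whitespace into a single space).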
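For readers who want to see the effect of normalize-space() outside an XPath engine, here is a rough Python equivalent. It is only a sketch: `normalize_space` is a hypothetical helper, and it treats space, tab, carriage return and line feed as whitespace, which is the set XPath 1.0 uses.

```python
import re


def normalize_space(text):
    """Approximate XPath 1.0 normalize-space() for a Python string."""
    # Collapse every run of whitespace into one space ...
    collapsed = re.sub(r"[ \t\r\n]+", " ", text)
    # ... then drop the single space left at either end, if any.
    return collapsed.strip(" ")


print(repr(normalize_space(" Heading 1 ")))          # 'Heading 1'
print(normalize_space(" Heading 2") == "Heading 2")  # True
```

This is why the expressions above can compare against 'Heading 1' even though the cell in the sample markup contains ' Heading 1 '.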

EDIT: If you meant to select only the first of several td elements in each row:

//tr[@class = 'heading' and normalize-space(td) = 'Heading 1']/following::tr/td[position() = 1 and following::tr[@class = 'heading' and normalize-space(td) = 'Heading 2']]

and the result will be

<td>L 1</td>
-----------------------
<td>L 2</td>

To be even more complete, to account for a case like the following:

<body>

  <tr class="heading">
    <td colspan="2"> Heading 1 </td>
  </tr>

  <tr>
    <td>L 1</td>
    <td>R 1</td>
    <td>third</td>
  </tr>

  <tr>
    <td>L 2</td>
    <td>R 2</td>
  </tr>

  <tr class="heading">
    <td colspan="2"> Heading other</td>
  </tr>

  <tr>
    <td>L 3</td>
    <td>R 3</td>
  </tr>

  <tr class="heading">
    <td colspan="2"> Heading 2</td>
  </tr>

</body>

where there are unrelated headings between "Heading 1" and "Heading 2", and the td children of those heading rows should not appear in the result, use

//tr[@class = 'heading' and normalize-space(td) = 'Heading 1']/following::tr[not(@class)]/td[position() = 1 and following::tr[@class = 'heading' and normalize-space(td) = 'Heading 2']]
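To make the behaviour of this expression concrete, the sketch below mimics it in plain Python on the markup above: take every row between the two named headings, skip rows that carry a class attribute, and keep the first cell of each remaining row. It assumes each heading text occurs once and that the table is flat (no nested tables); `SAMPLE_2`, `heading_text` and `first_cells_between` are hypothetical names, and the whitespace handling only approximates normalize-space().

```python
import xml.etree.ElementTree as ET

# Condensed copy of the second sample, including the "Heading other" row.
SAMPLE_2 = """<body>
  <tr class="heading"><td colspan="2"> Heading 1 </td></tr>
  <tr><td>L 1</td><td>R 1</td><td>third</td></tr>
  <tr><td>L 2</td><td>R 2</td></tr>
  <tr class="heading"><td colspan="2"> Heading other</td></tr>
  <tr><td>L 3</td><td>R 3</td></tr>
  <tr class="heading"><td colspan="2"> Heading 2</td></tr>
</body>"""


def heading_text(tr):
    """Whitespace-normalized text of the first td in a row."""
    td = tr.find("td")
    return " ".join("".join(td.itertext()).split()) if td is not None else ""


def first_cells_between(markup, start, end):
    rows = list(ET.fromstring(markup).iter("tr"))  # document order

    def index_of(title):
        for i, tr in enumerate(rows):
            if tr.get("class") == "heading" and heading_text(tr) == title:
                return i
        return None

    lo, hi = index_of(start), index_of(end)
    if lo is None or hi is None:
        return []
    # Rows with a class attribute (the headings) are skipped, like tr[not(@class)].
    return [
        (tr.find("td").text or "").strip()
        for tr in rows[lo + 1:hi]
        if tr.get("class") is None and tr.find("td") is not None
    ]


print(first_cells_between(SAMPLE_2, "Heading 1", "Heading 2"))
# ['L 1', 'L 2', 'L 3']
```

Note that the row under "Heading other" still contributes 'L 3': only the heading rows themselves are filtered out, since everything before "Heading 2" counts as in range.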

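EDIT:

"Currently, your XPath finds the elements between 2 headings, but for the last group on the page there is no 2nd heading to reference."

Until now, you had not explained that this is the case in your actual data. Use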

//tr[@class = 'heading' and normalize-space(td) = 'Heading 1']/following::tr[not(@class)]/td[position() = 1 and not(preceding::tr[@class = 'heading' and normalize-space(td) = 'Heading 2'])]

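EDIT 2:

"I did, but I also added the note 'Ideally, I need to be able to do this with only "Heading 1" as input - I want all elements under the heading I supply, but to ignore anything under a new heading.'"

In that case, use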

//tr[@class = 'heading' and normalize-space(td) = 'Heading 1']/following::tr[not(@class)]/td[position() = 1 and not(preceding::tr[@class = 'heading' and normalize-space(td) != 'Heading 1'])]

Answer 3

You can do it like this:

  <tr class="heading">
    <td colspan="2"> Heading 1 </td>
  </tr>

  <tr>
    <td class="left">L 1</td>
    <td class="right">R 1</td>
  </tr>

  <tr>
    <td class="left">L 2</td>
    <td class="right">R 2</td>
  </tr>

  <tr class="heading">
    <td colspan="2"> Heading 2</td>
  </tr>

  <tr>
    <td class="left">L 3</td>
    <td class="right">R 3</td>
  </tr>

</body>

Then specify the colors you want in CSS.

Getting the elements between 2 elements that have a specific attribute
