CSS selector only selects the first row

Time: 2016-02-05 15:36:24

Tags: html css-selectors html-parsing

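I am parsing an HTML page and have a very long CSS selector (I couldn't find a shorter one because the page is stupid). It should select every tr in a table, but it only selects the second row... What am I missing?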

The selector:

body > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(2) > td:nth-child(1) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(3) > td:nth-child(1) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(8) > td:nth-child(1) > table:nth-child(4) > tbody:nth-child(1) > tr:nth-child(2) > td:nth-child(1) > table:nth-child(1) > tbody:nth-child(1) tr:not(:first-child)

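There are multiple tables in the page, but the first 90% of the selector doesn't even matter. Once it has selected the table I want, I follow it with a "[space]tr:not(...)", so it should select all of the rows below the first one, shouldn't it?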

Sample HTML page (I can't link to the real one, since you need to be logged in to access it): http://pastebin.com/gprXTvzz

Once the selector has successfully selected the table I want (at the "... > tbody:nth-child(1) tr:not(:first-child)" part of the selector), the markup looks like this:

<tbody>
   <tr valign="bottom">
      <td class="blackmedium" width="80"><b>Part Number</b></td>
      <td class="blackmedium" width="100"><b>Manufacturer</b></td>
      <td class="blackmedium" width="40"><b>Abbr.</b></td>
      <td class="blackmedium" width="50"><b>WIX Part Number</b></td>
      <td class="blackmedium" width="50"><b>Lead Time</b></td>
   </tr>
   <tr>
      <td class="blackmedium" width="80">A0002701098</td>
      <td class="blackmedium" width="100">MERCEDES-BENZ</td>
      <td class="blackmedium" width="40">MBZ</td>
      <td class="blackmedium" width="50"> <a href="http://www.wixindustrialfilters.com/cross.aspx?Part=W03AT780" target="_blank">W03AT780</a>
      </td>
      <td class="blackmedium" width="50">
         STOCK
      </td>
   </tr>
   <tr bgcolor="#e0e0e0">
      <td class="blackmedium" width="80">A0002701598 Discontinued</td>
      <td class="blackmedium" width="100">MERCEDES-BENZ</td>
      <td class="blackmedium" width="40">MBZ</td>
      <td class="blackmedium" width="50"> <a href="javascript:var w=window.open('PartDetail.asp?Part=58892','PartDetail','left=200,top=200,width=530,height=500,toolbar=no,location=no,directories=no,status=no,menubar=no,resizable=yes,scrollbars=yes');w.focus();">58892</a>
      </td>
      <td class="blackmedium" width="50">
      </td>
   </tr>
   <tr>
      <td class="blackmedium" width="80">A0002772395</td>
      <td class="blackmedium" width="100">MERCEDES-BENZ</td>
      <td class="blackmedium" width="40">MBZ</td>
      <td class="blackmedium" width="50"> <a href="javascript:var w=window.open('PartDetail.asp?Part=51249','PartDetail','left=200,top=200,width=530,height=500,toolbar=no,location=no,directories=no,status=no,menubar=no,resizable=yes,scrollbars=yes');w.focus();">51249</a>
      </td>
      <td class="blackmedium" width="50">
      </td>
   </tr>
   <tr bgcolor="#e0e0e0">
      <td class="blackmedium" width="80">A0002772895</td>
      <td class="blackmedium" width="100">MERCEDES-BENZ</td>
      <td class="blackmedium" width="40">MBZ</td>
      <td class="blackmedium" width="50"> <a href="javascript:var w=window.open('PartDetail.asp?Part=57701','PartDetail','left=200,top=200,width=530,height=500,toolbar=no,location=no,directories=no,status=no,menubar=no,resizable=yes,scrollbars=yes');w.focus();">57701</a>
      </td>
      <td class="blackmedium" width="50">
      </td>
   </tr>
</tbody>

1 answer:

Answer 0: (score: 1)

  


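This doesn't exactly answer your question, but when the markup is not parsing-friendly and I need to locate a table element nested deep inside terrible markup, I prefer to find it by the presence of a specific header inside it. In this case, you can locate the table that has the Part Number header. Sample XPath: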

//table[tr[1]/td/b = "Part Number"]
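For illustration, here is a minimal Python sketch of the same "find the table by its header" idea, using only the standard library. `xml.etree.ElementTree` supports just a subset of XPath, so the predicate is checked in plain Python instead. The sample markup and the function name `find_table_by_header` are made up for this example, and a real page would most likely need a tolerant HTML parser rather than an XML one. Like the XPath above, the sketch only looks at tr elements that are direct children of table; if the parser in use adds tbody wrappers, as in the markup shown in the question, the lookup would have to go through tbody.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the real page: a layout table
# wrapping the data table we actually care about.
SAMPLE = """
<html><body>
  <table><tr><td>
    <table>
      <tr><td><b>Part Number</b></td><td><b>Manufacturer</b></td></tr>
      <tr><td>A0002701098</td><td>MERCEDES-BENZ</td></tr>
      <tr><td>A0002701598 Discontinued</td><td>MERCEDES-BENZ</td></tr>
    </table>
  </td></tr></table>
</body></html>
"""


def find_table_by_header(root, header):
    """Return the first <table> whose first row has a <td><b> cell equal to header."""
    for table in root.iter("table"):
        first_row = table.find("tr")  # first direct <tr> child, like tr[1]
        if first_row is None:
            continue
        # Text of every <b> that sits directly inside a <td> of the first row.
        cells = [(b.text or "").strip() for b in first_row.findall("td/b")]
        if header in cells:
            return table
    return None


root = ET.fromstring(SAMPLE)
table = find_table_by_header(root, "Part Number")
print(len(table.findall("tr")))  # number of rows in the matched table
```

The outer layout table is skipped because its first row has no header cell of its own, so the function returns the nested data table regardless of how deep it sits.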

Then, on this table, you can use the "not first child" CSS selector:

tr:not(:first-child)

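Alternatively, you can use the adjacent sibling selector (it finds every tr element that comes immediately after another tr element, which logically excludes the first row):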

tr + tr
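To make the difference between the two selectors concrete, the following standard-library Python sketch mimics each of them over the direct children of one parent element. The helper names `not_first_child` and `adjacent_tr` and the trimmed-down tbody are assumptions made for this example; they are rough analogues rather than a CSS engine, and unlike the descendant form used in the question they do not look at deeper nesting levels.

```python
import xml.etree.ElementTree as ET

# Trimmed-down, hypothetical version of the tbody shown in the question.
TBODY = """
<tbody>
  <tr><td><b>Part Number</b></td></tr>
  <tr><td>A0002701098</td></tr>
  <tr><td>A0002701598 Discontinued</td></tr>
  <tr><td>A0002772395</td></tr>
</tbody>
"""


def not_first_child(parent):
    """Rough analogue of `tr:not(:first-child)` over parent's direct children."""
    children = list(parent)
    return [el for i, el in enumerate(children) if el.tag == "tr" and i > 0]


def adjacent_tr(parent):
    """Rough analogue of `tr + tr`: a <tr> directly preceded by another <tr>."""
    children = list(parent)
    return [cur for prev, cur in zip(children, children[1:])
            if prev.tag == "tr" and cur.tag == "tr"]


tbody = ET.fromstring(TBODY)
part_numbers = [row.find("td").text for row in not_first_child(tbody)]
print(part_numbers)
```

When every child of the parent is a tr, as here, both helpers return the same rows. They would differ only if some other element sat between or before the rows.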

Hope this simplifies things.