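I'm parsing an HTML page and have a very long CSS selector (I couldn't find a shorter one, because the page is dumb). It should select all the tr elements in the table, but it only selects the second row... What am I missing?
Selector: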
body > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(2) > td:nth-child(1) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(3) > td:nth-child(1) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(8) > td:nth-child(1) > table:nth-child(4) > tbody:nth-child(1) > tr:nth-child(2) > td:nth-child(1) > table:nth-child(1) > tbody:nth-child(1) tr:not(:first-child)
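There are multiple tables in the page, but the first 90% of the selector doesn't even matter. Once it has picked the table I want, I follow up with "[space]tr:not(...)
" (a descendant combinator), so it should select all the rows below the first one, shouldn't it?
Sample HTML page (I can't link to the real one, since you need to be logged in to access it): http://pastebin.com/gprXTvzz
Once the selector has successfully picked the table I want (at ...> tbody:nth-child(1) tr:not(:first-child)
in the selector), the markup looks like this: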
<tbody>
<tr valign="bottom">
<td class="blackmedium" width="80"><b>Part Number</b></td>
<td class="blackmedium" width="100"><b>Manufacturer</b></td>
<td class="blackmedium" width="40"><b>Abbr.</b></td>
<td class="blackmedium" width="50"><b>WIX Part Number</b></td>
<td class="blackmedium" width="50"><b>Lead Time</b></td>
</tr>
<tr>
<td class="blackmedium" width="80">A0002701098</td>
<td class="blackmedium" width="100">MERCEDES-BENZ</td>
<td class="blackmedium" width="40">MBZ</td>
<td class="blackmedium" width="50"> <a href="http://www.wixindustrialfilters.com/cross.aspx?Part=W03AT780" target="_blank">W03AT780</a>
</td>
<td class="blackmedium" width="50">
STOCK
</td>
</tr>
<tr bgcolor="#e0e0e0">
<td class="blackmedium" width="80">A0002701598 Discontinued</td>
<td class="blackmedium" width="100">MERCEDES-BENZ</td>
<td class="blackmedium" width="40">MBZ</td>
<td class="blackmedium" width="50"> <a href="javascript:var w=window.open('PartDetail.asp?Part=58892','PartDetail','left=200,top=200,width=530,height=500,toolbar=no,location=no,directories=no,status=no,menubar=no,resizable=yes,scrollbars=yes');w.focus();">58892</a>
</td>
<td class="blackmedium" width="50">
</td>
</tr>
<tr>
<td class="blackmedium" width="80">A0002772395</td>
<td class="blackmedium" width="100">MERCEDES-BENZ</td>
<td class="blackmedium" width="40">MBZ</td>
<td class="blackmedium" width="50"> <a href="javascript:var w=window.open('PartDetail.asp?Part=51249','PartDetail','left=200,top=200,width=530,height=500,toolbar=no,location=no,directories=no,status=no,menubar=no,resizable=yes,scrollbars=yes');w.focus();">51249</a>
</td>
<td class="blackmedium" width="50">
</td>
</tr>
<tr bgcolor="#e0e0e0">
<td class="blackmedium" width="80">A0002772895</td>
<td class="blackmedium" width="100">MERCEDES-BENZ</td>
<td class="blackmedium" width="40">MBZ</td>
<td class="blackmedium" width="50"> <a href="javascript:var w=window.open('PartDetail.asp?Part=57701','PartDetail','left=200,top=200,width=530,height=500,toolbar=no,location=no,directories=no,status=no,menubar=no,resizable=yes,scrollbars=yes');w.focus();">57701</a>
</td>
<td class="blackmedium" width="50">
</td>
</tr>
</tbody>
Answer 0 (score: 1)
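This doesn't exactly answer your question, but when the markup isn't parse-friendly and I need to find a table
element nested deep inside terrible markup, I prefer to locate it by the presence of a specific header inside it. In this case, you can find the table that has the Part Number
header. Sample XPath: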
//table[tr[1]/td/b = "Part Number"]
Then, on that table, you can use the "not first child" CSS selector:
tr:not(:first-child)
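As a rough illustration of the "find the table by its header, then skip the first row" idea, here is a minimal sketch using only Python's standard library. The SAMPLE markup and the find_rows_by_header helper are hypothetical and made up for this example. ElementTree only handles well-formed XML and a small XPath subset, so the header check is done by hand instead of with the XPath above. A real page like this one would likely need a forgiving HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for the real page.
SAMPLE = """
<html><body>
<table><tr><td>layout junk</td></tr></table>
<table>
  <tr><td><b>Part Number</b></td><td><b>Manufacturer</b></td></tr>
  <tr><td>A0002701098</td><td>MERCEDES-BENZ</td></tr>
  <tr><td>A0002772395</td><td>MERCEDES-BENZ</td></tr>
</table>
</body></html>
"""


def find_rows_by_header(root, header_text):
    """Return the rows of the first table whose first row has a <td><b> equal to header_text."""
    for table in root.iter("table"):
        # Rows may sit directly under <table> or inside a <tbody>.
        container = table.find("tbody")
        if container is None:
            container = table
        rows = container.findall("tr")
        if not rows:
            continue
        headers = [b.text.strip() for b in rows[0].findall("td/b") if b.text]
        if header_text in headers:
            return rows
    return []


root = ET.fromstring(SAMPLE.strip())
rows = find_rows_by_header(root, "Part Number")
# Equivalent of tr:not(:first-child): drop the header row.
part_numbers = [row.find("td").text for row in rows[1:]]
print(part_numbers)
```

Matching on the header text keeps the lookup independent of how deeply the table is nested, which is the point of avoiding the long positional selector.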
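Alternatively, you can use the adjacent sibling selector to find tr
elements that directly follow another tr
element, which logically excludes the first row: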
tr + tr
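To see why the two selectors pick the same rows here, the sketch below emulates both by hand on a tiny, made-up tbody. The not_first_child and adjacent_sibling helpers are hypothetical names for this illustration, not part of any CSS library.

```python
import xml.etree.ElementTree as ET

# Hypothetical tbody with a header row followed by two data rows.
TBODY = ET.fromstring(
    "<tbody>"
    "<tr><td>Part Number</td></tr>"
    "<tr><td>A0002701098</td></tr>"
    "<tr><td>A0002772395</td></tr>"
    "</tbody>"
)


def not_first_child(parent, tag):
    """Emulate `tag:not(:first-child)`: matching children that are not the first child."""
    return [el for i, el in enumerate(parent) if el.tag == tag and i > 0]


def adjacent_sibling(parent, tag):
    """Emulate `tag + tag`: matching children immediately preceded by a sibling with that tag."""
    children = list(parent)
    return [cur for prev, cur in zip(children, children[1:])
            if prev.tag == tag and cur.tag == tag]


via_not = [tr.find("td").text for tr in not_first_child(TBODY, "tr")]
via_plus = [tr.find("td").text for tr in adjacent_sibling(TBODY, "tr")]
print(via_not == via_plus, via_not)
```

The two only diverge when the parent has children other than tr elements, which is not the case for the table in the question.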
Hope this simplifies things.