Question

Here is my code:

  <tr>
      <td height="34" class="normal">4893</td>
      <td class="normal">Public Utilities Commission </td>
      <td class="normal">Investigation to Examine </td>
   </tr>
   <tr>
      <td height="34" rowspan="2" class="normal"><a 
             href="docket/4892page.html">4892</a></td>
      <td class="normal"><p>RI Distribution Genration 
            Boardd</p></td>
      <td class="normal">2019 Renewable Energy </td>
    </tr>
    <tr>
      <td class="normal">The Narragansett Ele</td>
      <td class="normal">2018 Renewable Energy </td>
    </tr>
    <tr>
      <td height="34" class="normal"><a 
           href="docket/4891page.html">4891</a></td>
      <td class="normal">Kearsarge Uxbridge, LLC </td>
      <td class="normal">Renewable Energy</td>
    </tr>

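In the second <tr>, the first <td> has rowspan="2". I want the content of that first <td> (i.e. 4892) to also apply to the next <tr>, which has only two <td> elements. I tried the following, but it does not work: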

    item['id'] = row.xpath('.//tr//td[1]//text()').extract()

    if not item['id']:
        item['id'] = row.xpath('.//[preceding-sibling::tr//td[1]//text()').extract()

Answer 1

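So you are not really "selecting a rowspan", but rather "selecting by rowspan".

There are a few things you can try.

To select the cell whenever a rowspan attribute is present: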

    # CSS
    row.css('tr td[rowspan]::text')
    # XPath
    row.xpath('//tr/td[@rowspan]/text()')
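One caveat with the sample HTML in the question: the 4892 cell wraps its number in an <a> element, so a selector that takes only the cell's direct text nodes (which is what /text() and ::text do) would come back empty for it. Below is a minimal sketch of the difference. It uses the standard library's xml.etree.ElementTree as a stand-in so that it runs without Scrapy, and the SAMPLE string is a trimmed copy of the question's table, so treat it as an illustration rather than spider code.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the rowspan row from the question (assumed well-formed).
SAMPLE = """<table>
<tr>
  <td height="34" rowspan="2" class="normal"><a href="docket/4892page.html">4892</a></td>
  <td class="normal">2019 Renewable Energy </td>
</tr>
</table>"""

table = ET.fromstring(SAMPLE)
cell = table.find(".//td[@rowspan]")

# Direct text only: roughly what td[@rowspan]/text() would give.
direct_text = (cell.text or "").strip()

# All descendant text: roughly what td[@rowspan]//text() would give.
all_text = "".join(cell.itertext()).strip()

print(repr(direct_text))
print(repr(all_text))
```

In a Scrapy callback the corresponding change would be to use row.xpath('.//td[@rowspan]//text()') so that the text inside the link is included.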

To select the cell only when rowspan has a specific value (here "2"):

    # CSS (the value is quoted because 2 is not a valid CSS identifier)
    row.css('tr td[rowspan="2"]::text')
    # XPath
    row.xpath('//tr/td[@rowspan="2"]/text()')
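The selectors above find the cell that carries the rowspan, but the question also asks for its value to be reused on the following row. One way to do that is to walk the rows in order and remember how many more rows the last id cell still covers. The sketch below assumes that only the first (id) column is ever spanned. The helper names cell_text and rows_with_ids are made up for illustration, and xml.etree.ElementTree again stands in for Scrapy's selectors so the example runs on its own.

```python
import xml.etree.ElementTree as ET

# The table from the question, wrapped in <table> so it parses as XML.
SAMPLE = """<table>
  <tr>
    <td height="34" class="normal">4893</td>
    <td class="normal">Public Utilities Commission </td>
    <td class="normal">Investigation to Examine </td>
  </tr>
  <tr>
    <td height="34" rowspan="2" class="normal"><a href="docket/4892page.html">4892</a></td>
    <td class="normal"><p>RI Distribution Genration Boardd</p></td>
    <td class="normal">2019 Renewable Energy </td>
  </tr>
  <tr>
    <td class="normal">The Narragansett Ele</td>
    <td class="normal">2018 Renewable Energy </td>
  </tr>
  <tr>
    <td height="34" class="normal"><a href="docket/4891page.html">4891</a></td>
    <td class="normal">Kearsarge Uxbridge, LLC </td>
    <td class="normal">Renewable Energy</td>
  </tr>
</table>"""


def cell_text(td):
    """Return all text inside a cell, with whitespace collapsed."""
    return " ".join("".join(td.itertext()).split())


def rows_with_ids(table):
    """Yield one dict per row, reusing a spanned id on the rows it covers."""
    items = []
    carried_id = None   # id taken from the most recent id cell
    rows_left = 0       # following rows still covered by that cell
    for tr in table.iter("tr"):
        tds = tr.findall("td")
        if not tds:
            continue
        cells = [cell_text(td) for td in tds]
        if rows_left > 0:
            # The id cell is missing here because the earlier one spans this row.
            row_id = carried_id
            rows_left -= 1
        else:
            row_id = cells.pop(0)
            span = int(tds[0].get("rowspan", "1"))
            carried_id, rows_left = row_id, span - 1
        items.append({"id": row_id, "fields": cells})
    return items


for item in rows_with_ids(ET.fromstring(SAMPLE)):
    print(item["id"], item["fields"])
```

In a spider the same loop could be written over response.xpath('//tr'), reading the span with td.xpath('@rowspan').get() and the text with td.xpath('.//text()').getall(); the bookkeeping with carried_id and rows_left stays the same.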

See also:

https://www.w3schools.com/cssref/css_selectors.asp

https://www.w3schools.com/xml/xpath_syntax.asp

Scrapy: how to select a rowspan
