Question

I am scraping a website with the following structure:

<tbody>
   <tr class='Leaguestitle'>
      <td>...</td>
      <td>...</td>
   </tr>
   <tr id='tr1_abababa'>
      <td>...</td>
      <td>...</td>
   </tr>
   <tr id='tr2_abababa'>..</tr>
    .
    .
   <tr id='tr1_acacaca'>..</tr>
   <tr id='tr2_acacaca'>..</tr>
   <tr align='center'>..</tr>
    .
    .
   <tr id='tr1_cbcbcbc'>..</tr>
   <tr id='tr2_cbcbcbc'>..</tr>
</tbody>

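What I want is to loop over the row with class Leaguestitle and every tr whose id contains tr1, and to stop the query once I reach the node with align="center". To do that, I tried the following XPath: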

allrows=table.find_elements_by_xpath(
        './/tr[@class="Leaguestitle"] | .//tr[contains(@id,"tr1")] | .//tr[@align="center"]')

My idea for classifying each node was this:

for row in allrows:

    try:
        if 'Leaguestitle' in row.get_attribute('class'):
            pass  # something
    except:
        pass

    try:
        if 'tr1' in row.get_attribute('id'):
            pass  # something else
    except:
        pass

    try:
        if 'center' in row.get_attribute('align'):
            break
    except:
        pass

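The problem is that the nodes I get back do not have the structure

<tr attributes>
  <td>...</td>
  <td>...</td>
</tr>

but are instead all of the child tags directly. To try to fix this, I did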

for row in allrows:
   row=row.find_element_by_xpath('..')

Printing the result now shows the whole parent tag, but I still cannot use my classification code, because get_attribute returns an empty result.

What am I missing?

Answer 1

Try the following:

String space="";
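
For reference, the classification loop described in the question can be sketched without a browser. This is only an illustration, not the asker's Selenium code: it uses the standard-library `xml.etree.ElementTree` as a stand-in for the Selenium elements, and the sample markup, the `classify_rows` helper, and the "title"/"match" labels are all made up for the example. It walks the `tr` rows in document order, reads each attribute with an empty-string default so that no try/except is needed, and stops at the first row with align="center".

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup modeled on the structure in the question.
SAMPLE = """
<tbody>
  <tr class="Leaguestitle"><td>League A</td></tr>
  <tr id="tr1_abababa"><td>match 1</td></tr>
  <tr id="tr2_abababa"><td>details 1</td></tr>
  <tr id="tr1_acacaca"><td>match 2</td></tr>
  <tr id="tr2_acacaca"><td>details 2</td></tr>
  <tr align="center"><td>stop here</td></tr>
  <tr id="tr1_cbcbcbc"><td>match 3</td></tr>
  <tr id="tr2_cbcbcbc"><td>details 3</td></tr>
</tbody>
"""


def classify_rows(tbody):
    """Return (kind, row) pairs in document order, stopping at the
    first row whose align attribute is "center"."""
    result = []
    for row in tbody.iter("tr"):
        # A default of "" means a missing attribute never raises.
        if row.get("align", "") == "center":
            break
        if "Leaguestitle" in row.get("class", ""):
            result.append(("title", row))
        elif "tr1" in row.get("id", ""):
            result.append(("match", row))
    return result


tbody = ET.fromstring(SAMPLE)
rows = classify_rows(tbody)
# Summarize each kept row by its id, or by its class when it has no id.
labels = [(kind, row.get("id") or row.get("class")) for kind, row in rows]
print(labels)
```

With Selenium elements, an expression such as `(row.get_attribute('id') or '')` would probably play the same role as `row.get("id", "")` here, assuming a missing attribute comes back as None or an empty string.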
