Python and BeautifulSoup: find_all ordering

Date: 2017-11-21 01:31:32

Tags: python beautifulsoup

I am trying to scrape a website. Here is the HTML I am working with:

<tr class="order-by-pos" data-pos="1">
   <td class="normal-td td-center td-pos">
      1st
      <div class="race-pos-no race-pos-no-2">2</div>
   </td>
</tr>

<tr class="order-by-pos" data-pos="2">
   <td class="normal-td td-center td-pos">
      2nd
      <div class="race-pos-no race-pos-no-1">1</div>
   </td>
</tr>

My code is:

registros = html.find_all(class_="order-by-pos")

for entrada in registros:

    saludos = entrada.find(class_="normal-td td-runner").get_text()
    trainer = entrada.find(class_="normal-td font-12 td-trainer").get_text()

    print (saludos,trainer)

The output appears to be ordered by "race-pos-no-1" instead of by "order-by-pos", which is the first row.

1 answer:

Answer 0 (score: 0)

If you are trying to get the two pieces of text from inside each row, you can use the following approach:

from bs4 import BeautifulSoup

html = """<tr class="order-by-pos" data-pos="1">
<td class="normal-td td-center td-pos">
                                        1st                                            
      <div class="race-pos-no race-pos-no-2">2</div>
 </td>
</tr>

<tr class="order-by-pos" data-pos="2">
   <td class="normal-td td-center td-pos">
                                            2nd                                           
      <div class="race-pos-no race-pos-no-1">1</div>
   </td>
</tr>"""

soup = BeautifulSoup(html, "html.parser")
registros = soup.find_all(class_="order-by-pos")

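for entrada in registros:
    # The nested <div> holds the number (e.g. "2")
    trainer = entrada.td.div.text
    # Remove the <div> from the tree so only the ordinal text is left in the <td>
    entrada.td.div.extract()
    saludos = entrada.td.get_text(strip=True)
    print(saludos, trainer)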

This gives you:

1st 2
2nd 1
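
On the ordering itself: find_all returns matches in document order, so the rows come back in the order they appear in the page, not sorted by the number inside the race-pos-no div. If you would rather not modify the tree with extract(), and want the rows ordered by that number instead, something like the following might work. This is only a sketch: the helper name parse_row, the wrapping table element, and the assumption that the div always contains an integer are mine and not taken from the question.

```python
from bs4 import BeautifulSoup

html = """<table>
<tr class="order-by-pos" data-pos="1">
   <td class="normal-td td-center td-pos">
      1st
      <div class="race-pos-no race-pos-no-2">2</div>
   </td>
</tr>
<tr class="order-by-pos" data-pos="2">
   <td class="normal-td td-center td-pos">
      2nd
      <div class="race-pos-no race-pos-no-1">1</div>
   </td>
</tr>
</table>"""

soup = BeautifulSoup(html, "html.parser")


def parse_row(row):
    """Hypothetical helper: return (position_text, number_text) for one row."""
    td = row.find("td", class_="td-pos")
    number = td.find("div", class_="race-pos-no").get_text(strip=True)
    # Join only the text nodes that are direct children of the <td>,
    # which skips the text inside the nested <div> without removing it.
    position = "".join(td.find_all(string=True, recursive=False)).strip()
    return position, number


# find_all yields the rows in document order
rows = [parse_row(r) for r in soup.find_all("tr", class_="order-by-pos")]

# Re-order by the number in the <div> (assumes it is always an integer)
by_number = sorted(rows, key=lambda r: int(r[1]))

print(rows)
print(by_number)
```

Here rows keeps the page order (1st, then 2nd), while by_number puts the row whose div contains 1 first.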