Question

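I'm trying to scrape a website with Beautiful Soup. I can navigate to the class object, but I can't get down to the next level to reach the text I want.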

So far I have:

soup = BeautifulSoup(urllib2.urlopen('URL...').read())

comment = soup('div', {'class' : 'PanelDarkBackground'})
print comment

This only outputs the whole div with that class (shown below). I want to extract the 0-0, which is in the tr > td id="event" part of the markup.

Any suggestions?

[<div class="PanelDarkBackground" id="Event-Basic-Info" style="margin-bottom: 10px">
<div style="height: 70px; width: 100%;">
<div style="height: 70px; width: 70px; float: left; background-color: white">
<img height="70" src="ss" width="70"/>
</div>
<div style="width: 450px; float: left; height: 70px; display: table">
<table border="0" cellpadding="0" cellspacing="0" style="font-family: tahoma; font-size:      18pt; font-weight: bold; color: white;" width="450px">

    <tr>
      <td align="center" height="70" style="font-family: tahoma; font-size: 18pt; font-weight:    bold; color: white;" valign="middle" width="197">seveal</td>
      <td align="center" id="event" style="font-family: tahoma; font-size: 18pt; font- weight: bold; color: white;" valign="middle">0-0</td>
      <td align="center" style="font-family: tahoma; font-size: 18pt; font-weight: bold; color: white;" valign="middle" width="197">seveal</td>
    </tr>
 </table>
</div>
<div style="height: 70px; width: 70px; float: right; background-color: white">
<img height="70" src="" width="70"/>
</div>
</div>
</div>]

Answer 1

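Go straight to the td. An id should be unique on the page, so there is no need to go through the enclosing div first.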

print soup('td',{'id':'event'})

To get just the contents of the td, you can do:

print soup('td',{'id':'event'})[0].contents[0]
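
For anyone on a newer setup, here is a minimal sketch of the same lookup, assuming Python 3 and the bs4 package (the snippets above use Python 2 syntax). The HTML is a trimmed copy of the markup from the question, inlined so the example runs without fetching the real page. The variable names are only illustrative. Calling soup(...) directly is shorthand for find_all(), which is why the original needs the [0].

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question (assumption: the real page
# has the same structure around the td with id="event").
html = """
<div class="PanelDarkBackground" id="Event-Basic-Info">
  <table>
    <tr>
      <td align="center">seveal</td>
      <td align="center" id="event">0-0</td>
      <td align="center">seveal</td>
    </tr>
  </table>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Same call as in the answer: soup(...) is find_all(), so it returns a list.
first_child = soup("td", {"id": "event"})[0].contents[0]

# find() returns the first match (or None), so no indexing is needed.
cell = soup.find("td", id="event")
score = cell.get_text(strip=True) if cell is not None else None

# Alternative: start from the div already located in the question, then descend.
panel = soup.find("div", class_="PanelDarkBackground")
nested = panel.find("td", id="event").get_text(strip=True)

print(score)
```

The find() version is a little more defensive because it can be checked against None before reading the text, whereas indexing an empty find_all() result raises an IndexError.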
