Parsing a table column with BeautifulSoup

Date: 2017-08-11 16:47:08

Tags: python parsing beautifulsoup

I am parsing a table with code that follows this pattern:

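from bs4 import BeautifulSoup

# Parse the saved page and select the fourth table in the document
with open("out.html") as f:
    soup = BeautifulSoup(f, 'html.parser')
tab = soup.find_all('table')[3]
rows = tab.find_all('tr')

for sing_row in rows:
    # The second cell of each row holds the IP address
    col = sing_row.find_all('td')[1]
    print(col)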

The printed result is:

<td class="col-md-3">5.67.43.158<br/><span style="font-size: 0.9em; color: #eee;"></span></td>
<td class="col-md-3">32.54.44.155<br/><span style="font-size: 0.9em; color: #eee;">ns2.asdf.it</span></td>
<td class="col-md-3">53.64.21.154<br/><span style="font-size: 0.9em; color: #eee;">server1.adb.it</span></td>
<td class="col-md-3">23.62.53.22<br/><span style="font-size: 0.9em; color: #eee;">server1.xcvf.it</span></td> 

My goal is to get only the IP address from the table column, without the domain name inside the span. How can I do that?

1 Answer:

Answer 0: (score: 0)

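You can use tag.contents, which returns a list of the tag's direct children. For these cells the first item in that list is the IP address text: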

for sing_row in rows:
    col = sing_row.find_all('td')[1]
    # contents is [text, <br/>, <span>]; the first item is the IP address
    print(col.contents[0])
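
As a minimal sketch of what contents holds for cells shaped like the ones in the question: the list contains the text node, the `<br/>` tag, and the `<span>`, so the IP address is the first element. The inline `html` string and the `ips` list are assumptions made for illustration and stand in for out.html, which is not available here.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for out.html: two rows shaped like the question's output.
html = """
<table>
  <tr><td>1</td><td class="col-md-3">5.67.43.158<br/><span style="font-size: 0.9em; color: #eee;"></span></td></tr>
  <tr><td>2</td><td class="col-md-3">32.54.44.155<br/><span style="font-size: 0.9em; color: #eee;">ns2.asdf.it</span></td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
ips = []
for row in soup.find_all("tr"):
    col = row.find_all("td")[1]
    # col.contents is [text node, <br/>, <span>]; the first child is the IP.
    ips.append(str(col.contents[0]).strip())

print(ips)
```

Indexing with `[0]` relies on the IP text always coming first in the cell, which holds for the rows shown in the question but may not hold for every table.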

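You can also use tag.find(text=True), which returns the first text node inside the tag (in Beautiful Soup 4.4.0 and later the same argument is also available as string=True):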

for sing_row in rows:
    col = sing_row.find_all('td')[1]
    # The first text node in the cell is the IP address
    print(col.find(text=True))
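
Putting this together, here is a hedged sketch of a small helper that collects the IP addresses and skips rows that have fewer than two `<td>` cells (for example a header row built from `<th>`), where indexing with `[1]` would otherwise raise an IndexError. The function name `extract_ips`, the inline `html` sample, and the header row are assumptions for illustration, not part of the original page. It uses `string=True`, which assumes Beautiful Soup 4.4.0 or later.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: a header row plus two data rows like those in the question.
html = """
<table>
  <tr><th>#</th><th>Address</th></tr>
  <tr><td>1</td><td class="col-md-3">53.64.21.154<br/><span>server1.adb.it</span></td></tr>
  <tr><td>2</td><td class="col-md-3">23.62.53.22<br/><span>server1.xcvf.it</span></td></tr>
</table>
"""


def extract_ips(markup):
    """Return the first text node of the second <td> in every table row."""
    soup = BeautifulSoup(markup, "html.parser")
    ips = []
    for row in soup.find_all("tr"):
        cells = row.find_all("td")
        if len(cells) < 2:
            continue  # header row or short row: no second <td> to read
        first_text = cells[1].find(string=True)
        if first_text is not None:
            ips.append(first_text.strip())
    return ips


print(extract_ips(html))
```

One caveat with this approach: find searches all descendants, so if a cell had no text before the `<br/>`, the first text node found would be the domain inside the span instead of an IP address.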