Using BeautifulSoup to get the text from a table with a class, plus the text that is in bold

Time: 2016-12-06 05:12:04

Tags: python parsing

Here is my code:

import requests
from bs4 import BeautifulSoup
url= 'http://someurl'
r = requests.get(url)
soup = BeautifulSoup(r.text,"lxml")

If I extract the table from that URL:

for loc in soup.find('table', attrs={'class':'class1'}):
    print(loc.text)

the result is:

(1:1:1) text that having bold attribute, example BEE and ANT
(1:1:2) other text is a CAT
(1:1:3) here a DOG
(1:1:4) else is a TIGER

The (1:1:1) and similar markers sit in a span with the class addrs, so if I run this code:

tabel = soup.find('table', attrs={'class':'class1'})
for lok in tabel.findAll('span', attrs={'class':'addrs'}):
    print(lok.text)

the result is:

(1:1:1)
(1:1:2)
(1:1:3)
(1:1:4)

And this code:

for loki in tabel.findAll('b'):
    print(loki.text)

results in:

BEE
ANT
CAT
DOG
TIGER

The text I want is:

(1:1:1) BEE - ANT
(1:1:2) CAT
(1:1:3) DOG
(1:1:4) TIGER
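
One possible way to get this output is sketched below. The page's real markup is not shown in the question, so the sample HTML is hypothetical: it assumes each entry lives in its own `tr`, with one `span` of class `addrs` and one or more `b` tags. The helper name `bold_by_address` is also made up for illustration, and `html.parser` is used here only so the sketch runs without `lxml`. If the real table puts several entries in one row, the grouping step would need to change.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real page structure is not shown in the question.
# Assumption: one <tr> per entry, each holding a span.addrs and some <b> tags.
SAMPLE_HTML = """
<table class="class1">
  <tr><td><span class="addrs">(1:1:1)</span> text that having bold attribute, example <b>BEE</b> and <b>ANT</b></td></tr>
  <tr><td><span class="addrs">(1:1:2)</span> other text is a <b>CAT</b></td></tr>
  <tr><td><span class="addrs">(1:1:3)</span> here a <b>DOG</b></td></tr>
  <tr><td><span class="addrs">(1:1:4)</span> else is a <b>TIGER</b></td></tr>
</table>
"""


def bold_by_address(html):
    """Return one 'address bold1 - bold2' string per table row."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", attrs={"class": "class1"})
    lines = []
    for row in table.find_all("tr"):
        addr = row.find("span", attrs={"class": "addrs"})
        if addr is None:
            # Skip rows without an address marker (e.g. header rows).
            continue
        # Collect every bold value in this row, in document order.
        bold = [b.get_text(strip=True) for b in row.find_all("b")]
        lines.append("{} {}".format(addr.get_text(strip=True), " - ".join(bold)))
    return lines


for line in bold_by_address(SAMPLE_HTML):
    print(line)
```

The idea is to search for the `span` and the `b` tags inside each row rather than across the whole table, so that the bold values stay attached to the address they belong to.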

0 answers:

No answers