Here is my code:
import requests
from bs4 import BeautifulSoup
url = 'http://someurl'
r = requests.get(url)
soup = BeautifulSoup(r.text, "lxml")
If I extract the table from that URL:
for loc in soup.find('table', attrs={'class':'class1'}):
    print(loc.text)
This results in:
(1:1:1) text that having bold attribute, example BEE and ANT
(1:1:2) other text is a CAT
(1:1:3) here a DOG
(1:1:4) else is a TIGER
The (1:1:1), etc. each sit in a span with the class addrs, so if I run this code:
tabel = soup.find('table', attrs={'class':'class1'})
for lok in tabel.findAll('span', attrs={'class':'addrs'}):
    print(lok.text)
This results in:
(1:1:1)
(1:1:2)
(1:1:3)
(1:1:4)
And this code:
for loki in tabel.findAll('b'):
    print(loki.text)
This results in: BEE ANT CAT DOG TIGER
The output I want is:
(1:1:1) BEE - ANT
(1:1:2) CAT
(1:1:3) DOG
(1:1:4) TIGER
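For reference, here is a minimal sketch of one way that output might be produced. The real page's markup is not shown above, so this assumes each entry is its own <tr> containing one span.addrs and one or more <b> tags. SAMPLE_HTML and label_bold_lines are hypothetical names made up for this illustration, and html.parser is swapped in for lxml only to keep the sketch dependency-free.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real page's structure is unknown, so this assumes
# one <tr> per entry, holding a span.addrs label and one or more <b> words.
SAMPLE_HTML = """
<table class="class1">
  <tr><td><span class="addrs">(1:1:1)</span> text that having bold attribute, example <b>BEE</b> and <b>ANT</b></td></tr>
  <tr><td><span class="addrs">(1:1:2)</span> other text is a <b>CAT</b></td></tr>
  <tr><td><span class="addrs">(1:1:3)</span> here a <b>DOG</b></td></tr>
  <tr><td><span class="addrs">(1:1:4)</span> else is a <b>TIGER</b></td></tr>
</table>
"""

def label_bold_lines(html):
    """Return one 'label bold1 - bold2' string per table row (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", attrs={"class": "class1"})
    lines = []
    for row in table.find_all("tr"):
        label = row.find("span", attrs={"class": "addrs"})
        if label is None:
            continue  # skip rows without an address label
        # Collect the bold words that belong to this row only
        bold = [b.get_text(strip=True) for b in row.find_all("b")]
        lines.append("{} {}".format(label.get_text(strip=True), " - ".join(bold)))
    return lines

for line in label_bold_lines(SAMPLE_HTML):
    print(line)
```

The idea is to walk the table row by row, so each label is paired only with the bold words from its own row rather than with every <b> in the table. If the real rows are not <tr> elements, the loop would need to iterate over whatever element wraps each entry instead.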