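I'm very new to Python 2.7, and my task is to read a table from a URL.
I can fetch the table's data from the URL. The problem is that I only need the data, but I'm getting the tags as well. Please help. Thanks in advance.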
from bs4 import BeautifulSoup
import urllib2
response = urllib2.urlopen('https://www.somewebsite.com/')
html = response.read()
soup = BeautifulSoup(html)
tabulka = soup.find("table", {"class" : "defaultTableStyle tableFontMD tableNoBorder"})
records = []
for row in tabulka.findAll('tr'):
    col = row.findAll('td')
    print col
Answer 0 (score: 3)
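You have to use the .text attribute. findAll returns Tag objects, so printing the list prints the markup; .text gives you only the text inside each cell: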
from bs4 import BeautifulSoup
import urllib2
response = urllib2.urlopen('https://www.somewebsite.com/')
html = response.read()
soup = BeautifulSoup(html)
tabulka = soup.find("table", {"class" : "defaultTableStyle tableFontMD tableNoBorder"})
records = []
for row in tabulka.findAll('tr'):
    col = row.findAll('td')
    print [coli.text for coli in col]
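If you also want to trim whitespace and collect the rows into the records list, something like the following might work. This is only a sketch: the inline HTML is a made-up stand-in for the real page (so it runs without a network call), the sample values are hypothetical, and it uses get_text(strip=True) and find_all, the bs4 spellings of .text and findAll. It passes "html.parser" explicitly so bs4 does not have to guess a parser.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
html = """
<table class="defaultTableStyle tableFontMD tableNoBorder">
  <tr><th>Name</th><th>Price</th></tr>
  <tr><td> Apple </td><td>1.20</td></tr>
  <tr><td>Pear</td><td> 0.80 </td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", {"class": "defaultTableStyle tableFontMD tableNoBorder"})

records = []
for row in table.find_all("tr"):
    cells = row.find_all("td")
    if not cells:  # header rows contain only <th> cells, so skip them
        continue
    # get_text(strip=True) returns the cell text without surrounding whitespace
    records.append([cell.get_text(strip=True) for cell in cells])

print(records)
```

Each entry in records is then a plain list of strings for one table row, with no tags left in it.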