Question

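I'm very new to Python 2.7, and my task is to read a table from a URL.

I can already fetch the table from the URL. The problem is that I only need the data, but I'm getting the tags as well. Please help. Thanks in advance.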

from bs4 import BeautifulSoup
import urllib2

response = urllib2.urlopen('https://www.somewebsite.com/')
html = response.read()
soup = BeautifulSoup(html)

tabulka = soup.find("table", {"class": "defaultTableStyle tableFontMD tableNoBorder"})

records = []
for row in tabulka.findAll('tr'):
    col = row.findAll('td')
    print col

Answer 1

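You have to use the .text attribute of each cell. Printing the list of <td> elements shows their markup, while .text returns only the text inside them: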

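from bs4 import BeautifulSoup
import urllib2

response = urllib2.urlopen('https://www.somewebsite.com/')
html = response.read()
# Name the parser explicitly so bs4 does not warn or choose a different one per machine
soup = BeautifulSoup(html, "html.parser")

tabulka = soup.find("table", {"class": "defaultTableStyle tableFontMD tableNoBorder"})

records = []
for row in tabulka.findAll('tr'):
    col = row.findAll('td')
    # .text gives the text content of each <td>, without the surrounding tags
    print [coli.text for coli in col]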
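
If you want to try this without hitting the real site, here is a minimal sketch that runs the same extraction on an inline HTML string. The sample markup, the SAMPLE_HTML name and the extract_rows helper are made up for illustration; only the table class comes from the question. It uses get_text(strip=True), which behaves like .text but also trims surrounding whitespace, and it skips rows that contain no <td> cells (for example a header row made of <th>).

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
SAMPLE_HTML = """
<table class="defaultTableStyle tableFontMD tableNoBorder">
  <tr><th>Name</th><th>Price</th></tr>
  <tr><td> Apple </td><td><b>1.20</b></td></tr>
  <tr><td>Pear</td><td>0.80</td></tr>
</table>
"""


def extract_rows(html):
    """Return the text of every <td> cell, one list per table row."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"class": "defaultTableStyle tableFontMD tableNoBorder"})
    if table is None:
        # The table was not found, so there is nothing to extract.
        return []
    rows = []
    for tr in table.find_all("tr"):
        # get_text(strip=True) drops the tags and trims whitespace.
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # skip rows without <td> cells, such as the <th> header row
            rows.append(cells)
    return rows


rows = extract_rows(SAMPLE_HTML)
print(rows)
```

Checking for None before looping is worth keeping in real code too: if the class string does not match the page exactly, soup.find returns None and the loop over tabulka.findAll would raise an AttributeError.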
