Question

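Hello, I am working on a project for my school that involves scraping HTML.

However, when I search for the table, nothing is returned. Below is the segment of code that has this problem.

If you need more information, I am happy to provide it.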

from bs4 import BeautifulSoup
import urllib2
import datetime

#This section determines the date of the next Saturday, which will go onto the end of the URL
d = datetime.date.today() 
while d.weekday() != 5:
    d += datetime.timedelta(1)

#temporary logic for testing when next webpage isn't out
d = "2013-06-01"

#Section that scrapes the data off the webpage
url = "http://www.sydgram.nsw.edu.au/co-curricular/sport/fixtures/" + str(d) + ".php"
page = urllib2.urlopen(url)
soup = BeautifulSoup(page)
print soup
#Section that grabs the table with stuff in it
table = soup.find('table', {"class": "excel1"})
print table

Answer 1

BeautifulSoup is expecting a string of HTML. What you have provided is a response object.

Get the HTML from the response:

 html = page.read()

Then pass html to BeautifulSoup, or pass the result of page.read() to it directly.
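
As a rough sketch of the steps above, the example below reads the HTML out of a response-like object and then searches for the table. It makes several assumptions that are not in the question: it is written for Python 3 rather than Python 2, it uses an in-memory io.BytesIO object with made-up sample HTML as a stand-in for the object returned by urlopen() so that it runs without network access, it names the parser ("html.parser") explicitly, and find_fixture_table is a hypothetical helper name chosen for illustration.

```python
import io

from bs4 import BeautifulSoup

# Invented sample markup; the real page's structure may differ.
SAMPLE_HTML = b"""
<html><body>
<table class="excel1">
  <tr><td>Team</td><td>Venue</td></tr>
  <tr><td>1st XV</td><td>Home</td></tr>
</table>
</body></html>
"""


def find_fixture_table(response, css_class="excel1"):
    """Read the HTML from a response-like object and return the first
    matching <table>, or None if no table has that class."""
    html = response.read()  # get the raw HTML out of the response
    soup = BeautifulSoup(html, "html.parser")
    return soup.find("table", {"class": css_class})


# io.BytesIO stands in for the urlopen() response (assumption for this sketch).
response = io.BytesIO(SAMPLE_HTML)
table = find_fixture_table(response)

if table is None:
    print("No table with that class was found")
else:
    # Collect the cell text of each row.
    rows = [[td.get_text() for td in tr.find_all("td")]
            for tr in table.find_all("tr")]
    print(rows)
```

Checking the result for None before using it helps separate two cases: the page was parsed but contains no table with that class, or the page content was never read correctly in the first place.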

Also, I suggest reading the following two links:

urllib2 documentation

BeautifulSoup documentation
