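Hello, I am working on a project for my school that involves scraping HTML.
However, when I search for the table, I get None back. Below is the segment of code that has the problem.
If you need more information, I am happy to provide it.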
from bs4 import BeautifulSoup
import urllib2
import datetime
#This section determines the date of the next Saturday which will go onto the end of the URL
d = datetime.date.today()
while d.weekday() != 5:
    d += datetime.timedelta(1)
#temporary logic for testing when next webpage isn't out
d = "2013-06-01"
#Section that scrapes the data off the webpage
url = "http://www.sydgram.nsw.edu.au/co-curricular/sport/fixtures/" + str(d) + ".php"
page = urllib2.urlopen(url)
soup = BeautifulSoup(page)
print soup
#Section that grabs the table with stuff in it
table = soup.find('table', {"class": "excel1"})
print table
Answer 0 (score: 0)
BeautifulSoup expects an HTML string. You are passing it a response object.
Get the HTML from the response:
html = page.read()
Then pass html to BeautifulSoup, or pass page.read() to it directly.
Also, I suggest reading the following two links:
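A minimal sketch of that suggestion follows. To keep it runnable without network access, it assumes an in-memory `io.BytesIO` object as a stand-in for the response returned by `urlopen`, and `SAMPLE_HTML` is made-up markup, not the real fixtures page. Naming the parser explicitly ("html.parser") is also a choice made here, not something the question does.

```python
import io

from bs4 import BeautifulSoup

# Hypothetical markup standing in for the fixtures page (an assumption).
SAMPLE_HTML = b"""
<html><body>
  <table class="excel1">
    <tr><td>Team A</td><td>Team B</td></tr>
  </table>
</body></html>
"""

# Stand-in for page = urllib2.urlopen(url): a file-like object with read().
page = io.BytesIO(SAMPLE_HTML)

# Read the response body first, then hand the markup to BeautifulSoup.
html = page.read()
soup = BeautifulSoup(html, "html.parser")

# Look the table up by its class attribute, as in the question.
table = soup.find("table", {"class": "excel1"})

# Collect the text of each cell in the matched table.
cells = [td.get_text(strip=True) for td in table.find_all("td")]
print(cells)
```

If `table` is still None on the real page, printing `soup.find_all("table")` is one way to see which tables and class names the downloaded markup actually contains.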