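I have tried a number of different code combinations to get the data out of a table. Simply using soup.table
does not return this particular table from the page, and I cannot figure out why.
I have managed to find it by class, using class_='table assessment-item',
but when I try to parse individual rows or cells, it throws an error.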
import requests
from bs4 import BeautifulSoup
page = requests.get("https://www.qut.edu.au/study/unit?unitCode=IFB104")
soup = BeautifulSoup(page.content, 'html.parser')
table = soup.find_all(class_='table assessment-item')
table_data = table.find_all('td')
Error:
Traceback (most recent call last):
File "/Users/study/Desktop/QUT Final/demo.py", line 7, in <module>
table_data = table.find_all('td')
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/bs4/element.py", line 1807, in __getattr__
"ResultSet object has no attribute '%s'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?" % key
Answer 0 (score: 4)
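Three tables with the class 'table assessment-item' are returned, so find_all() gives you a ResultSet (a list of tags) rather than a single table, and a ResultSet has no find_all() of its own.
You just need to iterate over them: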
table = soup.find_all(class_='table assessment-item')
table_data = [tbl.find_all('td') for tbl in table]
print(table_data)
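To make the iteration concrete without hitting the network, here is a minimal sketch that runs against a small inline HTML string. The markup in SAMPLE_HTML is a hypothetical stand-in, not the real page, and the names first_table, tables and cells_per_table are only illustrative. It also shows one likely reason soup.table did not help: it is shorthand for soup.find('table'), so it only ever returns the first <table> in the document.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page: one unrelated table followed by
# two tables carrying the classes from the question.
SAMPLE_HTML = """
<table class="nav"><tr><td>menu</td></tr></table>
<table class="table assessment-item">
  <tr><td>Assessment 1</td><td>20%</td></tr>
</table>
<table class="table assessment-item">
  <tr><td>Assessment 2</td><td>80%</td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# soup.table is shorthand for soup.find("table"): only the first <table>.
first_table = soup.table

# find_all() returns a ResultSet (a list of Tag objects), so iterate over it.
tables = soup.find_all(class_="table assessment-item")

# Build one list of cell strings per matching table.
cells_per_table = [
    [td.get_text(strip=True) for td in tbl.find_all("td")]
    for tbl in tables
]

print(first_table["class"])
print(cells_per_table)
```

If you only want one of the matching tables, index the result (for example tables[0]) or call soup.find(class_='table assessment-item'), which returns a single Tag that does support find_all('td').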