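I am using BeautifulSoup to extract stock information from the Nasdaq website. I specifically want to retrieve the information in the table rows of the HTML page, but I keep getting an error (line 12, marked with a comment in the code).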
#import html-parser
from bs4 import BeautifulSoup
from requests import get
url = 'https://www.nasdaq.com/symbol/amzn' #AMZN is just an example
response = get(url)
#Create parse tree (BeautifulSoup Object)
soup = BeautifulSoup(response.text, 'html.parser')
data = soup.find_all(class_= 'column span-1-of-2')
table = data.find(class_= 'table-row') #This is where the error occurs
print(table)
Answer 0 (score: 0)
You can do something like the following to get the data from the table rows.
import requests
from bs4 import BeautifulSoup
import re
r = requests.get("https://www.nasdaq.com/")
print(r)
soup = BeautifulSoup(r.content, 'html.parser')
data = soup.find('table',{'id':'indexTable', 'class':'floatL marginB5px'}).script.text
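# Capture the argument list of each storeIndexInfo(...) call; the dot and parentheses are escaped so they match literally
matches = re.findall(r'nasdaqHomeIndexChart\.storeIndexInfo\((.*)\);\r?\n', data)
# Split each argument list into its individual quoted values
table_rows = [re.findall(r'"([^"]*)"', row) for row in matches]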
print(table_rows)
table_rows is a list of lists containing the table data.
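As for the error in the question itself: find_all returns a ResultSet, which behaves like a list of tags rather than a single tag, so calling .find on it typically raises an AttributeError. A minimal sketch of one way around this is to loop over the results and search inside each one. The HTML below is a made-up stand-in that only reuses the class names from the question; the real Nasdaq markup is an assumption here and may differ or be rendered by JavaScript.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only: it reuses the class names
# from the question, with placeholder labels and values.
SAMPLE_HTML = """
<div class="column span-1-of-2">
  <div class="table-row"><span>Label A</span><span>1</span></div>
  <div class="table-row"><span>Label B</span><span>2</span></div>
</div>
<div class="column span-1-of-2">
  <div class="table-row"><span>Label C</span><span>3</span></div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find_all returns a ResultSet (a list of Tag objects), not a single Tag,
# so it has to be iterated before searching further.
columns = soup.find_all(class_="column span-1-of-2")

rows = []
for column in columns:
    # Each column is a Tag, so find_all can be called on it
    for row in column.find_all(class_="table-row"):
        rows.append([cell.get_text(strip=True) for cell in row.find_all("span")])

print(rows)
```

If only the first matching element is needed, soup.find(class_='column span-1-of-2') returns a single Tag (or None when nothing matches), and .find can then be called on that result directly.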