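For some reason I can't find or select a table by its ID. I've been following the BeautifulSoup documentation, and as far as I can tell this should work.
Below is a code sample that tries to select the table with the ID "per_game". content.find(id='per_game') doesn't work for me either.
I've been working from the find and CSS selector sections of the documentation, here: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#find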
import requests
import csv
import calendar
from datetime import date, datetime, timedelta
from collections import OrderedDict, defaultdict
from bs4 import BeautifulSoup as soup
season = str(date.today().year + 1)
month = calendar.month_name[date.today().month].lower()
teamUrl = "https://basketball-reference.com/teams/"
urls = [teamUrl + 'ATL/' + season +'.html'] # Atlanta Hawks
# teamUrl + 'BOS/' + season +'.html', # Boston Celtics
# teamUrl + 'BKN/' + season +'.html', # Brooklyn Nets
# teamUrl + 'CHA/' + season +'.html', # Charlotte Hornets
for url in urls:
    page = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
    content = soup(page.content, 'html.parser')
    table = content.select("#per_game")
    print(table)
Thanks very much, OM.
Answer 0 (score: 0)
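This isn't Ajax. The table is wrapped in an HTML comment, so the parser treats it as comment text rather than as elements. Just strip the comment markers from the HTML before parsing: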
page = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
html_doc = page.text.replace('<!--', '').replace('-->', '')
content = soup(html_doc, 'html.parser')
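If you'd rather not rewrite the whole page text, another option is to walk the comment nodes and parse only the one that contains the table. The following is a minimal sketch of that idea, not something from the site or the docs: SAMPLE_HTML is a made-up, trimmed-down stand-in for the real page, and find_commented_table is a hypothetical helper name. It assumes a bs4 version where find_all accepts the string argument.

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical stand-in for the real page: the table sits inside an HTML comment.
SAMPLE_HTML = """
<div id="all_per_game">
<!--
<table id="per_game">
<tr><th>Player</th><th>PTS</th></tr>
<tr><td>Example Player</td><td>10.0</td></tr>
</table>
-->
</div>
"""


def find_commented_table(html, table_id):
    """Return the first table with the given id, also searching inside HTML comments."""
    doc = BeautifulSoup(html, "html.parser")

    # Try the normal lookup first, in case the table is not commented out.
    table = doc.find("table", id=table_id)
    if table is not None:
        return table

    # Otherwise parse each comment's text as its own small document.
    for comment in doc.find_all(string=lambda s: isinstance(s, Comment)):
        inner = BeautifulSoup(comment, "html.parser")
        table = inner.find("table", id=table_id)
        if table is not None:
            return table
    return None


table = find_commented_table(SAMPLE_HTML, "per_game")
rows = [
    [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
    for tr in table.find_all("tr")
]
print(rows)
```

With the real page you would pass page.text instead of SAMPLE_HTML. This leaves any genuine comments elsewhere in the page untouched, at the cost of a second parse.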