Python BeautifulSoup - finding a table by ID returns None

Time: 2018-11-15 01:41:06

Tags: python beautifulsoup

For some reason I can't find a table by ID or select it by ID. I've been following the BS documentation and, as far as I can tell, it should work.

Below is a code sample that tries to select the table with the ID "per_game". content.find(id='per_game') doesn't work for me either.

I've been working from the find and CSS selector sections of the documentation, here: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#find

import requests
import csv
import calendar
from datetime import date, datetime, timedelta
from collections import OrderedDict, defaultdict
from bs4 import BeautifulSoup as soup

season = str(date.today().year + 1)
month = calendar.month_name[date.today().month].lower()

teamUrl = "https://basketball-reference.com/teams/"

urls = [teamUrl + 'ATL/' + season + '.html']  # Atlanta Hawks
        # teamUrl + 'BOS/' + season + '.html',  # Boston Celtics
        # teamUrl + 'BKN/' + season + '.html',  # Brooklyn Nets
        # teamUrl + 'CHA/' + season + '.html',  # Charlotte Hornets

for url in urls:
    page = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
    content = soup(page.content, 'html.parser')
    table = content.select("#per_game")
    print(table)

Many thanks, OM.

1 answer:

Answer 0: (score: 0)

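This isn't Ajax. The table is already in the HTML, but it sits inside an HTML comment, so the parser never sees it as an element. Just remove the comment markers from the HTML before parsing: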

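# Download the page, then strip the comment delimiters so the hidden markup gets parsed
page = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
html_doc = page.text.replace('<!--', '').replace('-->', '')
content = soup(html_doc, 'html.parser')
table = content.find(id='per_game')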
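
To see the effect without hitting the network, here is a minimal sketch on an inline HTML string. The markup, the "all_per_game" wrapper and the "Player A" row are made up for illustration and are not copied from the real page; the helper function names are hypothetical too. It compares parsing the page as-is, stripping the comment markers as above, and a possible alternative that leaves the page untouched and parses each comment node on its own.

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical stand-in for the downloaded page (an assumption, not the real markup):
# the table we want is wrapped in an HTML comment.
html_doc = """
<div id="all_per_game">
<!--
<table id="per_game"><tr><td>Player A</td><td>20.1</td></tr></table>
-->
</div>
"""


def find_table_raw(html, table_id):
    """Parse the page as-is; commented-out markup is not part of the element tree."""
    return BeautifulSoup(html, "html.parser").find(id=table_id)


def find_table_uncommented(html, table_id):
    """Strip the comment delimiters first, as in the answer above."""
    cleaned = html.replace("<!--", "").replace("-->", "")
    return BeautifulSoup(cleaned, "html.parser").find(id=table_id)


def find_table_in_comments(html, table_id):
    """Alternative: keep the page intact and parse each comment node separately."""
    tree = BeautifulSoup(html, "html.parser")
    for comment in tree.find_all(string=lambda text: isinstance(text, Comment)):
        table = BeautifulSoup(comment, "html.parser").find(id=table_id)
        if table is not None:
            return table
    return None


print(find_table_raw(html_doc, "per_game"))          # None: the table is hidden in a comment
print(find_table_uncommented(html_doc, "per_game"))  # the <table> element
print(find_table_in_comments(html_doc, "per_game"))  # the same table, found via the comment
```

Note that find() returns None when nothing matches, while select() returns an empty list, which is why the question's print(table) shows [] rather than None. The comment-node variant may be safer on pages where removing every comment marker would also expose markup that was commented out on purpose.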