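I'm trying to extract JSON from this website, but nothing gets printed. I'm not sure whether the problem is my code or the website. Here is the code: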
import requests
season = '2016-17'
player_id = 202322
base_url = "http://stats.nba.com/stats/shotchartdetail?CFID=33&CFPARAMS=%s&ContextFilter=&ContextMeasure=FGA&DateFrom=&DateTo=&GameID=&GameSegment=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PaceAdjust=N&PerMode=PerGame&Period=0&PlayerID=%s&PlusMinus=N&PlayerPosition=&Rank=N&RookieYear=&Season=%s&SeasonSegment=&SeasonType=Regular+Season&TeamID=0&VsConference=&VsDivision=&mode=Advanced&showDetails=0&showShots=1&showZones=0"
shot_chart_url = base_url % (season, player_id, season)
user_agent = 'User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.111 Safari/537.36'
response = requests.get(shot_chart_url, headers={'User-Agent': user_agent})
headers = response.json()['resultSets'][0]['headers']
print(headers)
Answer 0 (score: 2)
I was able to get the script running by changing a few things:
import requests

season = '2016-17'
player_id = 202322
base_url = "http://stats.nba.com/stats/shotchartdetail?CFID=33&CFPARAMS=%s&ContextFilter=&ContextMeasure=FGA&DateFrom=&DateTo=&GameID=&GameSegment=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PaceAdjust=N&PerMode=PerGame&Period=0&PlayerID=%s&PlusMinus=N&PlayerPosition=&Rank=N&RookieYear=&Season=%s&SeasonSegment=&SeasonType=Regular+Season&TeamID=0&VsConference=&VsDivision=&mode=Advanced&showDetails=0&showShots=1&showZones=0"
shot_chart_url = base_url % (season, player_id, season)
user_agent = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36'
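# Headers copied from the request the browser sends to the same endpoint.
request_headers = {
    'User-Agent': user_agent,
    'x-nba-stats-origin': 'stats',
    'x-nba-stats-token': 'true',
    'Referer': 'http://stats.nba.com/events/',
}
response = requests.get(shot_chart_url, headers=request_headers)
# Column names of the first result set; a separate name avoids reusing the request headers variable.
column_headers = response.json()['resultSets'][0]['headers']
print(column_headers)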
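The script above calls response.json() directly, so when the server refuses the request it is hard to tell whether the code or the site is at fault. Here is a minimal sketch of a check you could run first. describe_response is a hypothetical helper, not part of requests, and the sample inputs are made up for illustration; they are not real server output.

```python
import json


def describe_response(status_code, content_type, body):
    """Return a short diagnosis for an HTTP response (hypothetical helper)."""
    if status_code != 200:
        return 'server returned HTTP %d' % status_code
    if 'json' not in content_type.lower():
        return 'unexpected content type: %r' % content_type
    try:
        payload = json.loads(body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return 'body is not valid JSON'
    if not isinstance(payload, dict) or not payload.get('resultSets'):
        return 'JSON has no resultSets'
    return 'ok'


# Made-up sample inputs, for illustration only.
print(describe_response(403, 'text/html', '<html>Access Denied</html>'))
print(describe_response(200, 'application/json',
                        '{"resultSets": [{"headers": ["COLUMN_A"]}]}'))
```

With requests you would pass response.status_code, response.headers.get('Content-Type', '') and response.text. Passing timeout= to requests.get also helps, since a request that never returns would otherwise look like "nothing is printed".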
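I navigated to a page on the NBA website that uses the same API endpoint as yours, and after inspecting the request my browser made to the server, I changed the following:
- Added the Referer header. Many servers require it (this one runs ASP.NET, and in my experience those do require it).
- Added the x-nba-stats-* headers (x-nba-stats-origin and x-nba-stats-token).
- Changed the User-Agent. I think the user agent is the most important one; my feeling is that they block IP + user agent combinations.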
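One visible difference between the two scripts is that the user_agent string in the question itself starts with 'User-Agent:', while the one here does not. Whether that alone triggers the block is not something I verified, but it is cheap to guard against. A small sketch, assuming hypothetical helpers clean_user_agent and build_headers (neither is part of requests):

```python
def clean_user_agent(value):
    """Strip an accidental 'User-Agent:' prefix from a header value."""
    prefix = 'user-agent:'
    if value.lower().startswith(prefix):
        value = value[len(prefix):]
    return value.strip()


def build_headers(user_agent):
    """Build the header set used in this answer (hypothetical helper)."""
    return {
        'User-Agent': clean_user_agent(user_agent),
        'x-nba-stats-origin': 'stats',
        'x-nba-stats-token': 'true',
        'Referer': 'http://stats.nba.com/events/',
    }


# Shortened user agent string, for illustration only.
headers = build_headers('User-Agent:Mozilla/5.0 (X11; Linux x86_64)')
print(headers['User-Agent'])  # prints: Mozilla/5.0 (X11; Linux x86_64)
```

The resulting dict can be passed as the headers argument of requests.get.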
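Edit: just sharing my thought process while solving this. I saw in the comments that the URL actually works in a browser. Knowing how HTTP works, the difference is probably in one of these: cookies / headers / URL params. I skipped the original site and searched for this endpoint; sure enough, it worked for me in the browser, so I inspected the HTTP request with Chrome's DevTools and mimicked it with requests :)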
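The comparison step can be sketched roughly like this: copy the request headers shown in DevTools into a dict and list the ones your script does not send. missing_headers is a hypothetical helper, and the browser_headers dict below is only a stand-in that reuses the headers from this answer (with a shortened user agent), not a full DevTools capture.

```python
def missing_headers(browser_headers, script_headers):
    """Return the browser header names (sorted) that the script does not send.

    Header names are compared case-insensitively, as in HTTP.
    """
    sent = {name.lower() for name in script_headers}
    return sorted(name for name in browser_headers if name.lower() not in sent)


# Stand-in for what DevTools shows; values reuse the headers from this answer.
browser_headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64)',
    'x-nba-stats-origin': 'stats',
    'x-nba-stats-token': 'true',
    'Referer': 'http://stats.nba.com/events/',
}
# The question's script only sets a user agent.
script_headers = {'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_3)'}

print(missing_headers(browser_headers, script_headers))
# prints: ['Referer', 'x-nba-stats-origin', 'x-nba-stats-token']
```

Adding the missing headers one at a time and retrying is a simple way to find out which ones the server actually cares about.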