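I tried the following code, but it does not work and raises this error:
"AttributeError: 'NoneType' object has no attribute 'findNextSiblings'"
How can I fix this error?
I tried removing the h_span and w_span variables and calling soup.findNextSibling in the loop instead of h_span.findNextSibling. The code then runs, but it only returns an empty string.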
from selenium import webdriver
from bs4 import BeautifulSoup
import requests
import os
driver = webdriver.Chrome(executable_path= r'E:/Summer/FirstThings/Web scraping (bucky + pdf)/webscraping/tutorials-master/chromedriver.exe')
url = 'https://www.nba.com/players/aron/baynes/203382'
driver.get(url)
soup = BeautifulSoup(driver.page_source , 'lxml')
height = ''
h_span = soup.find('p', string = 'HEIGHT')
for span in h_span.findNextSiblings():
    height = height + span.text
weight = ''
w_span = soup.find('p', string = 'WEIGHT')
for span in w_span.findNextSiblings():
    weight = weight + span.text
born = ''
b_span = soup.find('p', string = 'BORN')
for span in b_span.findNextSiblings():
    born = born + span.text
print(height)
print("")
print(weight)
print("")
print(born)
driver.quit()
It should return the player's height, weight, and birth information as plain text, taken from under each of those headings.
Answer 0 (score: 1)
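I love working with sports data!
You are doing too much work here. There is no need to use Selenium or BeautifulSoup to parse the HTML, because nba.com serves this data in a nice JSON format. All you need to do is find the player you want and pull out the fields you need: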
import requests

url = 'https://data.nba.net/prod/v1/2018/players.json'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36'}

# Fetch the full player list as JSON, sending the browser-like User-Agent
jsonData = requests.get(url, headers=headers).json()

find_player = 'Baynes'
for player in jsonData['league']['standard']:
    # Match on last name, then build the display strings from the JSON fields
    if player['lastName'] == find_player:
        name = player['firstName'] + ' ' + player['lastName']
        height = player['heightFeet'] + 'ft ' + player['heightInches'] + 'in'
        weight = player['weightPounds'] + 'lbs'
        born = player['dateOfBirthUTC']
        print('%s\n%s\n%s\n%s\n' % (name, height, weight, born))
Output:
Aron Baynes
6ft 10in
260lbs
1986-12-09
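
If you want to reuse the lookup, it could be wrapped in a small function along these lines. This is only a sketch: `find_player_info` is a name I made up, and the `sample` dictionary is hand-written from the output above so the code runs without a network call. It assumes the JSON keeps the `league` / `standard` layout and string-valued fields used in the code above. Returning `None` when no player matches lets the caller check for a miss explicitly instead of hitting an error further down.

```python
def find_player_info(json_data, last_name):
    """Return (name, height, weight, born) for the first player whose
    lastName matches, or None if no player matches."""
    for player in json_data['league']['standard']:
        if player['lastName'] == last_name:
            name = player['firstName'] + ' ' + player['lastName']
            height = player['heightFeet'] + 'ft ' + player['heightInches'] + 'in'
            weight = player['weightPounds'] + 'lbs'
            return name, height, weight, player['dateOfBirthUTC']
    return None


# Hand-written sample in the assumed shape (not fetched from nba.com).
sample = {'league': {'standard': [
    {'firstName': 'Aron', 'lastName': 'Baynes',
     'heightFeet': '6', 'heightInches': '10',
     'weightPounds': '260', 'dateOfBirthUTC': '1986-12-09'},
]}}

info = find_player_info(sample, 'Baynes')
if info is None:
    print('player not found')
else:
    print('\n'.join(info))
```

With real data you would pass the result of `requests.get(url, headers=headers).json()` in place of `sample`.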