Web scraping problem (empty list)

Date: 2021-05-13 16:11:18

Tags: python web-scraping

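I'm currently trying to get a player's rank, but I always get back an empty list. I've been struggling with this for a while and would really appreciate some help, as well as any tips for solving problems like this in future projects, or pointers to good resources for understanding BeautifulSoup better.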

import requests as req
from bs4 import BeautifulSoup

id = "epic"
tag = "random"
url = f"https://rocketleague.tracker.network/rocket-league/profile/{id}/{tag.lower()}/overview"
html = req.get(url).content
soup = BeautifulSoup(html, "lxml")
line = soup.find_all("div", class_="rank")
print(line)

This is what I want to get:

Screenshot showing desired element

1 answer:

Answer 0 (score: 0)

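Loading the page with requests won't work here, because the site loads its content dynamically with JavaScript: the HTML that requests receives does not yet contain the rank elements. You need a browser driver such as Selenium WebDriver so that the page is rendered before you query it.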
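To illustrate why the list comes back empty, here is a minimal sketch that counts `div` elements with class `rank` in two HTML strings. Both strings are hypothetical stand-ins (not copied from the real site): one imitates the bare page shell a plain HTTP request might return, the other imitates the page after a browser has run its scripts. The `ClassCounter` class and `count_elements` helper are names made up for this example, and the sketch uses only the standard library so it runs without network access.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags of a given name that carry a given CSS class."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the value may be None
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.css_class in classes:
            self.count += 1


def count_elements(html, tag, css_class):
    """Return how many <tag class="... css_class ..."> elements html contains."""
    parser = ClassCounter(tag, css_class)
    parser.feed(html)
    return parser.count


# Hypothetical markup, for illustration only (an assumption, not the real page).
STATIC_HTML = (
    '<html><body><div id="app"></div>'
    '<script src="app.js"></script></body></html>'
)
RENDERED_HTML = (
    '<html><body><div id="app">'
    '<div class="rank">#1 • Top 1%</div>'
    '</div></body></html>'
)

print(count_elements(STATIC_HTML, "div", "rank"))    # shell only: nothing to find
print(count_elements(RENDERED_HTML, "div", "rank"))  # after rendering: one match
```

A quick way to apply the same check to a real page is to search the text of the requests response for the class name you expect; if it is missing there but visible in the browser's inspector, the content is most likely added by JavaScript after the initial load.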

Here is what I came up with:

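from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.firefox.service import Service

options = Options()
options.add_argument("--headless")  # run Firefox without opening a window
# Selenium 4 takes the geckodriver path through a Service object
service = Service(executable_path=r"driver/geckodriver.exe")
driver = webdriver.Firefox(service=service, options=options)

id = "epic"
tag = "random"
url = f"https://rocketleague.tracker.network/rocket-league/profile/{id}/{tag.lower()}/overview"

driver.get(url)
# Wait up to 2 seconds for matching elements to appear; raise this if the page loads slowly
driver.implicitly_wait(2)
# find_elements_by_css_selector was removed in Selenium 4; use find_elements with By
ranks = driver.find_elements(By.CSS_SELECTOR, '[class="rank"]')

for rank in ranks:
    print(rank.text)
driver.quit()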

Result:

#695,273 • Top 32%
#1,409,786 • Bottom 26%
#1,240,839 • Bottom 43%
#1,195,373 • Bottom 45%
#875,794 • Top 41%
#874,338 • Top 41%
#1,530,195 • Bottom 29%
#960,411 • Top 45%
Unranked Division I
#1,663,447 • Top 33%
Gold III Division III
#2,974,386 • Bottom 29%
Platinum II Division III
#2,879,016 • Bottom 40%
Platinum II Division II
#2,363,741 • Top 50%
Gold III Division III
#2,407,466 • Bottom 42%
Platinum II Division I
#2,054,002 • Top 48%
Gold II Division II
#2,321,030 • Bottom 41%
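
The output mixes two kinds of lines: leaderboard positions such as `#695,273 • Top 32%` and tier names such as `Gold III Division III`. If you need the numbers rather than the raw text, a small parser can split them out. The sketch below is an assumption about the line format based only on the output shown above; `RANK_LINE` and `parse_rank_line` are hypothetical names, and the pattern may need adjusting if the site changes its wording.

```python
import re

# Matches lines like "#695,273 • Top 32%" (format inferred from the output above).
RANK_LINE = re.compile(
    r"#(?P<position>[\d,]+) • (?P<side>Top|Bottom) (?P<percent>\d+)%"
)


def parse_rank_line(text):
    """Return position, side and percent for a leaderboard line, else None."""
    match = RANK_LINE.fullmatch(text.strip())
    if match is None:
        return None  # e.g. a tier line such as "Gold III Division III"
    return {
        "position": int(match["position"].replace(",", "")),
        "side": match["side"],
        "percent": int(match["percent"]),
    }


for line in ["#695,273 • Top 32%", "Gold III Division III"]:
    print(line, "->", parse_rank_line(line))
```

Lines that do not match return `None`, so tier names can be handled separately instead of raising an error.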