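I am trying to scrape data from https://www.iplt20.com/teams/sunrisers-hyderabad/squad but I am running into a problem with the drop-down list ("Filter by Year").
I can retrieve the names in the drop-down (i.e. 2020, 2019, etc.), but I cannot retrieve the data behind each list element.
Clicking the "Filter by Year" list opens a drop-down, and selecting a season (year) changes the players shown (we get the players for that year along with a summary). I want to get the data for every player, season by season. Clicking the drop-down does not produce a new URL either.
I could not find any solution. I use the following Python code to retrieve the season/year values from the drop-down.
Python code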
from bs4 import BeautifulSoup
from selenium import webdriver

squad_url = "https://www.iplt20.com/teams/sunrisers-hyderabad/squad"
driver = webdriver.Chrome(executable_path=r".\chromedriver.exe")
driver.get(squad_url)
html = driver.page_source
soup2 = BeautifulSoup(html, 'html.parser')
# Print every season listed in the "Filter by Year" drop-down.
for llist in soup2.find_all("ul", class_="drop-down__dropdown-list"):
    for year in llist.find_all("li"):
        print(year.text)
The HTML snippet for the drop-down is as follows:
<div class="large-squad-list__filter single-filter">
  <div class="stats-table__filter drop-down js-drop-down is-open">
    <div class="drop-down__clickzone js-dropdown-trigger" tabindex="0" role="button"></div>
    <div class="drop-down__label js-drop-down-label">Filter by Year</div>
    <div class="drop-down__current js-drop-down-current">2020</div>
    <ul class="drop-down__dropdown-list js-drop-down-options">
      <li tabindex="0" role="button" class="drop-down__dropdown-list__option" data-option="ipl2020">2020</li>
      <li tabindex="0" role="button" class="drop-down__dropdown-list__option" data-option="ipl2019">2019</li>
      <li tabindex="0" role="button" class="drop-down__dropdown-list__option" data-option="ipl2018">2018</li>
      <li tabindex="0" role="button" class="drop-down__dropdown-list__option" data-option="ipl2017">2017</li>
      <li tabindex="0" role="button" class="drop-down__dropdown-list__option" data-option="ipl2016">2016</li>
      <li tabindex="0" role="button" class="drop-down__dropdown-list__option" data-option="ipl2015">2015</li>
      <li tabindex="0" role="button" class="drop-down__dropdown-list__option" data-option="ipl2014">2014</li>
      <li tabindex="0" role="button" class="drop-down__dropdown-list__option" data-option="ipl2013">2013</li>
      <li tabindex="0" role="button" class="drop-down__dropdown-list__option" data-option="ipl2012">2012</li>
      <li tabindex="0" role="button" class="drop-down__dropdown-list__option" data-option="ipl2011">2011</li>
      <li tabindex="0" role="button" class="drop-down__dropdown-list__option" data-option="ipl2010">2010</li>
      <li tabindex="0" role="button" class="drop-down__dropdown-list__option" data-option="ipl2009">2009</li>
      <li tabindex="0" role="button" class="drop-down__dropdown-list__option" data-option="ipl2008">2008</li>
    </ul>
  </div>
</div>
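Each option in the snippet carries a data-option attribute (ipl2020, ipl2019, ...) alongside the visible year, which is likely what the page's script keys on. As a minimal sketch, the mapping from year to data-option can be pulled out with the standard library alone. SeasonOptionParser and SNIPPET are illustrative names, and the markup is a trimmed copy of the snippet above, not a live page.

```python
from html.parser import HTMLParser

# Trimmed copy of the drop-down markup shown above (three seasons only).
SNIPPET = """
<ul class="drop-down__dropdown-list js-drop-down-options">
<li tabindex="0" role="button" class="drop-down__dropdown-list__option" data-option="ipl2020">2020</li>
<li tabindex="0" role="button" class="drop-down__dropdown-list__option" data-option="ipl2019">2019</li>
<li tabindex="0" role="button" class="drop-down__dropdown-list__option" data-option="ipl2018">2018</li>
</ul>
"""


class SeasonOptionParser(HTMLParser):
    """Collect {visible year: data-option value} from the <li> options."""

    def __init__(self):
        super().__init__()
        self.seasons = {}
        self._current_option = None  # data-option of the <li> being read

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._current_option = dict(attrs).get("data-option")

    def handle_data(self, data):
        # Only text that sits inside an <li> with a data-option is recorded.
        if self._current_option and data.strip():
            self.seasons[data.strip()] = self._current_option

    def handle_endtag(self, tag):
        if tag == "li":
            self._current_option = None


parser = SeasonOptionParser()
parser.feed(SNIPPET)
seasons = parser.seasons
print(seasons)
```

With Selenium, the same markup could be fed in from driver.page_source instead of the hard-coded string.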
Answer 0 (score: 0)
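This may only be a partial answer, but it does get you what you want (the stats).
The problem is that the data is loaded dynamically by JavaScript. However, if you look at the traffic (Developer Tools -> Network), you will see that a request is sent to an API.
You can take that URL and parse the response.
That is exactly what this code does: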
import requests

# Tournament IDs as observed in the network traffic.
_ids = [18790, 10192, 7749, 5815, 3957, 2785, 2374, 605]
for _id in _ids:
    url = f"https://cricketapi.platform.iplt20.com/stats/" \
          f"players?teamIds=62&tournamentIds={_id}&scope=TOURNAMENT&pageSize=30"
    response = requests.get(url).json()
    print(f"Printing stats for {response['team']['fullName']}")
    for player in response['stats']['content']:
        print(f"{player['player']['fullName']} - {player['stats']}")
However, I could not work out where the tournamentIds come from. Also, there is no data for 2013.
Sample output (only part of it, for brevity):
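The request URL in the code above is a fixed endpoint plus four query parameters, so it can be assembled from a small helper instead of a hand-written string. This is only a sketch: build_stats_url is a hypothetical helper name, and the defaults (team 62, page size 30) are simply the values used in the answer.

```python
from urllib.parse import urlencode

BASE_URL = "https://cricketapi.platform.iplt20.com/stats/players"


def build_stats_url(tournament_id, team_id=62, page_size=30):
    """Return the stats URL for one tournament, mirroring the answer's query string."""
    query = urlencode({
        "teamIds": team_id,
        "tournamentIds": tournament_id,
        "scope": "TOURNAMENT",
        "pageSize": page_size,
    })
    return f"{BASE_URL}?{query}"


url = build_stats_url(18790)
print(url)
```

The result can be passed to requests.get(url).json() exactly as in the loop above.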
Printing stats for Sunrisers Hyderabad
Yusuf Pathan - [{'matchType': 'AGG', 'battingStats': {'50s': 0, '100s': 0, 'inns': 8, 'm': 10, 'r': '40', 'b': 45, '4s': 1, '6s': 1, 'no': 5, 'hs': '16*', 'sr': '88.88', 'a': '13.33'}, 'bowlingStats': {'bbiw': 0, 'bbir': 0, 'bbmw': 0, 'bbmr': 0, '4w': 0, '5w': 0, '10w': 0, 'inns': 1, 'm': 10, 'b': 6, 'r': 8, 'wb': 0, 'nb': 0, 'd': 1, 'w': 0, '4s': 1, '6s': 0, 'maid': 0, 'wmaid': 0, 'ht': 0, 'a': '-', 'e': '8.00', 'sr': '-', 'o': '1.00'}, 'fieldingStats': {'c': 1, 'ro': 0, 's': 0, 'inns': 1, 'm': 10}}, {'matchType': 'IPLT20', 'battingStats': {'50s': 0, '100s': 0, 'inns': 8, 'm': 10, 'r': '40', 'b': 45, '4s': 1, '6s': 1, 'no': 5, 'hs': '16*', 'sr': '88.88', 'a': '13.33'}, 'bowlingStats': {'bbiw': 0, 'bbir': 0, 'bbmw': 0, 'bbmr': 0, '4w': 0, '5w': 0, '10w': 0, 'inns': 1, 'm': 10, 'b': 6, 'r': 8, 'wb': 0, 'nb': 0, 'd': 1, 'w': 0, '4s': 1, '6s': 0, 'maid': 0, 'wmaid': 0, 'ht': 0, 'a': '-', 'e': '8.00', 'sr': '-', 'o': '1.00'}, 'fieldingStats': {'c': 1, 'ro': 0, 's': 0, 'inns': 1, 'm': 10}},
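Each player's stats arrive as a list with one block per matchType, and several values (runs, strike rate) are strings. A minimal sketch of pulling a compact batting summary out of one entry follows. batting_summary is a hypothetical helper, the sample dict is a trimmed copy of the first player in the output above, and the code assumes 'r' and 'sr' are numeric strings as they are in that sample (other players may carry placeholders such as '-').

```python
def batting_summary(player_entry, match_type="AGG"):
    """Return (name, innings, runs, strike_rate) for one entry of response['stats']['content']."""
    name = player_entry["player"]["fullName"]
    for block in player_entry["stats"]:
        if block["matchType"] == match_type:
            batting = block["battingStats"]
            # Runs and strike rate arrive as strings in the sample output.
            return name, batting["inns"], int(batting["r"]), float(batting["sr"])
    # No block for the requested match type.
    return name, None, None, None


# Trimmed copy of the first player in the sample output above.
sample = {
    "player": {"fullName": "Yusuf Pathan"},
    "stats": [
        {"matchType": "AGG",
         "battingStats": {"inns": 8, "m": 10, "r": "40", "b": 45,
                          "hs": "16*", "sr": "88.88", "a": "13.33"}},
        {"matchType": "IPLT20",
         "battingStats": {"inns": 8, "m": 10, "r": "40", "b": 45,
                          "hs": "16*", "sr": "88.88", "a": "13.33"}},
    ],
}

summary = batting_summary(sample)
print(summary)
```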
Answer 1 (score: 0)
After a full day of effort, I managed to do it:
Import the following in your code:
### Shiny Inputs
library(shiny)
balancedSliderInput <- function(inputId, value = 0, label = "",
group = "", width = "100%") {
if (label != "")
label <- paste0('<label class="control-label" for="', inputId, '">', label, '</label>')
balanced_slider_tag <- tagList(
div(style = paste("width: ", width), class = "all-balanced-slider",
HTML(label),
div(id = inputId, class = paste("balanced-slider", group), as.character(value)),
span(class = "value", "0"),
HTML("%")
)
)
dep <- list(
htmltools::htmlDependency("balanced_slider", "0.0.2", c(file = "www"),
script = c("js/jquery-ui.min.js", "js/balancedSlider.js"),
stylesheet = c("css/jquery-ui.min.css")
)
)
htmltools::attachDependencies(balanced_slider_tag, dep)
}
updateBalancedSliderInput <- function(session, inputId, value = 0) {
message <- list(value = value)
session$sendInputMessage(inputId, message)
}
registerInputHandler("balancedSlider", function(data, ...) {
if (is.null(data))
NULL
else
data
}, force = TRUE)
########## App ------
ui <- fixedPage(
actionButton("reset", "Reset", icon = icon("undo-alt")),
balancedSliderInput("test1", label = "Test1", value = 50),
balancedSliderInput("test2", label = "Test2", value = 50),
textOutput("test")
)
server <- function(session, input, output) {
test_reactive <- reactive({
return(input$test1)
})
output$test <- renderText({
test <- paste("Slider 1 is at", test_reactive()[[1]])
return(test)
})
observeEvent(input$reset, {
updateBalancedSliderInput(session, "test1", 50)
updateBalancedSliderInput(session, "test2", 50)
})
}
shinyApp(ui, server)
The code is below:
$(function() {
$('.balanced-slider').each(function() {
console.log("Running Log 1")
var init_value = parseInt($(this).text());
$(this).siblings('.value').text(init_value);
$(this).empty().slider({
value: init_value,
min: 0,
max: 100,
range: "max",
step: 0.5,
animate: 0,
slide: function(event, ui) {
console.log("Log 10");
// Update display to current value
$(this).siblings('.value').text(ui.value);
// Get current total
var total = ui.value;
var sibling_count = 0;
var classes = $(this).attr("class").split(/\s+/);
var selector = ' .' + classes.join('.');
//console.log(selector);
var others = $(selector).not(this);
others.each(function() {
total += $(this).slider("option", "value");
sibling_count += 1;
});
//console.log(total);
var delta = total - 100;
var remainder = 0;
// Update each slider
others.each(function() {
console.log("Running Log 2")
var t = $(this);
var current_value = t.slider("option", "value");
var new_value = current_value - delta / sibling_count;
if (new_value < 0) {
remainder += new_value;
new_value = 0;
}
t.siblings('.value').text(new_value.toFixed(1));
t.slider('value', new_value);
});
if(remainder) {
var pos_val_count = 0;
others.each(function() {
if($(this).slider("option", "value") > 0)
pos_val_count += 1;
});
others.each(function() {
if($(this).slider("option", "value") > 0) {
var t = $(this);
var current_value = t.slider("option", "value");
var new_value = current_value + remainder / pos_val_count;
t.siblings('.value').text(new_value.toFixed(1));
t.slider('value', new_value);
}
});
}
},
// fire the callback event for the other sliders
stop: function(event, ui) {
var classes = $(this).attr("class").split(/\s+/);
var selector = '.' + classes.join('.');
$(selector).not(this).each(function() {
$(this).trigger("slidestop");
});
}
});
});
});
var balancedSliderBinding = new Shiny.InputBinding();
$.extend(balancedSliderBinding, {
find: function(scope) {
return $(scope).find(".balanced-slider");
},
// The input rate limiting policy
getRatePolicy: function() {
return {
// Can be 'debounce' or 'throttle'
policy: 'debounce',
delay: 500
};
},
getType: function() {
return "balancedSlider";
},
getValue: function(el) {
var obj = {};
obj[$(el).attr("id")] = $(el).slider("option", "value");
return obj;
},
setValue: function(el, new_value) {
$(el).slider('value', new_value);
$(el).siblings('.value').text(new_value);
},
subscribe: function(el, callback) {
$(el).on("slidestop.balancedSliderBinding", function(e) {
callback(); // add true parameter to enable rate policy
});
},
unsubscribe: function(el) {
$(el).off(".balancedSliderBinding");
},
// Receive messages from the server.
// Messages sent by updateBalancedSliderInput() are received by this function.
receiveMessage: function(el, data) {
if (data.hasOwnProperty('value'))
this.setValue(el, data.value);
$(el).trigger('change');
},
});
Shiny.inputBindings.register(balancedSliderBinding, "balancedSliderBinding");
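The logic above only covers how to click the drop-down. While it runs, though, the dynamic data does get loaded (I confirmed this from the driver.page_source output).
The problem with the requests library is that the dynamic data is never loaded, whereas Selenium handles this easily.
I added sleeps to make sure the scrolling has finished. I have read in many places that WebDriverWait can be used instead of sleep, but I could not get it to work.
I am fairly sure this version can be optimized. (If you find a better one, please post it here.)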
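On the sleep versus WebDriverWait point: an explicit wait is essentially a polling loop that re-checks a condition until it holds or a timeout expires, rather than pausing for a fixed time. The sketch below shows that idea with the standard library only. wait_until and fake_page_ready are hypothetical names for illustration and are not part of Selenium; the fake condition stands in for a real check against the page.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value or timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Stand-in for "has the page finished loading?": becomes true on the third check.
checks = {"count": 0}


def fake_page_ready():
    checks["count"] += 1
    return checks["count"] >= 3


ready = wait_until(fake_page_ready, timeout=2.0, poll_interval=0.01)
print(ready, checks["count"])
```

Selenium's own WebDriverWait(driver, timeout).until(condition) follows the same pattern, with the condition receiving the driver as its argument. Waiting for the element that actually changes after a season is selected should be more reliable than a fixed sleep.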