Question

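I'm learning R by working with basketball statistics, and I want to extract the information displayed in a shot chart.

I'm looking at the following shot chart for D'Angelo Russell: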

https://www.basketball-reference.com/players/r/russeda01/shooting/2019

I'm using tools from the rvest package (library(rvest)) to scrape the data as follows:

> dlo_html <- read_html("https://www.basketball-reference.com/players/r/russeda01/shooting/2019")
> dlo_nodes1 <- html_nodes(dlo_html, "table")
> dlo_makes <- html_table(dlo_nodes1)

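...so now when I run head(dlo_makes), I get a data.frame with 74 rows and 11 columns, matching the sortable table on the left side of the web page. That's a good start.

However, what I really want is the information contained in the shot chart graphic on the right side of the page. I can see it in the HTML source. If you search the source for shot-area, there are about 1500 lines of data beneath it that look like this: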

<div style="top:57px;left:237px;" tip="Oct 17, 2018, BRK at DET<br>1st Qtr, 10:38 remaining<br>Missed 2-pointer from 2 ft<br>BRK leads 2-0" class="tooltip miss">&#215;</div>
<div style="top:154px;left:341px;" tip="Oct 17, 2018, BRK at DET<br>1st Qtr, 10:30 remaining<br>Made 2-pointer from 14 ft<br>BRK now leads 4-0" class="tooltip make">&#9679;</div>
etc.

Am I passing the wrong information to html_nodes()? Should I be using something other than html_table to look at the nodes? Or am I missing something else here?

Answer 1

The data you need is written into the page as an HTML comment; it is not loaded dynamically.
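As a minimal sketch of why selectors miss commented-out markup, the example below uses a small hypothetical HTML string (not the real page) whose marker div sits inside a comment. It assumes the rvest package is installed. The names page, inner, n_direct and n_inner are illustrative only.

```r
library(rvest)  # re-exports read_html() and the %>% pipe

# Hypothetical miniature of the page: the shot marker sits inside an HTML comment.
page <- read_html('<html><body>
<div id="all_shot-chart">
<!--
<div class="shot-area">
<div style="top:57px;left:237px;" tip="Missed 2-pointer from 2 ft" class="tooltip miss">x</div>
</div>
-->
</div>
</body></html>')

# Commented-out markup is a single comment node, so a CSS selector cannot see inside it.
n_direct <- length(html_nodes(page, "div.tooltip"))

# Extract the comment as text, then parse that text as HTML in its own right.
inner <- page %>%
  html_nodes(xpath = "//comment()") %>%
  html_text() %>%
  paste(collapse = "") %>%
  read_html()

# After re-parsing, the marker div is an ordinary element again.
n_inner <- length(html_nodes(inner, "div.tooltip"))
```

Comparing n_direct with n_inner shows the difference: the selector only matches once the comment text has been parsed separately.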

I used View Source to find the div that contains the data; its id is all_shot-chart.

So here is the code to get what you need:

library(rvest)

dlo_html <- read_html("https://www.basketball-reference.com/players/r/russeda01/shooting/2019")

# Take the comment inside the all_shot-chart div and re-parse its text as HTML
Commented_Section <- dlo_html %>%
  html_nodes("[id = 'all_shot-chart']") %>%
  html_nodes(xpath = 'comment()') %>%
  html_text() %>% read_html() %>% html_node('table')

# Missed and made shots are marker divs that differ only by class
Missed_Plays <- Commented_Section %>% html_nodes("[class='tooltip miss']")
Maked_Plays <- Commented_Section %>% html_nodes("[class='tooltip make']")

I found out how to get at the commented-out section from this question:

How to read a commented out HTML table using readHTMLTable in R
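Once the marker nodes are available, the coordinates and shot details still have to be pulled out of their attributes. The following is a rough sketch of one way to do that, run on the two sample divs quoted in the question rather than on the live page. The wrapper div, the column names, and the assumption that every tip has four <br>-separated fields and whole-number pixel offsets are mine, not something the site documents.

```r
library(rvest)

# Two marker divs copied from the question, wrapped in a hypothetical container.
shots_html <- read_html('<div class="shot-area">
<div style="top:57px;left:237px;" tip="Oct 17, 2018, BRK at DET<br>1st Qtr, 10:38 remaining<br>Missed 2-pointer from 2 ft<br>BRK leads 2-0" class="tooltip miss">&#215;</div>
<div style="top:154px;left:341px;" tip="Oct 17, 2018, BRK at DET<br>1st Qtr, 10:30 remaining<br>Made 2-pointer from 14 ft<br>BRK now leads 4-0" class="tooltip make">&#9679;</div>
</div>')

markers <- html_nodes(shots_html, "div.tooltip")
style   <- html_attr(markers, "style")
tip     <- html_attr(markers, "tip")

# Assumption: each tip holds four <br>-separated fields (game, clock, shot, score).
tip_parts <- strsplit(tip, "<br>", fixed = TRUE)

shots <- data.frame(
  # Assumption: offsets are non-negative whole pixels, as in the sample rows.
  top_px  = as.numeric(sub(".*top:([0-9]+)px.*", "\\1", style)),
  left_px = as.numeric(sub(".*left:([0-9]+)px.*", "\\1", style)),
  made    = grepl("make", html_attr(markers, "class"), fixed = TRUE),
  game    = vapply(tip_parts, `[`, character(1), 1),
  clock   = vapply(tip_parts, `[`, character(1), 2),
  shot    = vapply(tip_parts, `[`, character(1), 3),
  score   = vapply(tip_parts, `[`, character(1), 4),
  stringsAsFactors = FALSE
)
```

The same steps should apply to the node sets built from the real page, for example by passing the missed and made node sets to html_attr() in place of markers, though the live markup may differ from these samples.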
