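I'm new to R.
I need help assigning web-scraped data to "salary". For some reason, my variable "salary" shows up as character (empty) in my environment. I used SelectorGadget to find the HTML nodes.
I'd appreciate it if someone could explain this to me. Thanks!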
library(rvest)
library(tidyverse)
library(magrittr)
nba_player_salaries <- read_html("https://hoopshype.com/salaries/players/2018-2019/")
salary <- nba_player_salaries %>%
  html_nodes("tbody .hh-salaries-sorted") %>%
  html_text2()
Answer 0 (score: 0)
You can extract the table directly from the page:
library(rvest)
library(dplyr)
url <- 'https://hoopshype.com/salaries/players/2018-2019/'
url %>%
  read_html() %>%
  html_table() %>%
  .[[1]] %>%
  setNames(.[1, ]) %>%  # Column names are in the 1st row
  slice(-1) %>%         # Remove the 1st row
  select(-1)            # Remove the 1st column
#    Player            `2018/19`   `2018/19(*)`
#    <chr>             <chr>       <chr>
#  1 Stephen Curry     $37,457,154 $38,320,489
#  2 Russell Westbrook $35,665,000 $36,487,029
#  3 Chris Paul        $35,654,150 $36,475,929
#  4 LeBron James      $35,654,150 $36,475,929
#  5 Kyle Lowry        $32,700,000 $33,453,690
#  6 Blake Griffin     $31,873,932 $32,608,582
#  7 Gordon Hayward    $31,214,295 $31,933,741
#  8 James Harden      $30,570,000 $31,274,596
#  9 Paul George       $30,560,700 $31,265,082
# 10 Mike Conley       $30,521,115 $31,224,584
# … with 566 more rows
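
Since the question was about getting the values into a salary variable, here is a minimal sketch of pulling one column out of a table like the one above and converting it to numbers. It is only an illustration: the data frame is rebuilt by hand from the first three rows of the printed output so that it runs without a network connection, and parse_salary is a hypothetical helper name, not something from rvest or dplyr.

```r
# Sample rows copied from the printed output above; in practice this would
# be the table returned by the scraping pipeline.
salaries <- data.frame(
  Player = c("Stephen Curry", "Russell Westbrook", "Chris Paul"),
  `2018/19` = c("$37,457,154", "$35,665,000", "$35,654,150"),
  check.names = FALSE,        # keep "2018/19" as the column name
  stringsAsFactors = FALSE
)

# Hypothetical helper: strip "$" and "," and convert the rest to numeric.
parse_salary <- function(x) {
  as.numeric(gsub("[$,]", "", x))
}

# Column names that are not syntactic need [[ ]] or backticks.
salary <- parse_salary(salaries[["2018/19"]])
```

With the real table, the same call should work on the full column, assuming every cell follows the "$1,234,567" pattern; cells that do not will come back as NA with a warning.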