R readHTMLTable: Pro Football Reference Team Offense

Time: 2018-12-16 22:28:32

Tags: r

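I am trying to load the "Team Offense" table into R. I have tried several techniques, but I cannot get it to work. It looks like R is only reading the first two tables on the page. The link is below.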

https://www.pro-football-reference.com/years/2018/index.htm

Here is what I have tried...

library(XML)
library(RCurl)  # getURL() comes from RCurl

TeamData = 'https://www.pro-football-reference.com/years/2018/index.htm'
URL = TeamData
URLdata = getURL(URL)
table = readHTMLTable(URLdata, stringsAsFactors=F, which = 5)

1 answer:

Answer 0 (score: 0)

Scraping the Sports Reference sites can be tricky, but they are great sources:

library(rvest)
library(httr)

link <- "https://www.pro-football-reference.com/years/2018/index.htm"

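doc <- GET(link)

# Most tables on the page are wrapped in HTML comments, which is why a plain
# parse finds only the first two. Deleting the comment openers exposes the rest.
cont <- content(doc, "text") %>%
  gsub(pattern = "<!--\n", "", ., fixed = TRUE) %>%
  read_html() %>%
  html_nodes(".table_outer_container table") %>%
  html_table()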

# Team Offense table is the fifth one
df <- cont[[5]]
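
Picking the table by position can break if the page layout changes, so here is a minimal sketch of an alternative: read the comment nodes directly and select the table by its `id`. It runs on a made-up miniature page rather than the live site. The ids `visible` and `hidden` and the cell values are purely illustrative assumptions. On the real page you would first list the ids and check which one belongs to Team Offense.

```r
library(rvest)
library(xml2)

# Hypothetical miniature page: one ordinary table and one table that is
# wrapped in an HTML comment, mimicking how the real page hides its tables.
html <- '<html><body>
<div class="table_outer_container">
<table id="visible"><tr><th>Tm</th><th>W</th></tr><tr><td>A</td><td>1</td></tr></table>
</div>
<div id="all_hidden"><!--
<div class="table_outer_container">
<table id="hidden"><tr><th>Tm</th><th>PF</th></tr><tr><td>A</td><td>30</td></tr></table>
</div>
--></div>
</body></html>'

page <- read_html(html)

# A normal parse only finds the table that sits outside the comment.
visible <- html_nodes(page, "table")

# Collect the text of every comment node and parse it again as HTML.
commented <- page %>%
  html_nodes(xpath = "//comment()") %>%
  html_text() %>%
  paste(collapse = "") %>%
  read_html()

# List the ids of the previously hidden tables, then pick one by id.
hidden_ids <- commented %>% html_nodes("table") %>% html_attr("id")
hidden_df <- commented %>% html_node("#hidden") %>% html_table()
```

To use this on the real page, replace `read_html(html)` with `read_html(content(doc, "text"))` from the answer above and inspect `hidden_ids` before choosing a selector.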