Question

I've been using R with the httr and plyr libraries to scrape data from an API. It's quite straightforward and works with the following code:

library(httr)
library(plyr)

headers <- c("Accept" = "application/json, text/javascript",
         "Accept-Encoding" = "gzip, deflate, sdch",
         "Connection" = "keep-alive",
         "Referer" = "http://www.afl.com.au/stat",
         "Host" = "www.afl.com.au",
         "User-Agent" = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Safari/537.36",
         "X-Requested-With"= "XMLHttpRequest",
         "X-media-mis-token" = "f31fcfedacc75b1f1b07d5a08887f078")

query <- GET("http://www.afl.com.au/api/cfs/afl/season?seasonId=CD_S2016014", add_headers(headers))

stats <- httr::content(query)

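My question concerns the request token required in the headers (i.e. X-media-mis-token). It is easy to obtain manually by inspecting the XHR requests in Chrome or Firefox, but the token is refreshed every 24 hours, which makes manual extraction a chore.

Is it possible to query the web page and extract this token automatically using R?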

Answer 1

You can get the X-media-mis-token token, but it comes with a disclaimer. ;)

library(httr)
token_url <- 'http://www.afl.com.au/api/cfs/afl/WMCTok'
token <- POST(token_url, encode="json")
content(token)$token
#[1] "f31fcfedacc75b1f1b07d5a08887f078"
content(token)$disclaimer
#[1] "All content and material contained within this site is protected by copyright owned by or licensed to Telstra. Unauthorised reproduction, publishing, transmission, distribution, copying or other use is prohibited."
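
To tie this back to the question, here is a minimal sketch of how the token request could be combined with the original GET call so the token never has to be copied by hand. The helper names (fetch_token, build_headers, fetch_season) are made up for illustration, and the shortened header set is an assumption: if the server rejects the request, add back the remaining headers from the question.

```r
library(httr)

# Both URLs are taken from the question and the answer above.
token_url  <- "http://www.afl.com.au/api/cfs/afl/WMCTok"
season_url <- "http://www.afl.com.au/api/cfs/afl/season?seasonId=CD_S2016014"

# Hypothetical helper: request a fresh token and fail loudly if none comes back.
fetch_token <- function(url = token_url) {
  resp <- POST(url, encode = "json")
  stop_for_status(resp)
  token <- content(resp)$token
  if (is.null(token) || !nzchar(token)) {
    stop("No token found in the response")
  }
  token
}

# Hypothetical helper: build the header vector around a given token.
# Assumption: this subset of the question's headers is enough for the API.
build_headers <- function(token) {
  c("Accept"            = "application/json, text/javascript",
    "Referer"           = "http://www.afl.com.au/stat",
    "X-Requested-With"  = "XMLHttpRequest",
    "X-media-mis-token" = token)
}

# Hypothetical helper: fetch a fresh token, then run the query from the question.
fetch_season <- function(url = season_url) {
  query <- GET(url, add_headers(.headers = build_headers(fetch_token())))
  stop_for_status(query)
  content(query)
}
```

Calling stats <- fetch_season() would then request a new token on every run, which sidesteps the 24-hour expiry at the cost of one extra POST per call.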
