Answer 0 (score: 3)
Use the crandb API: https://github.com/metacran/crandb
The URL to call is https://crandb.r-pkg.org/MASS/all
On the command line, with jq:
curl https://crandb.r-pkg.org/MASS/all | jq '[.versions[]][0].Date'
#> "2009-05-06"
Or in R, with jqr:
library(jqr)
curl::curl_download( 'https://crandb.r-pkg.org/MASS/all', (f <- tempfile()))
jq(paste0(readLines(f), collapse=""), "[.versions[]][0].Date")
#> "2009-05-06"
Or in R, with jsonlite:
jsonlite::fromJSON("https://crandb.r-pkg.org/MASS/all")$versions[[1]]$Date
#> [1] "2009-05-06"
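The calls above read the Date field of the first entry in versions. As a minimal sketch, assuming some entries might lack a Date field, the same list structure can be scanned in base R and the minimum taken. The `versions` list below is a hypothetical stand-in for the parsed response; its version numbers and dates are illustrative placeholders, not real crandb data.

```r
# Hypothetical stand-in for fromJSON(...)$versions; values are placeholders.
versions <- list(
  "0.1.0" = list(Package = "demo"),                       # no Date field
  "0.2.0" = list(Package = "demo", Date = "2014-11-20"),
  "0.3.0" = list(Package = "demo", Date = "2015-03-01")
)

# Pull the Date field from each entry, using NA where it is absent.
dates <- vapply(
  versions,
  function(v) if (is.null(v$Date)) NA_character_ else as.character(v$Date),
  character(1)
)

# Parse as Date and take the earliest non-missing value.
parsed <- as.Date(dates, format = "%Y-%m-%d")
earliest <- min(parsed, na.rm = TRUE)

# Name of the version that carries the earliest date.
earliest_version <- names(dates)[which.min(as.numeric(parsed))]
```

With real data, `versions` would be replaced by the versions element returned by jsonlite::fromJSON for the package of interest.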
Answer 1 (score: 1)
Similar to your link, you can use:
library(rvest)
library(dplyr)
nCRANArchived <- function(pkg) {
  link <- read_html(paste0("https://cran.r-project.org/src/contrib/Archive/", pkg))
  link %>%
    html_node('table') %>%
    html_table() %>%
    select(`Last modified`) %>%
    filter(`Last modified` != '') %>%
    head(1)
}
nCRANArchived('dplyr')
# Last modified
#1 2014-01-29 21:24
nCRANArchived('data.table')
# Last modified
#1 2006-04-15 00:03
nCRANArchived('tableHTML')
# Last modified
#1 2016-06-26 09:30
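The function above returns the first row with a non-empty Last modified value, so it depends on the order of rows in the listing. A minimal sketch that does not depend on row order parses the timestamps and takes the minimum. The `listing` data frame and the `earliest_release` helper are hypothetical; the file names and timestamps are illustrative placeholders, not scraped values.

```r
# Hypothetical table shaped like the scraped Archive listing.
listing <- data.frame(
  Name = c("Parent Directory", "demo_0.10.0.tar.gz", "demo_0.9.0.tar.gz"),
  `Last modified` = c("", "2016-02-01 10:00", "2015-07-15 08:30"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

earliest_release <- function(tbl) {
  # Parse the timestamps; empty strings become NA.
  stamps <- as.POSIXct(tbl[["Last modified"]],
                       format = "%Y-%m-%d %H:%M", tz = "UTC")
  keep <- !is.na(stamps)
  out <- data.frame(Name = tbl$Name[keep], Date = stamps[keep],
                    stringsAsFactors = FALSE)
  # Return the row with the smallest timestamp, whatever the row order.
  out[which.min(as.numeric(out$Date)), , drop = FALSE]
}

res <- earliest_release(listing)
```

In practice `listing` would be the result of html_table() in the function above. The time zone is set to UTC here only to make the parsing reproducible.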
Answer 2 (score: 1)
I am more of a data.table user, so here is an rvest solution that otherwise uses only data.table and anytime:
R> library(rvest); library(data.table); library(anytime)
R> url <- "https://cran.r-project.org/src/contrib/Archive/Rcpp"
R> dat <- setDT(html_table(html_node(read_html(url), "table")))
R> dat[, Date := anytime(`Last modified`)][ !is.na(Date), .(Name, Date)][1:10,]
                 Name                Date
 1: Rcpp_0.6.0.tar.gz 2008-11-06 19:15:00
 2: Rcpp_0.6.1.tar.gz 2008-11-30 20:19:00
 3: Rcpp_0.6.2.tar.gz 2008-12-04 09:19:00
 4: Rcpp_0.6.3.tar.gz 2009-01-11 20:11:00
 5: Rcpp_0.6.4.tar.gz 2009-03-02 09:54:00
 6: Rcpp_0.6.5.tar.gz 2009-04-03 11:41:00
 7: Rcpp_0.6.6.tar.gz 2009-08-04 16:22:00
 8: Rcpp_0.6.7.tar.gz 2009-11-08 19:23:00
 9: Rcpp_0.6.8.tar.gz 2009-11-10 11:15:00
10: Rcpp_0.7.0.tar.gz 2009-12-20 10:58:00
R>
At this point Date is a POSIXct, so you can also compute on it freely: sort it, take differences, aggregate, and so on.
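As a rough illustration of those operations, here is a base R sketch using the ten timestamps printed above. Parsing them as UTC is an assumption made for reproducibility; anytime() would normally use the local time zone.

```r
# Timestamps copied from the Rcpp listing above; UTC is an assumption.
dates <- as.POSIXct(c(
  "2008-11-06 19:15:00", "2008-11-30 20:19:00", "2008-12-04 09:19:00",
  "2009-01-11 20:11:00", "2009-03-02 09:54:00", "2009-04-03 11:41:00",
  "2009-08-04 16:22:00", "2009-11-08 19:23:00", "2009-11-10 11:15:00",
  "2009-12-20 10:58:00"
), tz = "UTC")

# Sort, then compute the gap between consecutive releases in days.
sorted <- sort(dates)
gaps <- diff(sorted)
units(gaps) <- "days"

# Aggregate: number of releases per calendar year.
per_year <- table(format(sorted, "%Y"))

# Earliest release in this subset.
first_release <- min(dates)
```

The same steps work on the Date column of the data.table directly, for example by passing dat$Date in place of `dates`.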
Answer 3 (score: 1)
library(htmltab)
foo = function(pkg){
  url = paste0("https://cran.r-project.org/src/contrib/Archive/", pkg, "/")
  suppressWarnings(min(na.omit(lubridate::ymd_hm(htmltab(url, 1)$`Last modified`))))
}
foo("ggplot2")
#[1] "2007-06-01 14:27:00 UTC"
foo("tidyverse")
#[1] "2016-09-09 18:07:00 UTC"