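Adding a list of lists to a single column of a data frame in R

Time: 2016-01-10 10:52:22

Tags: r dataframe rvest

I am using the following R code to scrape information from a web page: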

library(rvest)
crickbuzz <- read_html(httr::GET("http://www.cricbuzz.com/cricket-match/live-scores"))
matches_dates <- crickbuzz %>%
  html_nodes(".schedule-date:nth-child(1)") %>%
  html_attr("timestamp")

matches_dates
[1] "1452268800000" "1452132000000" "1452247200000" "1452242400000" "1452327000000" "1452290400000" "1452310200000" "1452310200000" "1452310200000"
[10] "1452310200000" "1452324600000" "1452324600000" "1452324600000" "1452324600000" "1452324600000" "1452150000000" "1452153600000" "1452153600000"

Now I convert these to a proper date and time format:

dates <- lapply(X = matches_dates, function(timestamp_match) {
  as.POSIXct(as.numeric(timestamp_match) / 1000, origin = "1970-01-01")
})

The dates now look like this:

dates
[[1]]
[1] "2016-01-10 07:30:00 IST"

[[2]]
[1] "2016-01-10 21:30:00 IST"

[[3]]
[1] "2016-01-09 12:00:00 IST"

[[4]]
[1] "2016-01-10 13:55:00 IST"

[[5]]
[1] "2016-01-10 10:50:00 IST"

[[6]]
[1] "2016-01-07 12:30:00 IST"

[[7]]
[1] "2016-01-07 13:30:00 IST"

[[8]]
[1] "2016-01-10 09:00:00 IST"

[[9]]
[1] "2016-01-10 09:00:00 IST"

[[10]]
[1] "2016-01-10 09:00:00 IST"

[[11]]
[1] "2016-01-10 09:00:00 IST"

[[12]]
[1] "2016-01-10 09:00:00 IST"

[[13]]
[1] "2016-01-10 13:00:00 IST"

[[14]]
[1] "2016-01-10 13:00:00 IST"

[[15]]
[1] "2016-01-10 13:00:00 IST"

[[16]]
[1] "2016-01-10 13:00:00 IST"

[[17]]
[1] "2016-01-10 03:30:00 IST"

[[18]]
[1] "2016-01-10 03:30:00 IST"

Now I assign them to a single column of a data frame:

matches_info[, "Date And Time"] <- dates

But only the first date is copied into the entire column, and I get the following warning:

Warning message:
 In `[<-.data.frame`(`*tmp*`, , "Date And Time", value = list(1452391200,  :
 provided 18 variables to replace 1 variables

If I call unlist(dates), I get the numeric timestamps back again. How can I get the dates and times into the column?

1 Answer:

Answer 0: (score: 1)

Try do.call(c, dates) instead of unlist(dates). This stops R from converting the list elements to numeric and keeps them as POSIXct:

matches_date <- c("1452268800000", "1452132000000")
dates <- lapply(X = matches_date ,  function(timestamp_match){
 (as.POSIXct(as.numeric(timestamp_match)/1000, origin="1970-01-01")) })
do.call(c, dates)
# [1] "2016-01-08 17:00:00 CET" "2016-01-07 03:00:00 CET"

matches_info[,"Date And Time"] <- do.call(c, dates)
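
To see why this works, here is a minimal sketch comparing the two calls on the same two timestamps. The variable names ms, flat and combined are made up for illustration, and tz = "UTC" is an assumption added only so the result does not depend on the session's time zone:

```r
# Two timestamps in milliseconds since the Unix epoch (taken from the question).
ms <- c("1452268800000", "1452132000000")

# Convert each one separately, which yields a list of POSIXct values.
dates <- lapply(ms, function(x) {
  as.POSIXct(as.numeric(x) / 1000, origin = "1970-01-01", tz = "UTC")
})

# unlist() drops the class attribute, leaving plain seconds since the epoch.
flat <- unlist(dates)
class(flat)

# c() has a POSIXct method, so do.call(c, ...) keeps the date-time class.
combined <- do.call(c, dates)
class(combined)
```

The first class() call should report a plain numeric vector and the second a POSIXct one, which is what a data frame column needs in order to display as dates.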

Or simply:

matches_date <- c("1452268800000", "1452132000000")
matches_info[,"Date And Time"] <- as.POSIXct(as.numeric(matches_date)/1000, origin="1970-01-01")
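
For a self-contained version of the vectorised approach, something like the following should work. The question does not show what matches_info contains, so the one-column data frame here is a hypothetical stand-in, and tz = "Asia/Kolkata" is an assumption based on the IST times in the question (leave tz out to use the session's time zone):

```r
# Hypothetical stand-in for the asker's matches_info (its real columns are not shown).
matches_info <- data.frame(match = c("Match A", "Match B"),
                           stringsAsFactors = FALSE)

matches_date <- c("1452268800000", "1452132000000")

# as.numeric() and as.POSIXct() are both vectorised, so no lapply() is needed
# and the result is already a single POSIXct vector, one element per row.
matches_info[, "Date And Time"] <- as.POSIXct(
  as.numeric(matches_date) / 1000,   # milliseconds to seconds
  origin = "1970-01-01",
  tz = "Asia/Kolkata"                # assumed time zone, adjust as needed
)

str(matches_info)
```

Because the right-hand side is an atomic vector with the same length as the number of rows, the assignment fills the column element by element instead of treating each date as a separate variable.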