R: creating a vector from the earliest date in another column throws a POSIX error

Time: 2017-04-22 01:42:02

Tags: r date

I am referring to another post that seems to offer exactly the solution I am looking for:

Creating new column based on earliest date value in other column in R

Here is my sample data:

structure(list(ID = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("a1", "b1"), class = "factor"), Begin = structure(list(sec = c(0, 0, 0, 0, 0, 0), min = c(0L, 0L, 0L, 0L, 0L, 0L), hour = c(0L, 0L, 0L, 0L, 0L, 0L), mday = c(28L, 4L, 10L, 10L, 12L, 13L), mon = c(11L, 11L, 11L, 11L, 11L, 11L), year = c(115L, 115L,115L, 115L, 115L, 115L), wday = c(1L, 5L, 4L, 4L, 6L, 0L), yday = c(361L, 337L, 343L, 343L, 345L, 346L), isdst = c(0L, 0L, 0L, 0L, 0L, 0L), zone = c("PST", "PST", "PST", "PST", "PST", "PST"), gmtoff = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_)), .Names = c("sec", "min", "hour", "mday", "mon", "year", "wday", "yday", "isdst", "zone", "gmtoff"), class = c("POSIXlt", "POSIXt"))), .Names = c("ID", "Begin"), row.names = c(NA, -6L), class = "data.frame")

Here is the result I am looking for:

structure(list(ID = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("a1", "b1"), class = "factor"), Begin = structure(list(sec = c(0, 0, 0, 0, 0, 0), min = c(0L, 0L, 0L, 0L, 0L, 0L), hour = c(0L, 0L, 0L, 0L, 0L, 0L), mday = c(28L, 4L, 10L, 10L, 12L, 13L), mon = c(11L, 11L, 11L, 11L, 11L, 11L), year = c(115L, 115L, 115L, 115L, 115L, 115L), wday = c(1L, 5L, 4L, 4L, 6L, 0L), yday = c(361L, 337L, 343L, 343L, 345L, 346L), isdst = c(0L, 0L, 0L, 0L, 0L, 0L), zone = c("PST", "PST", "PST", "PST", "PST", "PST"), gmtoff = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_)), .Names = c("sec", "min", "hour", "mday", "mon", "year", "wday", "yday", "isdst", "zone", "gmtoff"), class = c("POSIXlt", "POSIXt")), BeginE = structure(list(
sec = c(0, 0, 0, 0, 0, 0), min = c(0L, 0L, 0L, 0L, 0L, 0L
), hour = c(0L, 0L, 0L, 0L, 0L, 0L), mday = c(4L, 4L, 4L, 
10L, 10L, 10L), mon = c(11L, 11L, 11L, 11L, 11L, 11L), year = c(115L, 
115L, 115L, 115L, 115L, 115L), wday = c(5L, 5L, 5L, 4L, 4L, 
4L), yday = c(337L, 337L, 337L, 343L, 343L, 343L), isdst = c(0L, 
0L, 0L, 0L, 0L, 0L), zone = c("PST", "PST", "PST", "PST", 
"PST", "PST"), gmtoff = c(NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_)), .Names = c("sec", "min", "hour", "mday", "mon", "year", "wday", "yday", "isdst", "zone", "gmtoff"), class = c("POSIXlt", "POSIXt"))), .Names = c("ID", "Begin", "BeginE"), row.names = c(NA, -6L), class = "data.frame")   

In response to the helpful comment about providing all of my code, this is what I tried:

library(data.table)
df2 <- as.data.frame(data.table(df)[, BeginE := min(Begin), by = ID])

This is the error:

`Error in as.POSIXct.POSIXlt(X[[i]], ...) : invalid 'x' argument`

1 Answer:

Answer 0: (score: 1)

I solved this with a simple conversion of the column from POSIXlt to POSIXct:

df$Begin <- as.POSIXct(df$Begin)

It also works on my much larger data set.
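For anyone who wants to see the conversion and the grouped minimum together, here is a minimal sketch. It assumes the `data.table` package is installed. The data frame is a shortened stand-in for the `dput` output in the question, not the original object, and the `UTC` time zone is an assumption made for reproducibility (the question's data is tagged `PST`). The likely reason the conversion helps is that a POSIXlt value is stored as a list of components (`sec`, `min`, `hour`, ...), which `data.table` does not support as a column type, while POSIXct is a single numeric vector.

```r
library(data.table)

# Stand-in for the question's data (hypothetical reconstruction).
# Assigning with `$<-` keeps Begin as POSIXlt, as in the question.
df <- data.frame(ID = factor(c("a1", "a1", "a1", "b1", "b1", "b1")))
df$Begin <- as.POSIXlt(
  c("2015-12-28", "2015-12-04", "2015-12-10",
    "2015-12-10", "2015-12-12", "2015-12-13"),
  tz = "UTC"  # assumption: UTC instead of the original PST
)

# Convert the list-based POSIXlt column to numeric-based POSIXct
# before handing the data frame to data.table.
df$Begin <- as.POSIXct(df$Begin)

# Earliest Begin per ID, repeated on every row of that group.
df2 <- as.data.frame(data.table(df)[, BeginE := min(Begin), by = ID])
df2
```

If a POSIXlt column is needed afterwards, it can be converted back with `as.POSIXlt()` once the grouping is done.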