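I have a dataset covering multiple dates. I want to lag the values of Cells by one month. I probably can't use shift(), because each month has a different number of days (not to mention that some dates are missing).
What I did was create a new data table containing the unique Year and Month, shift/lag Cells, and then merge it back into the original data table (taking care not to end up with duplicate columns).
Obviously, that is not efficient. Is there another way?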
Answer 0 (score: 1)
# Replace your library() usage with pacman and you'll thank me
# pacman installs if needed, loads, and doesn't require quotation marks
pacman::p_load(data.table, lubridate)
DT <- fread('DATE, ID, Cells
2000-01-01, 1, 10
2000-01-02, 1, 10
2000-01-03, 1, 10
2000-01-01, 2, 20
2000-01-02, 2, 20
2000-01-03, 2, 20
2000-01-04, 2, 20
2000-02-01, 1, 30
2000-02-02, 1, 30
2000-02-01, 2, 40
2000-02-03, 2, 40
2000-02-04, 2, 40
2000-03-01, 1, 50
2000-03-02, 1, 50
2000-03-01, 2, 60
2000-03-03, 2, 60
')
DT$date <- ymd(DT$DATE)
DT$month <- format(DT$date, "%b")  # abbreviated month name (locale-dependent)
# Positional shift: prepend one NA per January row, then truncate to nrow(DT).
# Values are not matched by ID or day, and the rows must already be sorted by month.
n.jan <- sum(DT$month == "Jan")
DT$lag.cells <- c(rep(NA, n.jan), DT$Cells)[seq_len(nrow(DT))]
DT
          DATE ID Cells       date month lag.cells
 1: 2000-01-01  1    10 2000-01-01   Jan        NA
 2: 2000-01-02  1    10 2000-01-02   Jan        NA
 3: 2000-01-03  1    10 2000-01-03   Jan        NA
 4: 2000-01-01  2    20 2000-01-01   Jan        NA
 5: 2000-01-02  2    20 2000-01-02   Jan        NA
 6: 2000-01-03  2    20 2000-01-03   Jan        NA
 7: 2000-01-04  2    20 2000-01-04   Jan        NA
 8: 2000-02-01  1    30 2000-02-01   Feb        10
 9: 2000-02-02  1    30 2000-02-02   Feb        10
10: 2000-02-01  2    40 2000-02-01   Feb        10
11: 2000-02-03  2    40 2000-02-03   Feb        20
12: 2000-02-04  2    40 2000-02-04   Feb        20
13: 2000-03-01  1    50 2000-03-01   Mar        20
14: 2000-03-02  1    50 2000-03-02   Mar        20
15: 2000-03-01  2    60 2000-03-01   Mar        30
16: 2000-03-03  2    60 2000-03-03   Mar        30
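Note that in the output above the shift is purely positional: row 10 belongs to ID 2 but receives 10, a value recorded for ID 1. If the lag should follow each ID, one possible alternative is a keyed update join on a month index. The following is only a sketch, not the answerer's method. It uses a shortened copy of the sample data, the column name ym and the table name monthly are made up for illustration, and it assumes Cells is constant within each ID and month (as it is in the sample).

```r
library(data.table)

# Shortened copy of the sample data (an assumption made for brevity)
DT <- data.table(
  DATE  = as.Date(c("2000-01-01", "2000-01-02", "2000-01-01", "2000-02-01",
                    "2000-02-02", "2000-02-03", "2000-03-01", "2000-03-03")),
  ID    = c(1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L),
  Cells = c(10L, 10L, 20L, 30L, 30L, 40L, 50L, 60L)
)

# Month index (hypothetical helper column): counts months across years,
# so December to January is also a step of exactly one.
DT[, ym := year(DATE) * 12L + month(DATE)]

# One row per ID and month. Assumption: Cells does not vary within an
# ID-month, so keeping the first value loses nothing.
monthly <- DT[, .(Cells = Cells[1L]), by = .(ID, ym)]

# Re-label each monthly value with the month in which it should appear.
monthly[, ym := ym + 1L]

# Update join: adds lag.cells by reference; rows without a previous month get NA.
DT[monthly, on = .(ID, ym), lag.cells := i.Cells]

print(DT)
```

Because the join key is the month rather than the row position, differing month lengths and missing days do not affect which value is picked up, and no intermediate merged copy of the full table is created.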
Answer 1 (score: 0)
seq() for the Date class supports by = "month", "quarter", "year", and so on.
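As a quick illustration of that behaviour in base R (the starting date and variable names are arbitrary choices for this sketch):

```r
d <- as.Date("2000-01-15")

# Step forward by calendar units rather than by a fixed number of days
by_month   <- seq(d, by = "month",   length.out = 3)
by_quarter <- seq(d, by = "quarter", length.out = 3)
by_year    <- seq(d, by = "year",    length.out = 3)

print(by_month)    # "2000-01-15" "2000-02-15" "2000-03-15"
print(by_quarter)  # "2000-01-15" "2000-04-15" "2000-07-15"
print(by_year)     # "2000-01-15" "2001-01-15" "2002-01-15"

# The second element is the original date moved forward by one step
next_month <- by_month[2]
```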
It's not that elegant, but you can do something like this.
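library(magrittr)
DT[, DATE := as.Date(DATE)]
# For each date, take the second element of a two-step monthly sequence,
# i.e. the same day one month later. sapply() drops the Date class, so restore it.
DT[, DATE_lag := sapply(DATE, function(x)
  seq(x, by = "1 month", length.out = 2)[2]) %>%
  as.Date(origin = "1970-01-01")]
# Re-label each row with its shifted date, then join back on DATE and ID so that
# each row picks up the value recorded one month earlier (NA if there is none).
DT2 <- DT[, .(DATE_lag, ID, Cells)]
setnames(DT2, c("DATE_lag", "Cells"), c("DATE", "Lagged.Cells"))
merge(DT, DT2, by = c("DATE", "ID"), all.x = TRUE)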
          DATE ID Cells       date month lag.cells   DATE_lag Lagged.Cells
 1: 2000-01-01  1    10 2000-01-01   Jan        NA 2000-02-01           NA
 2: 2000-01-01  2    20 2000-01-01   Jan        NA 2000-02-01           NA
 3: 2000-01-02  1    10 2000-01-02   Jan        NA 2000-02-02           NA
 4: 2000-01-02  2    20 2000-01-02   Jan        NA 2000-02-02           NA
 5: 2000-01-03  1    10 2000-01-03   Jan        NA 2000-02-03           NA
 6: 2000-01-03  2    20 2000-01-03   Jan        NA 2000-02-03           NA
 7: 2000-01-04  2    20 2000-01-04   Jan        NA 2000-02-04           NA
 8: 2000-02-01  1    30 2000-02-01   Feb        10 2000-03-01           10
 9: 2000-02-01  2    40 2000-02-01   Feb        10 2000-03-01           20
10: 2000-02-02  1    30 2000-02-02   Feb        10 2000-03-02           10
11: 2000-02-03  2    40 2000-02-03   Feb        20 2000-03-03           20
12: 2000-02-04  2    40 2000-02-04   Feb        20 2000-03-04           20
13: 2000-03-01  1    50 2000-03-01   Mar        20 2000-04-01           30
14: 2000-03-01  2    60 2000-03-01   Mar        30 2000-04-01           40
15: 2000-03-02  1    50 2000-03-02   Mar        20 2000-04-02           30
16: 2000-03-03  2    60 2000-03-03   Mar        30 2000-04-03           40
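The per-date sapply() step could probably be replaced with a vectorised call. The sketch below uses lubridate's %m+% operator, which is not part of the answer above, and the example dates are chosen only to show month-end behaviour. The two approaches agree for days that exist in the following month but differ at month ends: %m+% rolls back to the last valid day, while seq() with by = "month" counts forward into the next month.

```r
library(lubridate)

dates <- as.Date(c("2000-01-04", "2000-01-31", "2000-02-29"))

# Vectorised one-month shift; an invalid day such as Feb 31 is rolled back
# to the last day of that month.
shifted <- dates %m+% months(1)
print(shifted)     # "2000-02-04" "2000-02-29" "2000-03-29"

# seq() advances the month first and lets an invalid day spill over instead.
seq_shifted <- seq(as.Date("2000-01-31"), by = "month", length.out = 2)[2]
print(seq_shifted) # "2000-03-02"
```

Which behaviour is preferable depends on how month-end observations should be matched. With the exact-date join used above, a shifted date that falls on a day with no observation simply yields NA.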