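I need to create a sequence of values (named "seq" in the data frame below) from a chronological object (here, dates). A new sequence should start only when the time gap between two consecutive dates is strictly greater than 1 hour.
Here is an example: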
ID date seq
A 2010-04-14 02:00:12 1
A 2010-04-14 02:00:12 1
A 2010-04-14 03:00:10 1
A 2010-04-14 03:00:10 1
A 2010-04-14 04:00:15 1
A 2010-04-14 04:00:15 1
A 2010-04-14 08:00:10 2
A 2010-04-14 08:00:10 2
B 2010-04-14 03:00:18 3
B 2010-04-14 03:00:18 3
B 2010-04-14 04:00:10 3
B 2010-04-14 04:00:10 3
B 2010-04-14 10:00:14 4
B 2010-04-14 10:00:14 4
B 2010-04-14 11:00:10 4
B 2010-04-14 11:00:10 4
Data
tab <- data.frame(ID= rep(c("A","B"), each=8), date= as.POSIXct( c('2010-04-14 02:00:12','2010-04-14 02:00:12','2010-04-14 03:00:10', '2010-04-14 03:00:10','2010-04-14 04:00:15','2010-04-14 04:00:15','2010-04-14 08:00:10','2010-04-14 08:00:10','2010-04-14 03:00:18','2010-04-14 03:00:18','2010-04-14 04:00:10','2010-04-14 04:00:10','2010-04-14 10:00:14','2010-04-14 10:00:14','2010-04-14 11:00:10','2010-04-14 11:00:10'), format='%Y-%m-%d %H:%M:%S'))
Answer 0 (score: 1)
This line of code should do the job:
tab$seq <- floor(as.numeric(tab$date-min(tab$date))/3600)
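To see what this expression produces on the sample data, here is a small sketch. It rebuilds tab with tz = "UTC" (an assumption added here so the result does not depend on the local time zone) and asks difftime for seconds explicitly instead of relying on the automatically chosen unit. The name elapsed_hours is introduced only for illustration.

```r
# Rebuild the sample data; tz = "UTC" is an assumption, not part of the question.
stamps <- c("2010-04-14 02:00:12", "2010-04-14 03:00:10",
            "2010-04-14 04:00:15", "2010-04-14 08:00:10",
            "2010-04-14 03:00:18", "2010-04-14 04:00:10",
            "2010-04-14 10:00:14", "2010-04-14 11:00:10")
tab <- data.frame(
  ID   = rep(c("A", "B"), each = 8),
  date = as.POSIXct(rep(stamps, each = 2), tz = "UTC")
)

# Seconds elapsed since the earliest timestamp, with the unit stated explicitly.
elapsed_secs <- as.numeric(difftime(tab$date, min(tab$date), units = "secs"))

# Whole hours elapsed since the earliest timestamp.
elapsed_hours <- floor(elapsed_secs / 3600)
print(elapsed_hours)
```

On this data the result is 0 0 0 0 2 2 5 5 1 1 1 1 8 8 8 8, that is, the number of whole hours since the earliest timestamp rather than a counter that goes up by one at each gap.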
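Answer 1 (score: 1)
Your desired output does not look right: "2010-04-14 03:00:10" and "2010-04-14 04:00:15" are 1 hour and 5 seconds apart, which is more than 1 hour, yet seq does not increment there. It is also unclear from your example whether the sequence should increment when ID changes.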
Assuming that seq should increment between "2010-04-14 03:00:10" and "2010-04-14 04:00:15", and that the values in ID should not affect seq, here is a solution:
tab$seq <- c(0, cumsum(abs(diff(tab$date)) > 3600)) + 1
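The one-liner compares a difftime with 3600, which only means "3600 seconds" when diff() happens to pick seconds as the unit. The following is a minimal sketch of the same idea with the unit made explicit. It rebuilds the sample data with tz = "UTC" (an assumption), and the names gap_secs, seq_global and seq_by_id are introduced here for illustration. The second variant, which also starts a new run whenever ID changes, is one possible reading of the question, not something the question confirms.

```r
# Rebuild the sample data; tz = "UTC" is an assumption, not part of the question.
stamps <- c("2010-04-14 02:00:12", "2010-04-14 03:00:10",
            "2010-04-14 04:00:15", "2010-04-14 08:00:10",
            "2010-04-14 03:00:18", "2010-04-14 04:00:10",
            "2010-04-14 10:00:14", "2010-04-14 11:00:10")
tab <- data.frame(
  ID   = rep(c("A", "B"), each = 8),
  date = as.POSIXct(rep(stamps, each = 2), tz = "UTC")
)

# Absolute gap between consecutive rows, always expressed in seconds.
gap_secs <- abs(as.numeric(diff(tab$date), units = "secs"))

# Variant 1: ignore ID; a new run starts when the gap exceeds one hour.
seq_global <- cumsum(c(TRUE, gap_secs > 3600))

# Variant 2 (assumption): a change of ID also starts a new run.
id_changed <- tab$ID[-1] != tab$ID[-nrow(tab)]
seq_by_id  <- cumsum(c(TRUE, gap_secs > 3600 | id_changed))

tab$seq <- seq_global
print(tab)
```

On the sample data both variants give 1 1 1 1 2 2 3 3 4 4 4 4 5 5 5 5, because the gap across the change from A to B is itself longer than one hour. They would differ on data where the first row of a new ID falls within an hour of the last row of the previous ID.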