I have some data that looks like this:
id time
1 2013-02-04 02:20:59
1 2013-02-04 02:21:05
1 2013-02-04 02:21:24
2 2013-02-04 02:21:26
2 2013-02-04 02:22:19
2 2013-02-04 02:22:35
For each id, I want to compute the time difference between consecutive time values, for example:
id 1: 02:21:05 - 02:20:59 = 00:00:06.
How can I do this in R?
Answer 0 (score: 3)
You can apply diff to both time and id, then use ifelse to fill a third column:
df <- structure(list(id = c(1L, 1L, 1L, 2L, 2L, 2L),
time = structure(c(1359915659, 1359915665, 1359915684,
1359915686, 1359915739, 1359915755), class = c("POSIXct",
"POSIXt"), tzone = "")), .Names = c("id", "time"), row.names = c(NA, -6L),
class = "data.frame")
df
## id time
## 1 1 2013-02-04 02:20:59
## 2 1 2013-02-04 02:21:05
## 3 1 2013-02-04 02:21:24
## 4 2 2013-02-04 02:21:26
## 5 2 2013-02-04 02:22:19
## 6 2 2013-02-04 02:22:35
## keep the time difference only where the id is unchanged from the previous row;
## as.numeric() gives seconds regardless of the units diff() would pick for POSIXct
df$result <- c(0, ifelse(diff(df$id) == 0, diff(as.numeric(df$time)), 0))
df
## id time result
## 1 1 2013-02-04 02:20:59 0
## 2 1 2013-02-04 02:21:05 6
## 3 1 2013-02-04 02:21:24 19
## 4 2 2013-02-04 02:21:26 0
## 5 2 2013-02-04 02:22:19 53
## 6 2 2013-02-04 02:22:35 16
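The result column above holds plain seconds, while the question asks for output such as 00:00:06. Below is a minimal sketch of one way to format seconds as hh:mm:ss. The helper name fmt_hms is hypothetical (it is not part of the answer), and the sketch assumes non-negative differences.

```r
# fmt_hms is a hypothetical helper, not from the original answer.
# It assumes non-negative durations given in seconds.
fmt_hms <- function(secs) {
  secs <- as.integer(round(secs))
  sprintf("%02d:%02d:%02d",
          secs %/% 3600L,           # whole hours
          (secs %% 3600L) %/% 60L,  # remaining whole minutes
          secs %% 60L)              # remaining seconds
}

hms <- fmt_hms(c(0, 6, 19, 0, 53, 16))
print(hms)
```

Applied to a data frame, something like df$result_hms <- fmt_hms(df$result) would add the formatted values as a character column.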
Answer 1 (score: 1)
Here is a solution using by and transform:
transform(dat, res = unlist(by(time,as.factor(id),
FUN=function(x)c(0,diff(x)))))
This works with a factor id, which is the natural type for a grouping column.
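As a related sketch (an alternative, not the answerer's code), base R's ave applies a function within each group and returns the values in the original row order, so the result can be assigned straight to a new column. The data frame dat is rebuilt here from the question's sample; parsing the times as UTC is an assumption made for reproducibility.

```r
# Sample data from the question; tz = "UTC" is an assumption.
dat <- data.frame(
  id = c(1L, 1L, 1L, 2L, 2L, 2L),
  time = as.POSIXct(c("2013-02-04 02:20:59", "2013-02-04 02:21:05",
                      "2013-02-04 02:21:24", "2013-02-04 02:21:26",
                      "2013-02-04 02:22:19", "2013-02-04 02:22:35"),
                    tz = "UTC")
)

# Within each id, take differences between consecutive times (in seconds),
# with 0 for the first row of the group.
dat$res <- ave(as.numeric(dat$time), dat$id,
               FUN = function(x) c(0, diff(x)))
print(dat)
```

Like the other answers, this assumes the rows are already sorted by time within each id.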
Answer 2 (score: 0)
I think this will work, but I'm not entirely clear on what you're asking for...
df <- data.frame(id = rep(c(1, 2), each = 3),
                 time = seq(from = as.POSIXct("2013-02-04 02:20:59"),
                            to = as.POSIXct("2013-02-04 02:22:35"),
                            length.out = 6))
library(plyr)
df.diff <- ddply(df, .(id), summarise,
difference = diff(as.numeric(time)))
df.diff
# id difference
# 1 1 19.2
# 2 1 19.2
# 3 2 19.2
# 4 2 19.2