R: get the difference between every other row and the row before it

Time: 2018-04-18 10:00:14

Tags: r

I have a data frame like this:

   MemId                  ET
1      1 2017-10-01 09:10:56
2      2 2017-10-01 09:11:59
14     3 2017-10-01 13:01:36
15     4 2017-10-01 13:03:46
37     5 2017-10-01 14:59:04
38     6 2017-10-01 15:58:28

dput version:

structure(list(MemId = c(1, 2, 3, 4, 5,6), ET = structure(c(1506829256,
 1506829319, 1506843096,1506843226, 1506850144, 1506853708), class = 
c("POSIXct", "POSIXt"))), .Names = c("MemId", "ET"), row.names = c("1",
 "2", "14", "15", "37", "38"), class = "data.frame")

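I want the time difference between every other row and the row before it. For example, I want the difference between row 2 and row 1 and, likewise, between row 15 and row 14.

The time difference between row 2 and row 1 is 63 seconds, and likewise the difference between row 4 and row 3 is 130 seconds. The desired output is: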

   MemId                  ET diff
1      1 2017-10-01 09:10:56    0
2      2 2017-10-01 09:11:59   63
14     3 2017-10-01 13:01:36    0
15     4 2017-10-01 13:03:46  130
37     5 2017-10-01 14:59:04    0
38     6 2017-10-01 15:58:28 3564

I tried the following (time is the name of the data frame):

time$diff <- with(time, {
 as.numeric(ET[c(FALSE, TRUE)]) - as.numeric(ET[c(TRUE,FALSE)])
})

This just gives:

   MemId                  ET diff
1      1 2017-10-01 09:10:56   63
2      2 2017-10-01 09:11:59  130
14     3 2017-10-01 13:01:36 3564
15     4 2017-10-01 13:03:46   63
37     5 2017-10-01 14:59:04  130
38     6 2017-10-01 15:58:28 3564

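3 answers:

Answer 0 (score: 3)

You were close. Initialize the column to 0, then assign the differences only to every second row: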

df$diff <- 0
df$diff[c(FALSE, TRUE)] <- difftime(df$ET[c(FALSE, TRUE)], 
                                    df$ET[c(TRUE,FALSE)],
                                    units = "secs")
> df
#  MemId                  ET diff
#1     1 2017-10-01 09:10:56    0
#2     2 2017-10-01 09:11:59   63
#3     3 2017-10-01 13:01:36    0
#4     4 2017-10-01 13:03:46  130
#5     5 2017-10-01 14:59:04    0
#6     6 2017-10-01 15:58:28 3564
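
For context on why the attempt in the question repeated 63, 130, 3564: the subtraction yields only three values, and assigning a length-3 vector to a column of a 6-row data frame recycles it. A minimal sketch, assuming the data is rebuilt in UTC; the names gaps, recycled and fixed are illustrative and not part of the original post:

```r
# Rebuild the example data from the question (UTC is an assumption).
df <- data.frame(
  MemId = 1:6,
  ET = as.POSIXct(c(1506829256, 1506829319, 1506843096,
                    1506843226, 1506850144, 1506853708),
                  origin = "1970-01-01", tz = "UTC")
)

even <- c(FALSE, TRUE)  # recycled by R to select rows 2, 4, 6

# Only three differences exist: row 2 - row 1, row 4 - row 3, row 6 - row 5.
gaps <- as.numeric(df$ET[even]) - as.numeric(df$ET[!even])

# Assigning 3 values to a 6-row column silently recycles them.
recycled <- df
recycled$diff <- gaps

# Assigning into the even positions only keeps the zeros on the odd rows.
fixed <- df
fixed$diff <- 0
fixed$diff[even] <- gaps
```

Here recycled$diff reproduces the unwanted pattern from the question, while fixed$diff matches the desired output.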

Answer 1 (score: 1)

Here is another version using lubridate:

dput(
  structure(
    list(MemId = c(1, 2, 3, 4, 5, 6), ET = structure(
      c(
        1506829256,
        1506829319,
        1506843096,
        1506843226,
        1506850144,
        1506853708
      ),
      class =
        c("POSIXct", "POSIXt")
    )),
    .Names = c("MemId", "ET"),
    row.names = c("1",
                  "2", "14", "15", "37", "38"), class = "data.frame"),"time")

time<-dget("time")

pkg <- c("lubridate")
new.pkg <- pkg[!(pkg %in% installed.packages())]
# install
if (length(new.pkg)) {
  install.packages(new.pkg, repos = "http://cran.rstudio.com")
}

library(lubridate)

time$diff <-
  sapply(1:nrow(time), function(x)
    ifelse(
      x%%2==0,interval(time$ET[x-1],time$ET[x]),0
    ))

time

   MemId                  ET diff
1      1 2017-09-30 22:40:56    0
2      2 2017-09-30 22:41:59   63
14     3 2017-10-01 02:31:36    0
15     4 2017-10-01 02:33:46  130
37     5 2017-10-01 04:29:04    0
38     6 2017-10-01 05:28:28 3564
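
The timestamps printed here differ from those in the question, most likely because POSIXct values are displayed in the session's time zone; the differences in seconds are unaffected. A small sketch of that effect follows. The zone names are assumptions, picked only because their offsets reproduce the times printed in this thread:

```r
# First two timestamps from the question, stored as seconds since the epoch.
et <- as.POSIXct(c(1506829256, 1506829319),
                 origin = "1970-01-01", tz = "UTC")

fmt <- "%Y-%m-%d %H:%M:%S"

# The same instant rendered in three (assumed) time zones.
shown <- c(
  question = format(et[1], fmt, tz = "Asia/Kolkata"),
  answer1  = format(et[1], fmt, tz = "America/Chicago"),
  answer2  = format(et[1], fmt, tz = "Europe/Berlin")
)

# The gap between the two rows does not depend on the display zone.
gap <- as.numeric(difftime(et[2], et[1], units = "secs"))
```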

Answer 2 (score: 1)

A dplyr alternative that creates a dummy column (id).

library(dplyr)
df %>%
  mutate(id = rep_len(0:1, nrow(df))) %>%
  mutate(dif = ifelse(id == 1, difftime(ET, lag(ET), units = "secs"), NA))

Output:

  MemId                  ET id  dif
1     1 2017-10-01 05:40:56  0   NA
2     2 2017-10-01 05:41:59  1   63
3     3 2017-10-01 09:31:36  0   NA
4     4 2017-10-01 09:33:46  1  130
5     5 2017-10-01 11:29:04  0   NA
6     6 2017-10-01 12:28:28  1 3564
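
If dplyr is not available, or 0 is preferred over NA as in the desired output, the same lag-based idea can be sketched in base R. This is only an illustration: prev and is_second are hypothetical names, c(NA, head(x, -1)) stands in for lag(), and the data is rebuilt in UTC as an assumption:

```r
# Rebuild the example data from the question (UTC is an assumption).
df <- data.frame(
  MemId = 1:6,
  ET = as.POSIXct(c(1506829256, 1506829319, 1506843096,
                    1506843226, 1506850144, 1506853708),
                  origin = "1970-01-01", tz = "UTC")
)

secs <- as.numeric(df$ET)

# Shift the vector down by one position: base R stand-in for dplyr::lag().
prev <- c(NA, head(secs, -1))

# TRUE for rows 2, 4, 6, i.e. the second row of each pair.
is_second <- seq_along(secs) %% 2 == 0

# Keep the difference on the second row of each pair, 0 elsewhere.
df$dif <- ifelse(is_second, secs - prev, 0)
```

The differences between pairs (for example row 3 minus row 2) are computed but discarded, which is acceptable for a small data frame.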