Calculating elapsed time between interview dates in R

Time: 2014-04-02 22:23:42

Tags: r reshape date-arithmetic

My data looks like this:

dat<-data.frame(
subjid=c("a","a","a","b","b","c","c","d","e"),
type=c("baseline","first","second","baseline","first","baseline","first","baseline","baseline"),
date=c("2013-02-07","2013-02-27","2013-04-30","2013-03-03","2013-05-23","2013-01-02","2013-07-23","2013-03-29","2013-06-03"))

i.e.:

  subjid     type       date
1      a baseline 2013-02-07
2      a    first 2013-02-27
3      a   second 2013-04-30
4      b baseline 2013-03-03
5      b    first 2013-05-23
6      c baseline 2013-01-02
7      c    first 2013-07-23
8      d baseline 2013-03-29
9      e baseline 2013-06-03

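I am trying to create a variable "elapsed time" that gives the time elapsed from the baseline date to the first- and second-round interview dates (so the elapsed time for baseline = 0). Note that the number of follow-up interviews varies by subject: some subjects have only a baseline, others have a first or a first and second interview.

I tried reshaping the data so that I could subtract the dates from each other, but my brain isn't really working today. Or is there another way?

Please help, thank you.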

2 answers:

Answer 0: (score: 1)

ave can do this.

I'll throw an NA value in there, just for good measure:

dat<-data.frame(
subjid=c("a","a","a","b","b","c","c","d","e"),
type=c("baseline","first","second","baseline","first","baseline","first","baseline","baseline"),
date=c("2013-02-07","NA","2013-04-30","2013-03-03","2013-05-23","2013-01-02","2013-07-23","2013-03-29","2013-06-03"))

You should sort the data first, to be on the safe side, so that baseline is the first row within each subject:

dat$type <- ordered(dat$type,levels=c("baseline","first","second","third") )
dat <- dat[order(dat$subjid,dat$type),]

Convert your dates to proper Date objects:

dat$date <- as.Date(dat$date)
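As a small aside on the conversion step: the sketch below shows how a non-date string such as "NA" ends up as a missing Date, and that subtracting two Dates gives a difference in days. The vector x is made up for illustration, and passing format explicitly is my own choice here, not part of the answer; it means the result does not depend on which element comes first.

```r
# Hypothetical input: one unparseable entry followed by two real dates
x <- c("NA", "2013-02-07", "2013-04-30")

# With an explicit format, strings that do not match become NA
d <- as.Date(x, format = "%Y-%m-%d")

is.na(d)                  # which entries failed to parse
as.numeric(d[3] - d[2])   # difference between the two valid dates, in days
```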

Then calculate the differences, in days, from the first (baseline) date within each subject:

dat$elapsed <- ave(as.numeric(dat$date),dat$subjid,FUN=function(x) x-x[1] )

#  subjid     type       date  elapsed
#1      a baseline 2013-02-07        0
#2      a    first       <NA>       NA
#3      a   second 2013-04-30       82
#4      b baseline 2013-03-03        0
#5      b    first 2013-05-23       81
#6      c baseline 2013-01-02        0
#7      c    first 2013-07-23      202
#8      d baseline 2013-03-29        0
#9      e baseline 2013-06-03        0
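For reference, here are the same steps as one self-contained sketch on the question's original data (without the NA). The rows are reversed on purpose, which is my own addition, to illustrate why the sort matters: x - x[1] only measures time since baseline if baseline is the first row of each subject.

```r
dat <- data.frame(
  subjid = c("a","a","a","b","b","c","c","d","e"),
  type   = c("baseline","first","second","baseline","first",
             "baseline","first","baseline","baseline"),
  date   = c("2013-02-07","2013-02-27","2013-04-30","2013-03-03","2013-05-23",
             "2013-01-02","2013-07-23","2013-03-29","2013-06-03"),
  stringsAsFactors = FALSE
)

# Reverse the rows so that baseline is no longer first within each subject
dat <- dat[rev(seq_len(nrow(dat))), ]

# Sort by subject, then by interview round (baseline < first < second)
dat$type <- ordered(dat$type, levels = c("baseline", "first", "second"))
dat <- dat[order(dat$subjid, dat$type), ]

# Convert to Date and subtract each subject's first (baseline) date
dat$date <- as.Date(dat$date)
dat$elapsed <- ave(as.numeric(dat$date), dat$subjid,
                   FUN = function(x) x - x[1])

dat$elapsed  # days since baseline, in sorted row order
```

After the sort, the elapsed values are 0, 20, 82 for subject a, 0, 81 for b, 0, 202 for c, and 0 for d and e.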

Answer 1: (score: 1)

This doesn't assume that baseline is always in position 1:

dat$date <- as.Date(dat$date)
dat$elapsed <- unlist(by(dat, dat$subjid, FUN=function(x) {
  as.numeric(x$date - x[x$type=="baseline",]$date)
}))
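One thing to keep in mind: by() returns its groups in subjid order, so assigning the unlisted result back lines up only if the rows are already grouped by subjid in that order. A possible variant that does not depend on row order at all is to look up each row's baseline date with match(). This is a sketch, not part of the answer above; the name base is made up for illustration, and it assumes every subject has exactly one baseline row.

```r
# The question's data with rows in arbitrary order
dat <- data.frame(
  subjid = c("b","a","c","a","b","a","c","e","d"),
  type   = c("first","second","baseline","baseline","baseline",
             "first","first","baseline","baseline"),
  date   = c("2013-05-23","2013-04-30","2013-01-02","2013-02-07","2013-03-03",
             "2013-02-27","2013-07-23","2013-06-03","2013-03-29"),
  stringsAsFactors = FALSE
)
dat$date <- as.Date(dat$date)

# One row per subject holding its baseline date
# (assumption: exactly one baseline row per subject)
base <- dat[dat$type == "baseline", ]

# For every row, find the matching subject's baseline date and subtract it
dat$elapsed <- as.numeric(dat$date - base$date[match(dat$subjid, base$subjid)])

dat
```

Because the lookup is done per row, no sorting is needed and the original row order is preserved.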