所以我的数据看起来像这样
dat<-data.frame(
subjid=c("a","a","a","b","b","c","c","d","e"),
type=c("baseline","first","second","baseline","first","baseline","first","baseline","baseline"),
date=c("2013-02-07","2013-02-27","2013-04-30","2013-03-03","2013-05-23","2013-01-02","2013-07-23","2013-03-29","2013-06-03"))
即)
subjid type date
1 a baseline 2013-02-07
2 a first 2013-02-27
3 a second 2013-04-30
4 b baseline 2013-03-03
5 b first 2013-05-23
6 c baseline 2013-01-02
7 c first 2013-07-23
8 d baseline 2013-03-29
9 e baseline 2013-06-03
我试图制作变量&#34;经过时间&#34;表示从基线日期到第一轮和第二轮面试日期所经过的时间(因此基线的经过时间= 0)。请注意,无论是否进行了进一步的采访,它都会有所不同。
我试图重塑数据,以便我可以减去每个日期,但我的大脑今天没有真正起作用 - 还是有另一种方式?
请帮助,谢谢。
答案 0 :(得分:1)
为ave
:
我会在那里抛出NA
值,只是为了好的措施:
dat<-data.frame(
subjid=c("a","a","a","b","b","c","c","d","e"),
type=c("baseline","first","second","baseline","first","baseline","first","baseline","baseline"),
date=c("2013-02-07","NA","2013-04-30","2013-03-03","2013-05-23","2013-01-02","2013-07-23","2013-03-29","2013-06-03"))
您应该将数据排序为安全的一面:
dat$type <- ordered(dat$type,levels=c("baseline","first","second","third") )
dat <- dat[order(dat$subjid,dat$type),]
将您的日期转换为正确的Date
对象:
dat$date <- as.Date(dat$date)
然后计算差异:
dat$elapsed <- ave(as.numeric(dat$date),dat$subjid,FUN=function(x) x-x[1] )
# subjid type date elapsed
#1 a baseline 2013-02-07 0
#2 a first <NA> NA
#3 a second 2013-04-30 82
#4 b baseline 2013-03-03 0
#5 b first 2013-05-23 81
#6 c baseline 2013-01-02 0
#7 c first 2013-07-23 202
#8 d baseline 2013-03-29 0
#9 e baseline 2013-06-03 0
答案 1 :(得分:1)
这并不假设baseline
总是处于1
位置:
dat$date <- as.Date(dat$date)
dat$elapesed <- unlist(by(dat, dat$subjid, FUN=function(x) {
as.numeric(x$date - x[x$type=="baseline",]$date)
}))