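I have a df of patient IDs and a binary indicator of whether they have experienced an intervention. I want to create a new column called "time_post" that tells me how many time points have passed since they experienced the intervention.
Here is my df: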
names<-c("tom","tom","tom","tom","tom","tom","tom","tom", "john", "john","john", "john","john", "john","john", "john")
post<-c(0,0,0,1,1,1,1,1,0,1,1,1,1,1,1,1) ##c() builds the full vector; as.numeric(0,0,...) would keep only the first 0
df<-data.frame(names,post)
This is what I tried:
df$time_post<-ifelse(df$post==1[1],1,0) ##this tries to assign 1 to "time_post" for first value of 1 seen in post
df$time_post<-ifelse(df$post==1[2],2,df$time_post) ##trying to apply same logic here, but doesn't work. Introduces NAs into time_post column.
This is my desired output:
   names post time_post
1    tom    0         0
2    tom    0         0
3    tom    0         0
4    tom    1         1
5    tom    1         2
6    tom    1         3
7    tom    1         4
8    tom    1         5
9   john    0         0
10  john    1         1
11  john    1         2
12  john    1         3
13  john    1         4
14  john    1         5
15  john    1         6
16  john    1         7
Thanks in advance.
Answer 0 (score: 2)
Try this:
df<-data.frame(names=c("tom","tom","tom","tom","tom","tom","tom","tom",
"john", "john","john", "john","john", "john","john", "john"),
post=c(0,0,0,1,1,1,1,1,0,1,1,1,1,1,1,1))
df$time_post <- with(df, ave(post,names,FUN=cumsum))
This gives you:
> df
   names post time_post
1    tom    0         0
2    tom    0         0
3    tom    0         0
4    tom    1         1
5    tom    1         2
6    tom    1         3
7    tom    1         4
8    tom    1         5
9   john    0         0
10  john    1         1
11  john    1         2
12  john    1         3
13  john    1         4
14  john    1         5
15  john    1         6
16  john    1         7
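One caveat worth noting: cumsum counts the rows where post is 1, so it equals "time points since the intervention" only if post stays at 1 once the intervention has started, as it does in the example data. The sketch below rebuilds the same data and adds a variant for the case where the indicator might drop back to 0. The helper since_first, the column time_post_alt, and the vector gap are hypothetical names introduced here for illustration and are not part of the original question.

```r
# Example data from the question
df <- data.frame(
  names = rep(c("tom", "john"), each = 8),
  post  = c(0,0,0,1,1,1,1,1, 0,1,1,1,1,1,1,1)
)

# Per-patient running count of rows with post == 1 (the approach above)
df$time_post <- ave(df$post, df$names, FUN = cumsum)

# Hypothetical variant: cummax() holds the indicator at 1 after the first 1,
# so cumsum() then counts every time point from the first intervention onward
since_first <- function(x) cumsum(cummax(x))
df$time_post_alt <- ave(df$post, df$names, FUN = since_first)

# Made-up indicator that drops back to 0 after the intervention
gap <- c(0, 1, 0, 1)
cumsum(gap)       # 0 1 1 2
since_first(gap)  # 0 1 2 3
```

On the data in the question both columns are identical; they differ only when a patient's post value returns to 0 after the first 1. Both versions assume the rows are already in time order within each patient.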