Question

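I have a df of patient IDs and a binary indicator of whether each patient has undergone an intervention. I want to create a new column called "time_post" that tells me how many time points have passed since the intervention.

Here is my df: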

names<-c("tom","tom","tom","tom","tom","tom","tom","tom", "john", "john","john", "john","john", "john","john", "john")
post<-c(0,0,0,1,1,1,1,1,0,1,1,1,1,1,1,1)
df<-data.frame(names,post)

This is what I tried:

df$time_post<-ifelse(df$post==1[1],1,0)   ##this tries to assign 1 to "time_post" for first value of 1 seen in post

df$time_post<-ifelse(df$post==1[2],2,df$time_post)  ##trying to apply same logic here, but doesn't work. Introduces NAs into time_post column.

This is the output I want:

   names post time_post
1    tom    0         0
2    tom    0         0
3    tom    0         0
4    tom    1         1
5    tom    1         2
6    tom    1         3
7    tom    1         4
8    tom    1         5
9   john    0         0
10  john    1         1
11  john    1         2
12  john    1         3
13  john    1         4
14  john    1         5
15  john    1         6
16  john    1         7

Thanks in advance.

Answer 1

Try this:

df<-data.frame(names=c("tom","tom","tom","tom","tom","tom","tom","tom",
                       "john", "john","john", "john","john", "john","john", "john"),
               post=c(0,0,0,1,1,1,1,1,0,1,1,1,1,1,1,1))
df$time_post <- with(df, ave(post,names,FUN=cumsum))

This gives you:

> df
   names post time_post
1    tom    0         0
2    tom    0         0
3    tom    0         0
4    tom    1         1
5    tom    1         2
6    tom    1         3
7    tom    1         4
8    tom    1         5
9   john    0         0
10  john    1         1
11  john    1         2
12  john    1         3
13  john    1         4
14  john    1         5
15  john    1         6
16  john    1         7
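
Why this works: ave() splits post by names and applies cumsum within each patient. Because post is 0 before the intervention and 1 afterwards, the running sum is the number of time points since the intervention. This relies on two assumptions that the question implies but does not state: rows are in time order within each patient, and post never goes back to 0. If the second assumption might not hold, a possible variant is sketched below. The helper name time_since_first and the column time_post2 are made up for illustration.

```r
# Same data as in the question.
df <- data.frame(
  names = rep(c("tom", "john"), each = 8),
  post  = c(0,0,0,1,1,1,1,1, 0,1,1,1,1,1,1,1)
)

# Running sum of post within each patient (the approach above).
df$time_post <- with(df, ave(post, names, FUN = cumsum))

# Hypothetical helper: count every time point from the first 1 onwards,
# even if post drops back to 0 later. cumsum(x) > 0 is TRUE from the
# first 1 on, and the outer cumsum counts those TRUE values.
time_since_first <- function(x) cumsum(cumsum(x) > 0)

df$time_post2 <- with(df, ave(post, names, FUN = time_since_first))
```

For the data in the question the two columns are identical. They differ only when a patient has a 0 after their first 1.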
