Replacing values based on the current and previous values of another column

Time: 2018-06-21 17:31:18

Tags: r performance for-loop replace

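Is there a way, without a for loop, to set the value in a "result" column to 0 when, within each ID, the current value of the matching "obs" column is 1 and its previous value is 0? For example, "result1" should become 0 wherever "obs1" is 1 and the previous "obs1" is 0.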

Input data:

df <- data.frame(ID = c(1,1,1,1,1,1,1,1,1,1, 2, 2),
     obs1 = c(1,1,1,1,1,1,0,0,1,1,1,1),
     obs2 = c(1,1,1,0,0,0,1,1,1,0,0,1),
     result1 = c(0,28,63,84,105,135,150,150,150,59, 0,300),
     result2 = c(0,28,63,63,63,63,63,31,59,59,0,0))

Desired output:

df <- data.frame(ID = c(1,1,1,1,1,1,1,1,1,1,2,2),
     obs1 = c(1,1,1,1,1,1,0,0,1,1,1,1),
     obs2 = c(1,1,1,0,0,0,1,1,1,0,0,1),
     result1 = c(0,28,63,84,105,135,150,150,0,59,0,300),
     result2 = c(0,28,63,63,63,63,0,31,59,59,0,0))

Row 7 of the "result2" column and row 9 of the "result1" column change.

1 Answer:

Answer 0 (score: 1)

One option using dplyr:

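library(dplyr)

# Within each ID, zero the result when the matching obs is 1 and the previous obs was 0.
# default = 1 means the first row of a group is never treated as a 0 -> 1 change.
df %>% group_by(ID) %>%
  mutate(result1 = ifelse(obs1 == 1 & lag(obs1, default = 1) == 0, 0, result1)) %>%
  mutate(result2 = ifelse(obs2 == 1 & lag(obs2, default = 1) == 0, 0, result2)) %>%
  as.data.frame()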

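A generic solution, which derives each "obs" column name from the corresponding "result" column name, can be written with mutate_at: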

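# Note: funs() has been deprecated since dplyr 0.8.0; this code reflects the
# dplyr API that was current when the answer was written.
df %>% group_by(ID) %>%
  mutate_at(vars(starts_with("result")),
            funs(ifelse(get(sub("result", "obs", quo_name(quo(.)))) == 1 &
                          lag(get(sub("result", "obs", quo_name(quo(.)))), default = 1) == 0,
                        0, .))) %>%
  as.data.frame()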

#    ID obs1 obs2 result1 result2
# 1   1    1    1       0       0
# 2   1    1    1      28      28
# 3   1    1    1      63      63
# 4   1    1    0      84      63
# 5   1    1    0     105      63
# 6   1    1    0     135      63
# 7   1    0    1     150       0
# 8   1    0    1     150      31
# 9   1    1    1       0      59
# 10  1    1    0      59      59
# 11  2    1    0       0       0
# 12  2    1    1     300       0
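
For readers without dplyr, the same rule can probably be expressed in base R as well. The following is only a minimal sketch, not part of the original answer: the helper name lag_by_group and the variables res_cols, obs_cols, and out are hypothetical, and it assumes the rows are already ordered within each ID and that every "resultN" column has a matching "obsN" column.

```r
# Input data from the question.
df <- data.frame(ID = c(1,1,1,1,1,1,1,1,1,1,2,2),
                 obs1 = c(1,1,1,1,1,1,0,0,1,1,1,1),
                 obs2 = c(1,1,1,0,0,0,1,1,1,0,0,1),
                 result1 = c(0,28,63,84,105,135,150,150,150,59,0,300),
                 result2 = c(0,28,63,63,63,63,63,31,59,59,0,0))

# Hypothetical helper: previous value of x within each group g.
# The first row of a group has no predecessor and receives `default`.
lag_by_group <- function(x, g, default = 1) {
  ave(x, g, FUN = function(v) c(default, v[-length(v)]))
}

# Pair every "resultN" column with its "obsN" column by name.
res_cols <- grep("^result", names(df), value = TRUE)
obs_cols <- sub("^result", "obs", res_cols)

# Zero a result where its obs is 1 and the previous obs in the same ID was 0.
out <- df
out[res_cols] <- Map(function(res, obs) {
  prev <- lag_by_group(obs, df$ID)
  ifelse(obs == 1 & prev == 0, 0, res)
}, df[res_cols], df[obs_cols])

out
```

As in the dplyr version, default = 1 keeps the first row of each ID from being treated as a 0 to 1 change. The only iteration is over column pairs (via Map); the row-wise work stays vectorised.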