How to conditionally mutate multiple columns using `contains` and `ifelse`?

Asked: 2019-07-15 16:25:57

Tags: r function tidyverse

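I want to mutate multiple columns whose names contain the string "account". Specifically, I want these columns to take `NA` when a certain condition is met and to keep another value when it is not. Below is my attempt, inspired by here and here. So far it has not worked. I am still trying, but any help would be much appreciated.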

My data

df<-as.data.frame(structure(list(low_account = c(1, 1, 0.5, 0.5, 0.5, 0.5), high_account = c(16, 
16, 56, 56, 56, 56), mid_account_0 = c(8.5, 8.5, 28.25, 28.25, 
28.25, 28.25), mean_account_0 = c(31.174, 30.1922101449275, 30.1922101449275, 
33.3055555555556, 31.174, 33.3055555555556), median_account_0 = c(2.1, 
3.8, 24.2, 24.2, 24.2, 24.2), low_account.1 = c(1, 1, 0.5, 0.5, 0.5, 
0.5), high_account.1 = c(16, 16, 56, 56, 56, 56), row.names = c("A001", "A002", "A003", "A004", "A005", "A006"))))

df
  low_account high_account mid_account_0 mean_account_0 median_account_0 low_account.1 high_account.1 row.names
1         1.0           16          8.50       31.17400              2.1           1.0             16      A001
2         1.0           16          8.50       30.19221              3.8           1.0             16      A002
3         0.5           56         28.25       30.19221             24.2           0.5             56      A003
4         0.5           56         28.25       33.30556             24.2           0.5             56      A004
5         0.5           56         28.25       31.17400             24.2           0.5             56      A005
6         0.5           56         28.25       33.30556             24.2           0.5             56      A006

My attempt

sample_data<-df%>% mutate_at(select(contains("account") , ifelse(. <= df$low_account & . >= df$high_account, NA, .)))
  

Error: No tidyselect variables were registered. Call `rlang::last_error()` to see a backtrace.

Expected output

df
    low_account high_account mid_account_0 mean_account_0 median_account_0 low_account.1 high_account.1 row.names
    1         1.0           16          8.50       NA                    2.1           1.0             16      A001
    2         1.0           16          8.50       NA                    3.8           1.0             16      A002
    3         0.5           56         28.25       30.19221             24.2           0.5             56      A003
    4         0.5           56         28.25       33.30556             24.2           0.5             56      A004
    5         0.5           56         28.25       31.17400             24.2           0.5             56      A005
    6         0.5           56         28.25       33.30556             24.2           0.5             56      A006

1 answer:

Answer 0 (score: 2)

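The problem with `vars(contains('account'))` is that it matches every column whose name contains the substring "account". When the logical comparison runs, the `low_account` column itself is converted to NA, because it is always less than or equal to `low_account`, which leaves only NA to compare the remaining columns against. Instead, we can select just the "mid", "median", and "mean" columns of interest and then apply `replace`: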

library(tidyverse)
df %>% 
   # select only the mid/mean/median columns, so low_account and high_account stay intact as bounds
   mutate_at(vars(matches("(mid|mean|median)_account")),
           ~ replace(., . <= low_account | . >= high_account, NA))
# low_account high_account mid_account_0 mean_account_0 median_account_0 low_account.1 high_account.1 row.names
#1         1.0           16          8.50             NA              2.1           1.0             16      A001
#2         1.0           16          8.50             NA              3.8           1.0             16      A002
#3         0.5           56         28.25       30.19221             24.2           0.5             56      A003
#4         0.5           56         28.25       33.30556             24.2           0.5             56      A004
#5         0.5           56         28.25       31.17400             24.2           0.5             56      A005
#6         0.5           56         28.25       33.30556             24.2           0.5             56      A006
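
In dplyr 1.0.0 and later, `mutate_at()` is superseded by `across()`, so the same replacement can also be sketched as below. This is only an illustration, not part of the original answer: `df_small` and `result` are hypothetical names, and `df_small` is a two-row stand-in built from values in the question's `df`. The `^` anchor in the pattern is an added assumption that the prefix starts the column name.

```r
library(dplyr)

# Hypothetical two-row stand-in for the question's df (values copied from rows 1 and 3)
df_small <- data.frame(
  low_account    = c(1, 0.5),
  high_account   = c(16, 56),
  mid_account_0  = c(8.5, 28.25),
  mean_account_0 = c(31.174, 30.1922101449275)
)

result <- df_small %>%
  mutate(across(
    # touch only the mid/mean/median columns; the bound columns are left alone
    matches("^(mid|mean|median)_account"),
    # set a value to NA when it falls on or outside [low_account, high_account]
    ~ replace(.x, .x <= low_account | .x >= high_account, NA)
  ))

result
```

Because `low_account` and `high_account` are not in the selection, they are never overwritten and remain usable as bounds for every selected column.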