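Recoding on multiple conditions with dplyr

Time: 2018-04-02 23:17:56

Tags: r dplyr recode

In dplyr, I want to recode a variable to missing whenever it takes one of three values. Consider the following data frame, have: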

id  married hrs_workperwk
1   1       40
2   1       55
3   1       70
4   0       -1
5   1       99
6   0       -2
7   0       10
8   0       40
9   1       45

-1, -2, and 99 are invalid values. The new data frame, want, should look like this:

id  married hrs_workperwk
1   1       40
2   1       55
3   1       70
4   0       NA
5   1       NA
6   0       NA
7   0       10
8   0       40
9   1       45

I could solve this quickly in base R, but dplyr is usually convenient when I am already inside mutate(). Alas, that means I currently use several nested if_else() calls:

want <- mutate(have, 
hrs_workperwk = if_else(hrs_workperwk < 0, as.numeric(NA), 
                if_else(hrs_workperwk == 99, as.numeric(NA), hrs_workperwk)))

Is there a way to do this with a single if_else() call? Ideally something like this:

want <- mutate(have, 
hrs_workperwk = if_else(hrs_workperwk = c(-2, -1, 99), as.numeric(NA), hrs_workperwk))

3 answers:

Answer 0 (score: 2)

You can use %in%:

want <- have %>% 
  mutate(hrs_workperwk = ifelse(hrs_workperwk %in% c(-1, -2, 99), NA, hrs_workperwk))
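Since the question asks for a single if_else(), the same %in% test can be dropped into it. The following is a minimal sketch, assuming hrs_workperwk is stored as integer (which is what read.table produces for this data), so the missing value is written as NA_integer_ to satisfy if_else()'s type check. The data frame is rebuilt inline only to keep the example self-contained.

```r
library(dplyr)

# Rebuild the question's data; the L suffix makes the columns integer.
have <- data.frame(
  id = 1:9,
  married = c(1L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 1L),
  hrs_workperwk = c(40L, 55L, 70L, -1L, 99L, -2L, 10L, 40L, 45L)
)

# One if_else(): %in% is TRUE for every row whose value is invalid.
want <- have %>%
  mutate(hrs_workperwk = if_else(hrs_workperwk %in% c(-1L, -2L, 99L),
                                 NA_integer_, hrs_workperwk))
```

Unlike ==, %in% never returns NA itself, so rows that are already missing simply stay missing.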

Answer 1 (score: 2)

We can use replace:

df %>%
  mutate(hrs_workperwk = replace(hrs_workperwk, hrs_workperwk %in% c(-1, -2, 99), NA))
#  id married hrs_workperwk
#1  1       1            40
#2  2       1            55
#3  3       1            70
#4  4       0            NA
#5  5       1            NA
#6  6       0            NA
#7  7       0            10
#8  8       0            40
#9  9       1            45

Another option is case_when:

df %>%
   mutate(hrs_workperwk = case_when(hrs_workperwk %in% c(-1, -2, 99)~ NA_integer_,
                      TRUE ~ hrs_workperwk))
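The NA_integer_ in the case_when() call above fits an integer column. Older dplyr releases were strict about every right-hand side having the same type, so a double column would need NA_real_ instead. A small sketch of that variant follows; df_dbl and its fractional hours are hypothetical and not part of the question's data.

```r
library(dplyr)

# Hypothetical double-typed column (e.g. hours recorded with fractions).
df_dbl <- data.frame(id = 1:4, hrs_workperwk = c(37.5, -1, 99, 40))

out <- df_dbl %>%
  mutate(hrs_workperwk = case_when(
    hrs_workperwk %in% c(-1, -2, 99) ~ NA_real_,  # typed NA matching a double column
    TRUE ~ hrs_workperwk                          # keep every other value unchanged
  ))
```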

Answer 2 (score: 1)

In base R, either of the following works:

df1$hrs_workperwk[df1$hrs_workperwk %in% c(-1,-2,99)] <- NA

is.na(df1$hrs_workperwk) <- df1$hrs_workperwk %in% c(-1,-2,99)

Output in both cases:

#   id married hrs_workperwk
# 1  1       1            40
# 2  2       1            55
# 3  3       1            70
# 4  4       0            NA
# 5  5       1            NA
# 6  6       0            NA
# 7  7       0            10
# 8  8       0            40
# 9  9       1            45
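The two base R lines are alternatives, not steps to run in sequence: the first assigns NA through a logical subset, while the second uses the is.na<- replacement function. A small check on a plain vector, where x is only a stand-in for df1$hrs_workperwk, suggests that both routes end in the same result:

```r
# x stands in for df1$hrs_workperwk (values copied from the question).
x <- c(40L, 55L, 70L, -1L, 99L, -2L, 10L, 40L, 45L)
bad <- c(-1L, -2L, 99L)

# Option 1: assign NA to the elements selected by the logical index.
a <- x
a[a %in% bad] <- NA

# Option 2: is.na<- marks the elements where the right-hand side is TRUE.
b <- x
is.na(b) <- b %in% bad

# Compare the two results element by element, including type.
same <- identical(a, b)
```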

Data

df1 <- read.table(text="
id  married hrs_workperwk
1   1       40
2   1       55
3   1       70
4   0       -1
5   1       99
6   0       -2
7   0       10
8   0       40
9   1       45", header = TRUE, stringsAsFactors = FALSE)