dplyr: replace NA values in a column based on multiple conditions

Time: 2018-05-20 15:12:39

Tags: r dplyr

I have data with two NA values in the Occupation column, and I am trying to use dplyr to replace them with the word Pensioner.

This is what I have so far.

data <- data %>% 
  filter(is.na(Occupation) & Yrs_Empleo <= -999 & Organisation == "XNA" & Income_type == "Pensioner")

I have tried mutate_at, replace_na, and a few ifelse statements, but I can't figure out how to do this correctly.

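So basically I am trying to replace all NA values in the Occupation column based on three conditions: wherever those three conditions are met, the NA should be replaced with the word retired.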

structure(list(Yrs_Empleo = c(1.74520547945205, 3.25479452054795, 
0.616438356164384, 8.32602739726027, 8.32328767123288, 4.35068493150685, 
8.57534246575342, 1.23013698630137, -1000.66575342466, 5.53150684931507, 
1.86027397260274, -1000.66575342466, 7.44383561643836), Occupation = c("Laborers", 
"Core staff", "Laborers", "Laborers", "Core staff", "Laborers", 
"Accountants", "Managers", NA, "Laborers", "Core staff", NA, 
"Laborers"), Organisation = c("Business Entity Type 3", "School", 
"Government", "Business Entity Type 3", "Religion", "Other", 
"Business Entity Type 3", "Other", "XNA", "Electricity", "Medicine", 
"XNA", "Business Entity Type 2"), Income_type = c("Working", 
"State servant", "Working", "Working", "Working", "State servant", 
"Commercial associate", "State servant", "Pensioner", "Working", 
"Working", "Pensioner", "Working")), .Names = c("Yrs_Empleo", 
"Occupation", "Organisation", "Income_type"), row.names = c(NA, 
13L), class = "data.frame")

2 Answers:

Answer 0: (score: 2)

We can use if_else

data %>%
  mutate(Occupation = if_else(is.na(Occupation) & 
                         Yrs_Empleo <= -999 &
                         Organisation == "XNA", "Pensioner", Occupation))
#    Yrs_Empleo  Occupation           Organisation          Income_type
#1      1.7452055    Laborers Business Entity Type 3              Working
#2      3.2547945  Core staff                 School        State servant
#3      0.6164384    Laborers             Government              Working
#4      8.3260274    Laborers Business Entity Type 3              Working
#5      8.3232877  Core staff               Religion              Working
#6      4.3506849    Laborers                  Other        State servant
#7      8.5753425 Accountants Business Entity Type 3 Commercial associate
#8      1.2301370    Managers                  Other        State servant
#9  -1000.6657534   Pensioner                    XNA            Pensioner
#10     5.5315068    Laborers            Electricity              Working
#11     1.8602740  Core staff               Medicine              Working
#12 -1000.6657534   Pensioner                    XNA            Pensioner
#13     7.4438356    Laborers Business Entity Type 2              Working

Or use replace

data %>% 
   mutate(Occupation = replace(Occupation, 
             is.na(Occupation) & 
                         Yrs_Empleo <= -999 &
                         Organisation == "XNA",
               "Pensioner"))
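For anyone who wants to try both variants side by side, here is a minimal self-contained sketch. The data frame `small` is a hypothetical four-row stand-in for the question's data (values shortened for illustration), and the names `with_if_else` and `with_replace` are made up for this example.

```r
library(dplyr)

# Hypothetical four-row stand-in for the question's data (illustration only)
small <- data.frame(
  Yrs_Empleo   = c(1.75, -1000.67, 5.53, -1000.67),
  Occupation   = c("Laborers", NA, "Core staff", NA),
  Organisation = c("School", "XNA", "Medicine", "XNA"),
  stringsAsFactors = FALSE
)

# Variant 1: if_else() takes the replacement where the condition is TRUE
# and keeps the existing Occupation everywhere else
with_if_else <- small %>%
  mutate(Occupation = if_else(is.na(Occupation) &
                                Yrs_Empleo <= -999 &
                                Organisation == "XNA",
                              "Pensioner", Occupation))

# Variant 2: replace() overwrites only the positions where the condition is TRUE
with_replace <- small %>%
  mutate(Occupation = replace(Occupation,
                              is.na(Occupation) &
                                Yrs_Empleo <= -999 &
                                Organisation == "XNA",
                              "Pensioner"))

# Compare the two Occupation columns
print(identical(with_if_else$Occupation, with_replace$Occupation))
```

Note that if_else is strict about types: both branches have to be compatible, so this sketch assumes Occupation is a character column (as in the dput output above) rather than a factor.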

Answer 1: (score: 1)

You can use case_when like this

data %>% 
  mutate(Occupation = case_when(is.na(Occupation) & Yrs_Empleo <= -999 & Organisation == "XNA" & Income_type == "Pensioner" ~ "retired",
                                TRUE ~ Occupation))

      Yrs_Empleo  Occupation           Organisation          Income_type
1      1.7452055    Laborers Business Entity Type 3              Working
2      3.2547945  Core staff                 School        State servant
3      0.6164384    Laborers             Government              Working
4      8.3260274    Laborers Business Entity Type 3              Working
5      8.3232877  Core staff               Religion              Working
6      4.3506849    Laborers                  Other        State servant
7      8.5753425 Accountants Business Entity Type 3 Commercial associate
8      1.2301370    Managers                  Other        State servant
9  -1000.6657534     retired                    XNA            Pensioner
10     5.5315068    Laborers            Electricity              Working
11     1.8602740  Core staff               Medicine              Working
12 -1000.6657534     retired                    XNA            Pensioner
13     7.4438356    Laborers Business Entity Type 2              Working
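
One thing that may be worth checking is what happens to an NA row that does not meet every condition. The sketch below uses a hypothetical data frame `small` (not the question's data) whose last row has a missing Occupation but an Income_type of "Working"; the name `result` is likewise only for illustration.

```r
library(dplyr)

# Hypothetical data: rows 2 and 3 both have NA Occupation,
# but only row 2 satisfies all four conditions
small <- data.frame(
  Yrs_Empleo   = c(1.75, -1000.67, -1000.67),
  Occupation   = c("Laborers", NA, NA),
  Organisation = c("School", "XNA", "XNA"),
  Income_type  = c("Working", "Pensioner", "Working"),
  stringsAsFactors = FALSE
)

result <- small %>%
  mutate(Occupation = case_when(
    is.na(Occupation) & Yrs_Empleo <= -999 &
      Organisation == "XNA" & Income_type == "Pensioner" ~ "retired",
    # Fallback: every other row keeps its current value, including NA
    TRUE ~ Occupation
  ))

print(result)
```

Rows that fail any one of the conditions fall through to the TRUE branch and keep their existing value, so an unmatched NA stays NA. In dplyr 1.1.0 and later the fallback can also be written as `.default = Occupation` instead of `TRUE ~ Occupation`.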