Nested if_else() and is.na() logic inconsistent?

Asked: 2019-02-06 11:55:16

Tags: r dplyr

I am trying to use mutate and if_else() to compute the result of the following logical statement from two columns of a data frame:

TRUE if either a or b is "Yes", NA if both a and b are NA, and FALSE otherwise.

library(magrittr)
library(dplyr)

data.frame(
    "a"=c(NA,"No","Yes","Yes","No","No",NA),
    "b"=c(NA,"No","Yes","No","Yes",NA,"No")
) %>% 
mutate(
    logical = if_else(
        a == "Yes" | b == "Yes",
        TRUE,
        if_else(
            is.na(a) & is.na(b),
            NA,
            FALSE
        )
    )
)
#>      a    b logical
#> 1 <NA> <NA>      NA
#> 2   No   No   FALSE
#> 3  Yes  Yes    TRUE
#> 4  Yes   No    TRUE
#> 5   No  Yes    TRUE
#> 6   No <NA>      NA
#> 7 <NA>   No      NA

In the last two rows I get NA instead of the expected FALSE. I expected FALSE there because is.na(a) & is.na(b) should return FALSE, as the example below shows.

# False as expected here
if_else(is.na(NA) & is.na("No"),NA,FALSE)
#> [1] FALSE

Am I missing something about how if_else works?

Created on 2019-02-06 by the reprex package (v0.2.1)

2 Answers:

Answer 0: (score: 3)

You can also do it with case_when():

library(dplyr)

data.frame(
  "a"=c(NA,"No","Yes","Yes","No","No",NA),
  "b"=c(NA,"No","Yes","No","Yes",NA,"No")
) %>%
  mutate(
    logical = case_when(
      a == "Yes" | b == "Yes" ~ TRUE,
      is.na(a) & is.na(b) ~ NA,
      TRUE ~ FALSE
    )
  )

Output:

     a    b logical
1 <NA> <NA>      NA
2   No   No   FALSE
3  Yes  Yes    TRUE
4  Yes   No    TRUE
5   No  Yes    TRUE
6   No <NA>   FALSE
7 <NA>   No   FALSE
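
A likely reason this version behaves differently is that case_when() treats an NA condition as "not matched" and moves on to the next clause, whereas if_else() returns NA wherever its condition is NA. The following is a minimal sketch of that difference; the vector `cond` and the labels "matched" and "fallback" are made up for illustration.

```r
library(dplyr)

# A condition vector containing NA, like the one produced by comparing NA with "Yes"
cond <- c(NA, TRUE, FALSE)

# case_when() skips a clause whose condition is NA and tries the next clause
res_case_when <- case_when(
  cond ~ "matched",
  TRUE ~ "fallback"
)

# if_else() returns NA wherever its condition is NA
res_if_else <- if_else(cond, "matched", "fallback")

print(res_case_when)
print(res_if_else)
```

In the first element, case_when() falls through to the `TRUE ~` clause, while if_else() yields NA.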

Answer 1: (score: 0)

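We need to add a check to the condition of the first if_else to handle NA elements. Otherwise, comparing an NA element with == returns NA, and if_else returns NA for that row without ever reaching the nested if_else.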

df1 %>% 
   mutate(logical = if_else((a == "Yes" & !is.na(a)) |
            (b == "Yes" & !is.na(b)), TRUE, 
      if_else(is.na(a) & is.na(b), NA, FALSE )))
#     a    b logical
#1 <NA> <NA>      NA
#2   No   No   FALSE
#3  Yes  Yes    TRUE
#4  Yes   No    TRUE
#5   No  Yes    TRUE
#6   No <NA>   FALSE
#7 <NA>   No   FALSE
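
The NA propagation that causes the original problem can be checked on scalars. This is a small sketch using base R only; the variable names are illustrative and correspond to rows 6 and 7 of the question's data.

```r
# Comparing NA with anything yields NA
na_eq <- NA == "Yes"

# Row 6 of the question: a is "No", b is NA
# FALSE | NA is NA, because the unknown value could still be TRUE
row6 <- ("No" == "Yes") | (NA == "Yes")

# TRUE | NA is TRUE, which is why rows containing a "Yes" were never affected
row_yes <- ("Yes" == "Yes") | (NA == "Yes")

# Adding the !is.na() guard turns the NA into FALSE: NA & FALSE is FALSE
guarded <- (NA == "Yes") & !is.na(NA)

print(c(na_eq = na_eq, row6 = row6, row_yes = row_yes, guarded = guarded))
```

Since the outer condition is NA for rows 6 and 7, the outer if_else returns NA for them regardless of what the nested if_else would produce.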

Note: this addresses the root cause of the OP's problem rather than working around it.


Alternatively, we can replace == with %in%, which never returns NA, so the NA problem goes away

df1 %>%
   mutate(logical = if_else(a %in% "Yes" | b %in% "Yes", TRUE, 
                    if_else(is.na(a) & is.na(b), NA, FALSE)))
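
As a quick illustration of why the substitution works, the sketch below compares the two operators on a small made-up vector `x`.

```r
x <- c(NA, "No", "Yes")

# == propagates NA
eq_result <- x == "Yes"

# %in% is built on match() and returns FALSE for NA when NA is not in the table
in_result <- x %in% "Yes"

print(eq_result)
print(in_result)
```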

Or using base R

# TRUE where any column is "Yes", then set rows where both columns are NA to NA
replace(rowSums(df1 == "Yes", na.rm = TRUE) > 0, rowSums(is.na(df1)) == 2, NA)
#[1]    NA FALSE  TRUE  TRUE  TRUE FALSE FALSE
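
The one-liner may be easier to follow when split into steps. This sketch repeats the data definition so it runs on its own; the intermediate names `any_yes`, `both_na` and `result` are introduced here for illustration only.

```r
df1 <- data.frame(
  a = c(NA, "No", "Yes", "Yes", "No", "No", NA),
  b = c(NA, "No", "Yes", "No", "Yes", NA, "No")
)

# Count "Yes" values per row, ignoring NA, and flag rows with at least one
any_yes <- rowSums(df1 == "Yes", na.rm = TRUE) > 0

# Flag rows where both columns are NA
both_na <- rowSums(is.na(df1)) == 2

# Overwrite the flagged rows with NA, leaving the others as TRUE/FALSE
result <- replace(any_yes, both_na, NA)

print(result)
```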

Data

df1 <- data.frame(
 "a"=c(NA,"No","Yes","Yes","No","No",NA),
 "b"=c(NA,"No","Yes","No","Yes",NA,"No")
   )