Question

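I am new to R and trying to clean some data. I am using case_when to assign "Yes", "No" and "Unknown" to a variable. I want to reassign that same variable to "No" or "Unknown" when it was assigned "Yes" in the first statement, depending on whether the other statements are true or false.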

This is what I have:

    ID col1  col2  
    1   Ball  a  
    2   NA    c   
    3   Bat   b

This is what I am trying to achieve:

    ID col1  col2  x
    1   Ball  a   No
    2   NA    c   Yes
    3   Bat   b   Unknown

    mutate(x = case_when(
        is.na(col1) == TRUE ~ "Yes",
        !is.na(col1) == TRUE & (col2 %in% c("a", "b")|
        (col2 == "YES" & x == "Unknown" ) == TRUE ) ~ "No"),
        TRUE ~ "Unknown"
    ))

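Basically, I want to take the value of x produced by the first case_when condition and use it in the second condition. I want my column x to be "Yes" if col1 is NA. And if col1 is not missing and (col2 %in% c("a", "b") or col1 == "Bat" and x = "Yes"), then set x = "No".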

Is there a way to make this work? Any help is appreciated.

Answer 1

https://dplyr.tidyverse.org/reference/case_when.html

case_when lets you list a sequence of tests and assigns the value associated with the first test that passes (i.e. evaluates to TRUE).
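As a minimal sketch of that first-match behaviour: the vector `v` and the labels below are made up purely for illustration and are not from the question. Each element gets the value of the first test it passes, so a later test that overlaps an earlier one is never reached.

```r
library(dplyr)

# Hypothetical values, chosen only to show that the first passing test wins.
v <- c(1, 5, 10, NA)

labels <- case_when(
  is.na(v) ~ "missing",  # tested first, so NA never reaches the comparisons below
  v > 3    ~ "big",      # 5 and 10 both stop here
  v > 8    ~ "huge",     # never used: anything > 8 has already matched v > 3
  TRUE     ~ "small"     # fallback for everything that matched nothing above
)

print(labels)
# "small" "big" "big" "missing"
```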

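In most cases you can get what you want simply by ordering the tests carefully. In this question, your instructions, comments, and output table seem to be inconsistent with one another, which makes it hard to answer. Here I will use the text of your latest edit as the basis for the logic: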

    library(dplyr)

    # Sample data from the question
    df <- data.frame(
      ID   = 1:3,
      col1 = c("Ball", NA, "Bat"),
      col2 = c("a", "c", "b")
    )

    df %>%
      mutate(x = case_when(
        # First, test if col1 is NA -- if so, x will be "Yes" and we are done with the case_when.
        is.na(col1)  ~ "Yes",

        # For the second test, I'll rely on the text of your latest edit:
        #   "And if col1 is not missing and (col2 %in% c("a", "b") or
        #    col1 == "Bat" and x = "Yes") then set x = "No""
        # (Note, this doesn't seem to be consistent with your output table...)

        # To get here means the prior test was false: col1 must have a non-NA value.
        col2 %in% c("a", "b") | col1 == "Bat"  ~ "No",

        # Otherwise, set to unknown
        TRUE  ~ "Unknown"
      ))

Result:

      ID col1 col2   x
    1  1 Ball    a  No
    2  2 <NA>    c Yes
    3  3  Bat    b  No
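
The approach above relies on test order alone, because a single case_when cannot read the x it is still in the middle of creating. If a later rule really does need to look at the x produced by an earlier rule, one option is to chain two mutate() calls so that x already exists as a column by the second step. The following is only a sketch: the rule in step 2 (`x == "Unknown" & col2 == "a"`) is a hypothetical stand-in, since the intended rule in the question is ambiguous.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  ID   = 1:3,
  col1 = c("Ball", NA, "Bat"),
  col2 = c("a", "c", "b")
)

result <- df %>%
  # Step 1: first pass. After this call, x is an ordinary column.
  mutate(x = case_when(
    is.na(col1) ~ "Yes",
    TRUE        ~ "Unknown"
  )) %>%
  # Step 2: this case_when can read the x computed in step 1.
  mutate(x = case_when(
    x == "Unknown" & col2 == "a" ~ "No",  # hypothetical rule, not from the question
    TRUE                         ~ x      # otherwise keep the value from step 1
  ))

print(result)
```

With the sample data this gives x = "No", "Yes", "Unknown" for IDs 1 to 3, which happens to match the table in the question, but whether this is the rule the asker had in mind is an assumption.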

Use the result of case_when from the first line to evaluate the second condition
