How do I use mutate_all with recode correctly in dplyr?

Asked: 2017-02-22 15:20:40

Tags: r dplyr recode

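I have been trying to use dplyr's recode together with mutate_all on every variable in a data set, but it does not produce the expected output. The other answers I found do not address this problem (for example, "Recode and Mutate_all in dplyr").

Here is what I tried: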

library(tidyverse)
library(car)

# Create sample data
df <- data_frame(a = c("Yes","Maybe","No","Yes"), b = c("No","Maybe","Yes","Yes"))

# Using dplyr::recode
df %>% mutate_all(funs(recode(., `1` = "Yes", `0` = "No", `NA` = "Maybe")))

This has no effect on the values:

# A tibble: 4 × 2
      a     b
  <chr> <chr>
1   Yes    No
2 Maybe Maybe
3    No   Yes
4   Yes   Yes

What I want can be reproduced with car::Recode:

# Using car::Recode
df %>% mutate_all(funs(Recode(., "'Yes' = 1; 'No' = 0; 'Maybe' = NA")))

This is the desired result:

# A tibble: 4 × 2
      a     b
  <dbl> <dbl>
1     1     0
2    NA    NA
3     0     1
4     1     1

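1 answer:

Answer 0 (score: 5)

You have the key/value pairs reversed in dplyr::recode: the argument name is the old value and the argument value is its replacement. This works for me: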

df %>% mutate_all(funs(recode(., Yes = 1L, No = 0L, Maybe = NA_integer_)))

# A tibble: 4 × 2
      a     b
  <int> <int>
1     1     0
2    NA    NA
3     0     1
4     1     1
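
For readers on dplyr 1.0.0 or later, where mutate_all() is superseded and funs() is deprecated, the same recoding can be sketched with across(). This is only an assumed modern equivalent, not part of the original answer; `out` is a name introduced here for illustration, and tibble() stands in for the older data_frame().

```r
library(dplyr)

# Same sample data as in the question, built with tibble()
df <- tibble(a = c("Yes", "Maybe", "No", "Yes"),
             b = c("No", "Maybe", "Yes", "Yes"))

# Apply the recoding to every column; names are old values,
# values are the integer replacements
out <- df %>%
  mutate(across(everything(),
                ~ recode(.x, Yes = 1L, No = 0L, Maybe = NA_integer_)))

out
```

Because every replacement is an integer, both columns come back as integer columns.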

Note that an error is raised if you do not specify the type of NA (here NA_integer_, to match the integer replacements).
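
A minimal sketch of the points above on a plain character vector may make the argument order easier to see. The vector `x` and the result name `y` are made up for illustration; the recode() call itself is the one from the answer.

```r
library(dplyr)

# Hypothetical example vector (not from the question)
x <- c("Yes", "Maybe", "No", "Yes")

# Argument name = old value, argument value = replacement.
# All replacements share one type (integer), so the missing
# value is written as NA_integer_ rather than a bare NA.
y <- recode(x, Yes = 1L, No = 0L, Maybe = NA_integer_)

y
typeof(y)
```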

You can also write the old values quoted or unquoted (for example, both Yes and 'Yes' work).
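
As a quick check of that last point, the sketch below compares the two spellings on a made-up vector; `x`, `unquoted`, and `quoted` are names introduced here for illustration only.

```r
library(dplyr)

# Hypothetical example vector (not from the question)
x <- c("Yes", "Maybe", "No", "Yes")

# Old values given as bare names
unquoted <- recode(x, Yes = 1L, No = 0L, Maybe = NA_integer_)

# Old values given as quoted names
quoted <- recode(x, "Yes" = 1L, "No" = 0L, "Maybe" = NA_integer_)

# Both spellings should give the same result
identical(unquoted, quoted)
```

Quoting (or backticks) becomes necessary only when the old value is not a syntactically valid R name, such as a value containing spaces.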