I have been using base R, but I would like to switch to dplyr. This is what I have been doing:
data$newvariable <- 0
data$newvariable[data$oldvariable=="happy"] <- "good"
data$newvariable[data$oldvariable=="unhappy"] <- "bad"
data$newvariable[data$oldvariable=="depressed"] <- "super_bad"
Answer 0 (score: 1)
If oldvariable is a factor, and you don't mind newvariable being a factor as well:
library(dplyr)
set.seed(111)
data = data.frame(
  oldvariable = sample(c("happy", "unhappy", "depressed"), 10, replace = TRUE))
data %>% mutate(newvariable = recode_factor(oldvariable,
  "happy" = "good", "unhappy" = "bad", "depressed" = "super_bad"))
   oldvariable newvariable
1      unhappy         bad
2    depressed   super_bad
3    depressed   super_bad
4    depressed   super_bad
5        happy        good
6    depressed   super_bad
7        happy        good
8    depressed   super_bad
9      unhappy         bad
10       happy        good
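As a small sketch of what this answer implies: recode_factor() returns a factor whose levels follow the order in which the replacements are written, while recode() on a character vector keeps the result as character. The vector `old` below is made up for illustration and is not the asker's data.

```r
library(dplyr)

# Hypothetical input vector, used only for illustration
old <- c("unhappy", "depressed", "happy")

# recode_factor() returns a factor; levels follow the order of the replacements
as_fct <- recode_factor(old,
  "happy" = "good", "unhappy" = "bad", "depressed" = "super_bad")

# recode() on a character vector returns a character vector
as_chr <- recode(old,
  "happy" = "good", "unhappy" = "bad", "depressed" = "super_bad")

print(levels(as_fct))
print(class(as_chr))
```

If newvariable should stay a plain character column, recode() (or the case_when approach) may be the more convenient choice.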
Answer 1 (score: 0)
In dplyr, we can use case_when to assign values to newvariable based on oldvariable.
library(dplyr)
data = data.frame(
oldvariable = c("happy", "unhappy", "depressed")
)
data %>%
  mutate(newvariable = case_when(
    oldvariable == "happy" ~ "good",
    oldvariable == "unhappy" ~ "bad",
    oldvariable == "depressed" ~ "super_bad"
  ))
#>   oldvariable newvariable
#> 1       happy        good
#> 2     unhappy         bad
#> 3   depressed   super_bad
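One thing to keep in mind with case_when: any row that matches none of the conditions gets NA. A minimal sketch of adding a catch-all branch is shown below. The extra value "angry" and the fallback label "unknown" are assumptions made up for this example and do not appear in the question.

```r
library(dplyr)

# Hypothetical data: "angry" is not covered by any of the explicit conditions
data <- data.frame(
  oldvariable = c("happy", "unhappy", "depressed", "angry"),
  stringsAsFactors = FALSE
)

result <- data %>%
  mutate(newvariable = case_when(
    oldvariable == "happy"     ~ "good",
    oldvariable == "unhappy"   ~ "bad",
    oldvariable == "depressed" ~ "super_bad",
    TRUE                       ~ "unknown"  # catch-all branch; the label is an assumption
  ))

print(result)
```

Conditions are evaluated in order, so the TRUE branch has to come last. Dropping it would leave NA for the "angry" row instead.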