Assigning multiple values with a conditional mutate

Time: 2018-05-09 04:40:51

Tags: r dplyr

Is it possible to assign values to multiple variables with a single conditional mutate call?

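For example, in the code below, when cat == "a" I want to assign the value 1 to column foo and the value "three" to column bar. Similarly, when cat == "b", I want to assign 2 and "four".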

The following achieves this, but it requires repeating the case_when call for each variable:

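library(tidyverse)
df <- tibble(cat = c("a", "b", "a", "a", "c"))
df %>%
  # numeric column: the fallback must be NA_real_ to match the type
  mutate(foo = case_when(cat == "a" ~ 1,
                         cat == "b" ~ 2,
                         TRUE ~ NA_real_)) %>%
  # character column: same conditions repeated, fallback NA_character_
  mutate(bar = case_when(cat == "a" ~ "three",
                         cat == "b" ~ "four",
                         TRUE ~ NA_character_))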

I thought creating a list column might be useful, something like:

df %>%
  mutate(case_when(cat == "a" ~ list("foo" = 1, "bar" = "three"),
                   cat == "b" ~ list("foo" = 2, "bar" = "four"),
                   TRUE ~ NA_real_))

This does not work, because case_when accepts only a single value on the RHS.

One solution (as suggested elsewhere) is to create a "reference" data frame and join against it, e.g.:

library(tidyverse)
ref <- tibble(cat = c("a", "b"), foo = c(1, 2), bar = c("three", "four"))
# name the key explicitly rather than relying on the default join columns
df %>% left_join(ref, by = "cat")

However, this will not work when the "condition" is not an exact match, e.g. x > 2.

Any suggestions on how to do this? Thanks.

3 answers:

Answer 0: (score: 3)

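What you describe is very close to data.table functionality: for a given condition, you supply a list of columns to update and their values. The update happens by reference, i.e. without copying the data: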

library(data.table)
dt <- as.data.table(df) # or use setDT(df)
dt[cat == "a", `:=`(foo = 1, bar = "three")]
dt[cat == "b", `:=`(foo = 2, bar = "four")]
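
The same by-reference update also covers the non-equality conditions raised in the question. The following is a minimal sketch, assuming a hypothetical numeric column x that is not part of the original data:

```r
library(data.table)

# Hypothetical data: the numeric column x is an assumption for illustration
dt <- data.table(x = c(1, 2, 3, 4))

# Each call updates both columns for the rows matching its condition
dt[x > 2, `:=`(foo = 1, bar = "three")]
dt[x == 2, `:=`(foo = 2, bar = "four")]

# The row with x == 1 matches neither condition, so it keeps NA in foo and bar
dt
```

Because `:=` modifies dt in place, no reassignment is needed between the two calls.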

Answer 1: (score: 1)

I suggest using the join approach, but with an intermediate column:

library(dplyr)
# tibble() replaces the deprecated data_frame()
df <- tibble(cat = c(1L, 2L, 3L, 4L))
otherdf <- tibble(j = c('a1', 'a2', 'a99'), foo = 11:13, bar = c('three', 'four', 'five'))

df %>%
  mutate(
    j = case_when(
      cat == 1L ~ 'a1',
      cat == 2L ~ 'a2',
      cat > 2L ~ 'a99'
    )) %>%
  left_join(otherdf, by = 'j')
# # A tibble: 4 × 4
#     cat     j   foo   bar
#   <int> <chr> <int> <chr>
# 1     1    a1    11 three
# 2     2    a2    12  four
# 3     3   a99    13  five
# 4     4   a99    13  five

(You can then clean it up with select(-j).)
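
A join-free variation on the same idea is to evaluate the conditions once into an integer index, then use that index to pick values from plain lookup vectors. This is only a sketch applied to the question's data; the names idx, foo_values and bar_values are hypothetical and do not appear in the answer above:

```r
library(dplyr)

df <- tibble(cat = c("a", "b", "a", "a", "c"))

# Hypothetical lookup vectors: position 1 is the "a" case, position 2 the "b" case
foo_values <- c(1, 2)
bar_values <- c("three", "four")

result <- df %>%
  mutate(
    # Evaluate the conditions once; rows matching no condition get NA
    idx = case_when(cat == "a" ~ 1L,
                    cat == "b" ~ 2L),
    # Indexing with NA yields NA, so unmatched rows stay missing
    foo = foo_values[idx],
    bar = bar_values[idx]
  ) %>%
  select(-idx)

result
```

The conditions inside case_when can be arbitrary logical expressions (such as x > 2), which is what the plain join on cat could not express.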

Answer 2: (score: 0)

It depends on how scalable the whole thing needs to be. This might be worth a look:

library(tidyverse)
df <- tibble(cat = c("a", "b", "a", "a", "c"))

# build a function that applies one case_when with the given values
make_fun <- function(values) {
  # return the closure directly instead of relying on an invisible assignment
  function(x) {
    case_when(x == "a" ~ values[[1]],
              x == "b" ~ values[[2]],
              TRUE ~ values[[3]])
  }
}

# create all case_whens
fun_list <- list(
  foo = make_fun(list(1, 2, NA_real_)),
  bar = make_fun(list("three", "four", NA_character_)))

# no join needed: bind the new columns onto df
# pass cat as a plain vector so case_when compares values, not a one-column tibble
df %>%
  bind_cols(map(fun_list, function(f) f(pull(df, cat))))