How do I use case_when inside the dplyr mutate() function?

Asked: 2017-03-10 10:44:50

Tags: r dplyr

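As I understand it, case_when() is a generalized version of ifelse().

However, I don't understand how to use it inside dplyr::mutate(). It used to work with the latest GitHub version of dplyr, but since I went back to the CRAN version (0.5) it no longer works. Does anyone have a clue?

Here is my reproducible example: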


library(devtools)
library(tibble)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

tibble(
  group = c("A", "A", "A", "B", "B", "B"), 
  x = 1:6
  ) %>% 
  mutate(
    y = ifelse(group == "A", -x, x)
    )
#> # A tibble: 6 × 3
#>   group     x     y
#>   <chr> <int> <int>
#> 1     A     1    -1
#> 2     A     2    -2
#> 3     A     3    -3
#> 4     B     4     4
#> 5     B     5     5
#> 6     B     6     6

tibble(
  group = c("A", "A", "A", "B", "B", "B"), 
  x = 1:6
  ) %>% 
  mutate(
    y = case_when(
      group == "A" ~ -x, 
      TRUE ~ x
      )
  )
#> Error in mutate_impl(.data, dots): object 'group' not found

tibble(
  group = c("A", "A", "A", "B", "B", "B"), 
  x = 1:6
  ) %>%
  mutate_(
    .dots = list(
      "y" = lazyeval::interp( 
        ~ case_when(var1 == "A" ~ -var2, TRUE, var2), 
        var1 = as.name(group), 
        var2 = as.name(x)
        )
    )
  )
#> Error in as.name(group): object 'group' not found



devtools::session_info()
#> Session info -------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.3.2 (2016-10-31)
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language (EN)                        
#>  collate  C                           
#>  tz       <NA>                        
#>  date     2017-03-10
#> Packages -----------------------------------------------------------------
#>  package    * version     date       source                            
#>  assertthat   0.1         2013-12-06 CRAN (R 3.2.2)                    
#>  backports    1.0.5       2017-01-18 cran (@1.0.5)                     
#>  DBI          0.6         2017-03-09 cran (@0.6)                       
#>  devtools   * 1.12.0.9000 2017-03-07 Github (hadley/devtools@d8ab190)  
#>  digest       0.6.12      2017-01-27 cran (@0.6.12)                    
#>  dplyr      * 0.5.0       2016-06-24 CRAN (R 3.3.2)                    
#>  evaluate     0.10        2016-10-11 cran (@0.10)                      
#>  htmltools    0.3.5       2016-03-21 CRAN (R 3.2.3)                    
#>  knitr        1.15.1      2016-11-22 cran (@1.15.1)                    
#>  lazyeval     0.2.0.9000  2016-10-14 Github (hadley/lazyeval@c155c3d)  
#>  magrittr     1.5         2014-11-22 CRAN (R 3.2.3)                    
#>  memoise      1.0.0.9001  2017-02-13 Github (hadley/memoise@884d565)   
#>  pkgbuild     0.0.0.9000  2017-03-07 Github (r-pkgs/pkgbuild@65eace0)  
#>  pkgload      0.0.0.9000  2017-03-07 Github (r-pkgs/pkgload@fc907a1)   
#>  R6           2.2.0       2016-10-05 cran (@2.2.0)                     
#>  Rcpp         0.12.9      2017-01-14 cran (@0.12.9)                    
#>  rmarkdown    1.3.9004    2017-03-09 Github (rstudio/rmarkdown@01dc037)
#>  rprojroot    1.2         2017-01-16 cran (@1.2)                       
#>  stringi      1.1.2       2016-10-01 CRAN (R 3.3.1)                    
#>  stringr      1.2.0       2017-02-18 cran (@1.2.0)                     
#>  tibble     * 1.2         2016-08-26 CRAN (R 3.2.3)                    
#>  withr        1.0.2       2016-06-20 CRAN (R 3.2.3)                    
#>  yaml         2.1.14      2016-11-12 cran (@2.1.14)

4 Answers:

Answer 0 (score: 2)

dplyr has since been updated, so the code now works as written, without having to use .$ or convert to a data.table. See https://github.com/tidyverse/dplyr/issues/1965
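As a minimal sketch, assuming a dplyr release newer than 0.5.0 is installed, the question's example can be written with bare column names inside case_when(). The names df and result are only illustrative:

```r
library(dplyr)
library(tibble)

# Same data as in the question.
df <- tibble(
  group = c("A", "A", "A", "B", "B", "B"),
  x = 1:6
)

# Bare column names are looked up in the data by mutate(),
# so no .$ prefix is needed (assumes dplyr newer than 0.5.0).
result <- df %>%
  mutate(
    y = case_when(
      group == "A" ~ -x,
      TRUE ~ x
    )
  )

print(result)
```

Both branches return integers here, which keeps the two sides of case_when() type-compatible.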

Answer 1 (score: 1)

This works:

tibble(
  group = c("A", "A", "A", "B", "B", "B"), 
  x = 1:6
) %>% 
  mutate(y = case_when(.$group == "A" ~ -.$x, 
                       TRUE ~ .$x))

Answer 2 (score: 0)

Converting the data frame to a data.table allows case_when to work as expected.
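A sketch of what such a conversion might look like, assuming the data.table package is installed. This is an illustration using the question's data, not the answerer's original code, and the names dt and result_dt are hypothetical:

```r
library(data.table)
library(dplyr)

# Same data as in the question, built directly as a data.table.
dt <- data.table(
  group = c("A", "A", "A", "B", "B", "B"),
  x = 1:6
)

# A data.table is also a data.frame, so dplyr's mutate() accepts it.
result_dt <- dt %>%
  mutate(
    y = case_when(
      group == "A" ~ -x,
      TRUE ~ x
    )
  )

print(result_dt)
```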


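Answer 3 (score: 0)

I found a peculiar workaround: it seems that piping the data frame into mutate, rather than passing it as the first argument, lets you use case_when, and it works as expected.

For example, this works: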

df %>% mutate(new_var = case_when(old_var == 1 ~ TRUE,
                                  TRUE ~ FALSE))

This does not work:

mutate(df, new_var = case_when(old_var == 1 ~ TRUE,
                               TRUE ~ FALSE))

(Using R 3.3.1, dplyr 0.5.0)
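To check whether the two calling styles differ on a given installation, a small sketch like the following can be run. The contents of df are made up for illustration, and piped and direct are hypothetical names. On dplyr versions released after 0.5.0 I would expect both calls to give the same result, but that is an assumption to verify locally:

```r
library(dplyr)
library(tibble)

# Hypothetical data; df and old_var stand in for the names used above.
df <- tibble(old_var = c(1, 2, 1, 3))

# Style 1: pipe the data frame into mutate().
piped <- df %>%
  mutate(new_var = case_when(old_var == 1 ~ TRUE,
                             TRUE ~ FALSE))

# Style 2: pass the data frame as the first argument.
direct <- mutate(df,
                 new_var = case_when(old_var == 1 ~ TRUE,
                                     TRUE ~ FALSE))

# Compare the two results.
print(identical(piped, direct))
```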