dplyr::recode_factor with levels that are not known in advance

Asked: 2018-05-22 04:11:50

Tags: r dplyr

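I am looking for a way to apply dplyr's recode_factor when the level I want to modify is not known in advance. For example, I want to apply cut(5) to a column and adjust the first level (an interval) so that it starts at 0.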

set.seed(42)

library(dplyr)
library(stringr)

x <- rgamma(100, 1)
x_cut <- x %>% cut(5)
old_level <- levels(x_cut)[[1]]
new_level <- old_level %>%
  str_extract_all("[0-9]+\\.([0-9]+)", simplify = TRUE) %>%
  `[`(2) %>%
  paste0("(0,", ., "]")
x_cut %>% recode_factor(old_level = new_level) %>% levels

But this does not seem to work.

I would like to see

[1] "(0,1.38]" "(1.38,2.75]"    "(2.75,4.12]"    "(4.12,5.49]"    "(5.49,6.87]"

but nothing changes, and I get

[1] "(0.00388,1.38]" "(1.38,2.75]"    "(2.75,4.12]"    "(4.12,5.49]"    "(5.49,6.87]"

1 answer:

Answer 0: (score: 1)

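The left-hand side of old_level = new_level needs to be evaluated rather than quoted. As written, recode_factor looks for a level literally named "old_level", finds none, and leaves the factor unchanged.

Use the !! and := syntax to do this: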

x_cut %>% recode_factor(!!old_level := new_level) %>% levels
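To see the difference in isolation, here is a minimal sketch on a small hand-made factor. The names f, old and new are made up for illustration and are not part of the question. It also shows a second spelling, splicing a named vector with !!!, which should be equivalent as far as I can tell.

```r
library(dplyr)

# Hypothetical toy data, chosen so the result does not depend on random numbers.
f   <- factor(c("a", "b", "c"))
old <- "a"   # the level to replace, held in a variable
new <- "A"   # its replacement

# The argument name is taken literally: there is no level called "old",
# so nothing is recoded.
unchanged <- levels(recode_factor(f, old = new))

# !! evaluates `old` first, and := allows an unquoted expression as the name.
bang_bang <- levels(recode_factor(f, !!old := new))

# Alternative: build a named vector (names = old levels) and splice it in.
spliced <- levels(recode_factor(f, !!!setNames(new, old)))

unchanged  # "a" "b" "c"
bang_bang  # "A" "b" "c"
spliced    # "A" "b" "c"
```

The spliced form may be more convenient when several levels have to be renamed at once, since the whole mapping lives in one named vector.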

For example, with set.seed(42):

levels(x_cut)
#  "(0.00388,1.38]" "(1.38,2.75]" "(2.75,4.12]" "(4.12,5.49]" "(5.49,6.87]"   
old_level
#  "(0.00388,1.38]"
new_level
#  "(0,1.38]"
x_cut %>% recode_factor(!!old_level := new_level) %>% levels
#  "(0,1.38]" "(1.38,2.75]" "(2.75,4.12]" "(4.12,5.49]" "(5.49,6.87]"
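If dplyr is not a requirement, the same relabelling can probably be done in base R by assigning to levels() directly. The sketch below is an assumption on my part rather than part of the original answer. It also swaps the stringr step for a sub() call that replaces everything up to the first comma, so it should also cope with a negative lower bound.

```r
set.seed(42)

x     <- rgamma(100, 1)
x_cut <- cut(x, 5)

old_level <- levels(x_cut)[[1]]

# Replace the lower bound, i.e. everything from "(" up to the first comma.
new_level <- sub("^\\([^,]*,", "(0,", old_level)

# Rename the matching level in place; the other levels are untouched.
levels(x_cut)[levels(x_cut) == old_level] <- new_level

levels(x_cut)
```

Unlike recode_factor, this never reorders the levels, which does not matter here because the first level is the one being renamed.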

For more on the !! ("bang bang") notation, see the dplyr programming docs.