Question

When a named vector is used to specify columns, select() in dplyr 0.7.5 returns a different result from dplyr 0.7.4.

library(dplyr)                               
df <- data.frame(a = 1:5, b = 6:10, c = 11:15)
print(df)                                     
#>   a  b  c
#> 1 1  6 11
#> 2 2  7 12
#> 3 3  8 13
#> 4 4  9 14
#> 5 5 10 15

# a named vector
cols <- c(x = 'a', y = 'b', z = 'c')          
print(cols)                                   
#>  x   y   z 
#> "a" "b" "c"

# with dplyr 0.7.4
# returns column names with vector values
select(df, cols)                              
#>   a  b  c
#> 1 1  6 11
#> 2 2  7 12
#> 3 3  8 13
#> 4 4  9 14
#> 5 5 10 15

# with dplyr 0.7.5
# returns column names with vector names
select(df, cols)                              
#>   x  y  z
#> 1 1  6 11
#> 2 2  7 12
#> 3 3  8 13
#> 4 4  9 14
#> 5 5 10 15

Is this a bug or a feature?

Answer 1

IMO this could be considered a bug in 0.7.4 that has now been fixed / made more user-friendly.

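With the move to tidyselect, the logic has become more sophisticated. If you compare dplyr::select_vars with the new tidyselect::vars_select (these are the variants used by dplyr:::select.data.frame in 0.7.4 and 0.7.5 respectively), you will find that in 0.7.4 the following line loses the names in the named & quoted (string) case: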

ind_list <- map_if(ind_list, is_character, match_var, table = vars)

# example:
dplyr:::select.data.frame(mtcars, c(a = "mpg", b = "disp"))

Note that named vectors are not a problem in general, because the typical unquoted case has always worked fine:

dplyr:::select.data.frame(mtcars, c(a = mpg, b = disp))
# (here the names are indeed "a" and "b" afterwards)

There is a line of code that handles the use of c():

ind_list <- map_if(ind_list, !is_helper, eval_tidy, data = names_list)

eval_tidy comes from the rlang package and, in the line above, returns the following for the problematic call:

[[1]]
 a      b 
 "mpg" "disp"

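To make the 0.7.4 behaviour concrete, here is a minimal sketch. It assumes that the internal match_var boils down to a lookup in the column names and uses base match() as a stand-in for it, so treat it as an illustration rather than the actual dplyr code:

```r
library(rlang)

vars <- names(mtcars)

# evaluating the quoted call yields a named character vector
ind <- eval_tidy(quo(c(a = "mpg", b = "disp")))
is_named_chr <- is.character(ind) && identical(names(ind), c("a", "b"))

# stand-in for match_var (assumption): look the strings up in the column names
pos <- match(ind, vars)

# the positions are found, but the names "a" and "b" do not survive the lookup
pos
names(pos)
```

names(pos) is NULL here, which matches the observation that the selected columns keep their original names in 0.7.4.
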
Now with tidyselect there is some additional handling; see https://github.com/tidyverse/tidyselect/blob/master/R/vars-select.R.

In particular, vars_select_eval has the following line, which handles the use of c():

ind_list <- map_if(quos, !is_helper, overscope_eval_next, overscope = overscope)

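overscope_eval_next again comes from the rlang package and calls the same routine as eval_tidy, but it receives (via the overscope argument) an overscoped variant of c() that handles strings. See tidyselect:::vars_c. Hence, after this line, the c(a = "mpg", b = "disp") case becomes the same as c(a = mpg, b = disp):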

[[1]]
a b   # these are the names
1 3   # these are the positions of the selected cols
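The idea of an overscoped c() can be sketched in plain base R. This is only an illustration: c_with_strings is a hypothetical stand-in for tidyselect:::vars_c, and base eval() is used instead of overscope_eval_next.

```r
vars <- names(mtcars)

# every column is addressable by its bare name and evaluates to its position
positions <- as.list(setNames(seq_along(vars), vars))

# hypothetical replacement for c(): strings are turned into positions,
# bare names already arrive as positions, argument names are kept
c_with_strings <- function(...) {
  args <- list(...)
  vapply(args, function(x) {
    if (is.character(x)) match(x, vars) else as.integer(x)
  }, integer(1))
}

# the "overscope": column positions plus the replacement for c()
mask <- c(positions, list(c = c_with_strings))

quoted   <- eval(quote(c(a = "mpg", b = "disp")), mask)
unquoted <- eval(quote(c(a = mpg, b = disp)), mask)

identical(quoted, unquoted)
is.character(quoted)
```

Both calls end up as the same named integer vector, so a later check for character input has nothing left to act on and the names a and b are carried through to the result.
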

As a result, is_character no longer holds in the subsequent code, in contrast to the rlang::eval_tidy case above.

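If you look at these functions in rlang, the fact that overscope_eval_next is soft-deprecated in favour of eval_tidy might confuse you. My guess is that tidyselect simply has not been "cleaned up" yet (naming inconsistencies etc. would also have to be resolved, so it would be a rewrite rather than a one-line change). In the end, though, eval_tidy can now be used in the same way and probably will be.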
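If code has to give the same result under both releases, one option is to state the intent explicitly instead of relying on how select() treats the names. A minimal sketch in plain base R (my own suggestion, not something taken from the dplyr changelog), using the data from the question:

```r
df <- data.frame(a = 1:5, b = 6:10, c = 11:15)
cols <- c(x = "a", y = "b", z = "c")

# keep the original column names (the 0.7.4 result)
keep_old <- df[unname(cols)]

# rename to the names of the vector (the 0.7.5 result)
use_new <- setNames(df[unname(cols)], names(cols))

names(keep_old)
names(use_new)
```

Dropping the names with unname() before subsetting means the outcome does not depend on the installed dplyr version.
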
