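dplyr fails when a column name is longer than 39 characters, returning the error: "Error: index out of bounds".
Am I missing something, or is this a bug?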
library(dplyr)
names(iris)[5] <- "vvv_5vvv10vvv15vvv20vvv25vvv30vvv35vvv40"
iris %>% dplyr::group_by( vvv_5vvv10vvv15vvv20vvv25vvv30vvv35vvv40 ) %>%
dplyr::summarise( n() )
gives me the error: Error: index out of bounds
names(iris)[5] <- "vvv_5vvv10vvv15vvv20vvv25vvv30vvv35vv39"
iris %>% dplyr::group_by( vvv_5vvv10vvv15vvv20vvv25vvv30vvv35vv39 ) %>%
dplyr::summarise( n() )
works fine and gives me this (expected) output:
Source: local data frame [3 x 2]
vvv_5vvv10vvv15vvv20vvv25vvv30vvv35vv39 n()
1 setosa 50
2 versicolor 50
3 virginica 50
SessionInfo()
> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=Danish_Denmark.1252 LC_CTYPE=Danish_Denmark.1252 LC_MONETARY=Danish_Denmark.1252 LC_NUMERIC=C
[5] LC_TIME=Danish_Denmark.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] dplyr_0.3.0.2
loaded via a namespace (and not attached):
[1] assertthat_0.1 DBI_0.3.1 lazyeval_0.1.9 magrittr_1.0.1 parallel_3.1.1 Rcpp_0.11.3 tools_3.1.1
Answer 0 (score: 5)
This appears to be a known issue that is slated to be fixed in dplyr 0.3.1. From @romainfrancois's reply in the issue thread:
“It happens here [...]
new_groups <- lazyeval::auto_name(new_groups)
, because:
lazyeval::auto_name
function (x, max_width = 40)
{
names(x) <- auto_names(x, max_width = max_width)
x
}
<environment: namespace:lazyeval>
”
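The default of max_width = 40 in the quoted function lines up with the 39-character limit seen in the question. As a rough base-R illustration of why a shortened label would break the column lookup, here is a minimal sketch. Note that truncate_label is a hypothetical helper written for this example; it is an assumption about the mechanism, not lazyeval's actual code.

```r
# Hypothetical helper (an assumption for illustration, not lazyeval's code):
# labels shorter than `max_width` are kept, longer ones are cut and suffixed.
truncate_label <- function(x, max_width = 40) {
  if (nchar(x) < max_width) return(x)
  paste0(substr(x, 1, max_width - 3), "...")
}

short_name <- "vvv_5vvv10vvv15vvv20vvv25vvv30vvv35vv39"   # 39 characters
long_name  <- "vvv_5vvv10vvv15vvv20vvv25vvv30vvv35vvv40"  # 40 characters

df <- iris
names(df)[5] <- long_name

short_label <- truncate_label(short_name)
long_label  <- truncate_label(long_name)

# The 39-character name survives unchanged.
identical(short_label, short_name)

# The 40-character name is altered, so looking it up among the column
# names finds no match.
long_label %in% names(df)
match(long_label, names(df))
```

If the grouping variable is later looked up by such an altered label, no column matches, which would be consistent with an "index out of bounds" style failure.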
Update
In dplyr 0.4.0, “group_by() supports variables with more than 39 characters thanks to a fix in lazyeval”:
library(dplyr)
# Variable name with 40 characters
names(iris)[5] <- "vvv_5vvv10vvv15vvv20vvv25vvv30vvv35vvv40"
iris %>%
group_by(vvv_5vvv10vvv15vvv20vvv25vvv30vvv35vvv40) %>%
summarise(n())
# vvv_5vvv10vvv15vvv20vvv25vvv30vvv35vvv40 n()
# 1 setosa 50
# 2 versicolor 50
# 3 virginica 50
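Since the behaviour depends on the installed version, a quick check before relying on long column names can help. The following is a small sketch; has_fixed_dplyr is a hypothetical helper name, and it only inspects the dplyr version (it assumes a dplyr at or above 0.4.0 also brings in a lazyeval that contains the fix).

```r
# Hypothetical helper: TRUE when the installed dplyr is at least `min_version`,
# FALSE when it is older or not installed at all.
has_fixed_dplyr <- function(min_version = "0.4.0") {
  if (!requireNamespace("dplyr", quietly = TRUE)) return(FALSE)
  utils::packageVersion("dplyr") >= min_version
}

has_fixed_dplyr()
```

On an older installation, keeping grouping column names at 39 characters or fewer avoids the error, as the question already shows.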