Summarizing a data.table while preserving factor order

Asked: 2018-08-03 11:51:30

Tags: r data.table

I am preparing tables for printing using data.table.

I often use factors to get the ordering I want, but I can't figure out what I'm doing wrong with data.table.

library(data.table)
DT <- as.data.table(iris)
DT[, Species := relevel(Species, ref = "virginica")]

# Factor levels ordered as I want them
DT[, levels(Species)]
#> [1] "virginica"  "setosa"     "versicolor"

# table() and dplyr base their order on the factor levels
table(DT[, Species])
#> 
#>  virginica     setosa versicolor 
#>         50         50         50
suppressMessages(library(dplyr))
count(DT, Species)
#> # A tibble: 3 x 2
#>   Species    `n()`
#>   <fct>      <int>
#> 1 virginica     50
#> 2 setosa        50
#> 3 versicolor    50

# data.table aggregation just cares about order of appearance?
DT[, .N, Species]
#>       Species  N
#> 1:     setosa 50
#> 2: versicolor 50
#> 3:  virginica 50

One solution is to use match, but it is a bit verbose.

DT[, .N, Species][match(levels(Species), Species)]
#>       Species  N
#> 1:  virginica 50
#> 2:     setosa 50
#> 3: versicolor 50

1 answer:

Answer 0 (score: 4)

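If you want the result ordered by the grouping variable, just use keyby instead of by. by keeps the groups in order of first appearance, while keyby sorts the result by the grouping variable (for a factor, in level order) and sets it as the key.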

DT[, .N, keyby = Species]
#      Species  N
#1:  virginica 50
#2:     setosa 50
#3: versicolor 50
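
The difference between by and keyby is easier to see on a table whose rows are not already grouped. The following is a minimal sketch on made-up data: the table, the column names grp and val, and the level order are assumptions chosen for illustration and are not part of the question.

```r
library(data.table)

# Hypothetical toy data: rows deliberately not in factor-level order
DT <- data.table(
  grp = factor(c("b", "c", "a", "b", "c", "b"), levels = c("c", "a", "b")),
  val = 1:6
)

# by= keeps groups in order of first appearance in the data: b, c, a
by_res <- DT[, .N, by = grp]

# keyby= sorts groups by the factor's level order (c, a, b)
# and sets grp as the key of the result
keyby_res <- DT[, .N, keyby = grp]

# Sorting after by= gives the same row order without setting a key
sorted_res <- DT[, .N, by = grp][order(grp)]

print(by_res)
print(keyby_res)
print(key(keyby_res))
print(key(sorted_res))
```

If the key on the result is not wanted, the by plus order(grp) form is one alternative; otherwise keyby is the shorter spelling.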