R data.table: group by a created column

Time: 2018-07-03 14:13:47

Tags: r data.table

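I'm new to the awesome data.table package and have run into a problem that hopefully has a simple solution. I want to filter a data.table, add some columns to it, and group by some of its columns, including one that I create in the j clause.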

If I were using dplyr, it would look like this:

library(dplyr)

mtcars %>%
  filter(vs == 1) %>%
  mutate(trans = ifelse(am == 1, "Manual", "Auto")) %>%
  group_by(gear, carb, trans) %>%
  summarise(num_cars = n(), avg_qsec = mean(qsec))

# A tibble: 6 x 5
# Groups:   gear, carb [?]
   gear  carb trans  num_cars avg_qsec
  <dbl> <dbl> <chr>     <int>    <dbl>
1     3     1 Auto          3     19.9
2     4     1 Manual        4     19.2
3     4     2 Auto          2     21.4
4     4     2 Manual        2     18.6
5     4     4 Auto          2     18.6
6     5     2 Manual        1     16.9

My attempt with data.table does not work:

library(data.table)

dtmt <- as.data.table(mtcars)

dtmt[vs == 1,
     .(num_cars = .N, avg_qsec = mean(qsec), trans = ifelse(am == 1, "Manual", "Auto")),
     by = list(gear, carb, trans)]

Error in eval(bysub, xss, parent.frame()) : object 'trans' not found

So columns created in the j clause cannot be used in by? It works fine if I don't try to transform the am column.

Thanks!

2 answers:

Answer 0 (score: 1)

We create the 'trans' column after filtering the rows where 'vs' is 1, and then use it as a grouping variable for the summary:

dtmt[vs==1 # subset the rows
    ][, trans := c("Auto", "Manual")[(am==1)+1] # create trans
     ][, .(num_cars = .N, avg_qsec = mean(qsec)), by = .(gear, carb, trans)]
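
The chain above can also be read as three separate steps. The following is only a sketch of that reading, not part of the original answer: the intermediate names filtered and summary_dt are made up for illustration. It also shows that := here adds trans to the filtered copy only, so the original dtmt is left without that column.

```r
library(data.table)

dtmt <- as.data.table(mtcars)

# Step 1: subset the rows; this returns a new data.table
filtered <- dtmt[vs == 1]

# Step 2: add trans by reference to the filtered copy.
# (am == 1) + 1 is 1 for automatic and 2 for manual, which indexes the labels.
filtered[, trans := c("Auto", "Manual")[(am == 1) + 1]]

# Step 3: summarise, grouping by the new column as well
summary_dt <- filtered[, .(num_cars = .N, avg_qsec = mean(qsec)),
                       by = .(gear, carb, trans)]

print(summary_dt)

# The original table was not modified by step 2
print("trans" %in% names(dtmt))
```

Splitting the chain this way is mostly useful for inspecting the intermediate table; the chained form avoids the extra variable names.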

Answer 1 (score: 1)

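Everything can be done in a single []. The by argument accepts named expressions as well as column names, so trans can be defined there directly: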

as.data.table(mtcars)[
    vs == 1,
    .(num_cars = .N, avg_qsec = mean(qsec)),
    by = .(gear, carb, trans = ifelse(am == 1, "Manual", "Auto"))]

#    gear carb  trans num_cars avg_qsec
# 1:    4    1 Manual        4    19.22
# 2:    3    1   Auto        3    19.89
# 3:    4    2   Auto        2    21.45
# 4:    4    4   Auto        2    18.60
# 5:    4    2 Manual        2    18.56
# 6:    5    2 Manual        1    16.90
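
The rows above appear in order of first appearance of each group, which differs from the sorted dplyr output in the question. If sorted groups are wanted, one option is keyby instead of by. This is a small variation on the answer, not something the answer itself proposes; the name res is just for illustration.

```r
library(data.table)

dtmt <- as.data.table(mtcars)

# keyby groups like by, then sorts the result by the grouping columns
# and sets them as the key of the returned data.table
res <- dtmt[
  vs == 1,
  .(num_cars = .N, avg_qsec = mean(qsec)),
  keyby = .(gear, carb, trans = ifelse(am == 1, "Manual", "Auto"))
]

print(res)
```

With keyby the first row is the gear 3, carb 1, Auto group, matching the row order of the dplyr result.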