Question

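I often work with subsets of subsets of a data.table, and it would be great to be able to select all columns from the initial subset and then add new columns computed in the second subset. Basically, I am looking for a direct equivalent of using * in a SELECT statement in any SQL DBMS.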

library(data.table)

d <- read.table(text="Fuel     Year   Region   Count
Gasoline 2013       GE  169600
Diesel   2013       GE   46790
Hybrid   2013       GE    2268
Electric 2013       GE      85
Other    2013       GE     532
Gasoline 2013       VS  149232
Diesel   2013       VS   50591
Hybrid   2013       VS    1028
Electric 2013       VS     268
Other    2013       VS     261", header = TRUE)

d <- data.table(d)

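One possibility is to use := assignment in the second subset [] and then add a third level of square brackets []. The trailing empty [] is only there to print the result, because := returns it invisibly.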

 # Example with one additional column
 d[, .(ct = sum(Count)), by=Fuel][, s := ct/sum(ct)][]
 # Example with two additional columns
 d[, .(ct = sum(Count)), by=Fuel][, `:=`(s1 = ct/sum(ct),
                                         s2 = ct/sum(ct) + 1)][]

Is it possible to get the same result by using some kind of placeholder for all columns of the table above?
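
One candidate for such a placeholder is data.table's .SD symbol, which stands for the current subset of data as a list of columns. The sketch below is a possible approach, not a definitive answer: it combines .SD with c() so that every existing column is kept and a new one is appended. The data is rebuilt inline from the values in the question so the block runs on its own; the name res is only illustrative.

```r
library(data.table)

# Same values as the table in the question, built inline for a self-contained run.
d <- data.table(
  Fuel   = rep(c("Gasoline", "Diesel", "Hybrid", "Electric", "Other"), 2),
  Year   = 2013L,
  Region = rep(c("GE", "VS"), each = 5),
  Count  = c(169600, 46790, 2268, 85, 532, 149232, 50591, 1028, 268, 261)
)

# First []: aggregate Count per Fuel.
# Second []: .SD plays the role of "*" (all columns of the aggregated table),
# and c() appends the newly computed share column s to that list.
res <- d[, .(ct = sum(Count)), by = Fuel][, c(.SD, .(s = ct / sum(ct)))]
res
```

Unlike the := version, this form builds a new table instead of adding columns by reference, so no trailing [] is needed to print it. Further columns can be added by listing more entries inside .( ).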

Equivalent of 'SELECT * FROM tbl' in data.table

0 answers: