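I want to aggregate the values of a data.table in R by a grouping variable AND by several functions, while keeping the information in the other columns (not included in the aggregation) from the corresponding rows (= the same rows as the aggregated values). An example:
Note: The code uses this which.quantile() function (with sort(x) instead of order(x) in its code). It finds an actual value in the data set that is close to the defined quantile.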
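For readers without access to the linked function, here is a minimal sketch of what such a helper could look like. The names which_quantile_index() and which_quantile_value() are hypothetical stand-ins, not the linked implementation, and the rule used (inverse empirical CDF, as in quantile(type = 1)) is an assumption. Missing values are not handled.

```r
# Hypothetical stand-ins, NOT the linked which.quantile() implementation.
# Both pick an observed data point via the inverse empirical CDF.

# Position (within x) of the observation at probability p (order()-based).
which_quantile_index <- function(x, p) {
  o <- order(x)                         # permutation that sorts x ascending
  k <- max(1L, ceiling(p * length(x)))  # rank of the requested quantile
  o[k]
}

# The observed value itself, equivalent to sort(x)[k].
which_quantile_value <- function(x, p) {
  x[which_quantile_index(x, p)]
}

employees_ak <- c(82L, 104L, 37L, 24L)
which_quantile_value(employees_ak, 0.7)  # 82, the value
which_quantile_index(employees_ak, 0.7)  # 1, its position in the vector
```

The distinction matters further down: a value-returning variant suits a plain aggregation, while anything used to index rows needs the position-returning variant.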
# sample data
dt <- structure(list(State = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
.Label = c("AK", "RI"), class = "factor"), Company = structure(1:8, .Label = c("A",
"B", "C", "D", "E", "F", "G", "H"), class = "factor"), Employees = c(82L,
104L, 37L, 24L, 19L, 118L, 88L, 42L), Number=c(1L,2L,3L,4L,5L,6L,7L,8L), Number2=c(9,10,11,12,13,14,15,16)),
.Names = c("State", "Company", "Employees", "Number", "Number2"), class = "data.frame", row.names = c(NA, 8L))
library(data.table)
setDT(dt)
# aggregation
agg <- dt[ , .(max = max(Employees),
min = min(Employees),
quantile70 = which.quantile(Employees, 0.7)), by=State]
agg_m <- dt[agg, on="State"]
Aggregating the DT gives the following output:
   State max min quantile70
1:    AK 104  24         82
2:    RI 118  19         88
Merging the aggregation with the original DT gives:
   State Company Employees Number Number2 max min quantile70
1:    AK       A        82      1       9 104  24         82
2:    AK       B       104      2      10 104  24         82
3:    AK       C        37      3      11 104  24         82
4:    AK       D        24      4      12 104  24         82
5:    RI       E        19      5      13 118  19         88
6:    RI       F       118      6      14 118  19         88
7:    RI       G        88      7      15 118  19         88
8:    RI       H        42      8      16 118  19         88
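Question: How can I aggregate the data.table while keeping the corresponding values in the columns Company, Number and Number2? For state AK, the maximum of Employees is 104 and the corresponding value in Number2 is 10; the minimum is 24 and the corresponding Number2 value is 12, and so on. How can this information be kept when aggregating the data.table?
Desired output: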
   State Company Employees Number Number2 aggregation
1:    AK       A        82      1       9  quantile70
2:    AK       B       104      2      10         max
3:    AK       D        24      4      12         min
4:    RI       E        19      5      13         min
5:    RI       F       118      6      14         max
6:    RI       G        88      7      15  quantile70
The question is similar to this one. The sample data is also taken from there and adapted.
The following aggregations do not solve my problem:
dt[ ,.SD[ which.max(Employees) ], by=State]
dt[dt[ ,.I[ which.max(Employees) ], by=State ]$V1]
# only which.max() OR which.min() are possible
dt[ , max_Empl := max(Employees), by=State ]
# only ONE aggregation function at a time is possible
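Answer 0 (score: 0)
Following @eddi's canonical answer on subsetting by group, collect row numbers (.I) instead of values. Note that which.quantile() has to return a position within the group here (the order(x) variant), not a value, because its result is used to index .I: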
aggi <- dt[ , .(max = .I[which.max(Employees)],
min = .I[which.min(Employees)],
quantile70 = .I[which.quantile(Employees, 0.7)]), by=State]
From here, you can do
maggi <- melt(aggi, id="State")
dt[maggi$value][, v := maggi$variable][]
   State Company Employees Number Number2          v
1:    AK       B       104      2      10        max
2:    RI       F       118      6      14        max
3:    AK       D        24      4      12        min
4:    RI       E        19      5      13        min
5:    AK       A        82      1       9 quantile70
6:    RI       G        88      7      15 quantile70
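The result above is ordered by aggregation rather than by State, and its label column is called v. As a self-contained sketch of the same approach that ends in the column layout and row order of the desired output, one could write the following. The helper q_index() is a hypothetical, position-returning stand-in for which.quantile(), and the sample data is rebuilt with character columns instead of factors for brevity.

```r
library(data.table)

# Sample data from the question (character instead of factor columns)
dt <- data.table(
  State     = rep(c("AK", "RI"), each = 4),
  Company   = LETTERS[1:8],
  Employees = c(82L, 104L, 37L, 24L, 19L, 118L, 88L, 42L),
  Number    = 1:8,
  Number2   = 9:16
)

# Hypothetical stand-in for which.quantile(): position of the observation
# at probability p, using the inverse empirical CDF (an assumption)
q_index <- function(x, p) order(x)[max(1L, ceiling(p * length(x)))]

# One global row number (.I) per State and aggregation
aggi <- dt[, .(max        = .I[which.max(Employees)],
               min        = .I[which.min(Employees)],
               quantile70 = .I[q_index(Employees, 0.7)]),
           by = State]

# Long format: one line per (State, aggregation) with its row number
maggi <- melt(aggi, id.vars = "State",
              variable.name = "aggregation", value.name = "row")

# Pick the rows, label them, and sort by State and original row order
res <- dt[maggi$row][, aggregation := maggi$aggregation][]
setorder(res, State, Number)
res
```

If two aggregations select the same row within a State, that row appears once per aggregation, which keeps the labels unambiguous.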