I have a large data frame, and I run this on it:
dcast(mydata, People ~ Categories, value.var = "Answer Count", sum)
This is the result:
People category1 category2
Marge 3,648 6,402
Homer 3,586 6,684
Bart 3,469 7,119
Lisa 4,045 6,758
Maggie 2,847 5,748
Also, this:
dcast(mydata, People ~ Categories, value.var = "Answer Count", length)
produces this:
People category1 category2
Marge 2,531 4,516
Homer 2,535 4,512
Bart 2,542 4,563
Lisa 2,501 4,488
Maggie 2,517 4,513
What I actually want to do is this:
dcast(mydata, People ~ Categories, value.var = "Answer Count", sum / length / 6)
and get these values:
People category1 category2
Marge 0.240221256 0.236271036
Homer 0.235765943 0.246897163
Bart 0.227445581 0.260026298
Lisa 0.269558843 0.250965538
Maggie 0.188518077 0.212275648
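I have tried manipulating fun.aggregate as an argument, but I'm not sure whether that's the right path or whether I just don't know what I'm doing. Can someone help me with this? (Side note: this example has two categories. The real data has > 40.)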
Answer 0 (score: 1)
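We can pass an anonymous function to fun.aggregate that takes the sum, divides it by the number of values, and then divides by 6:
library(reshape2)
dcast(mydata, People ~ Categories,
      value.var = "Answer Count",
      fun.aggregate = function(x) sum(x) / length(x) / 6)
For Marge in category1 this gives 3,648 / 2,531 / 6 ≈ 0.2402, which matches the desired output.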
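To check the approach end to end, here is a minimal sketch on made-up data. The toy `mydata` below and the helper name `scaled_mean` are assumptions for illustration only and are not taken from the question; the column names mirror the ones used there. Because the function is applied per People/Categories cell, it should work the same way regardless of how many categories there are.

```r
library(reshape2)

# Hypothetical toy data standing in for the real mydata
mydata <- data.frame(
  People = c("Marge", "Marge", "Marge", "Homer", "Homer", "Homer"),
  Categories = c("category1", "category1", "category2",
                 "category1", "category2", "category2"),
  `Answer Count` = c(6, 12, 18, 24, 6, 30),
  check.names = FALSE  # keep the space in "Answer Count"
)

# Hypothetical helper: sum divided by count, then divided by 6
scaled_mean <- function(x) sum(x) / length(x) / 6

result <- dcast(mydata, People ~ Categories,
                value.var = "Answer Count",
                fun.aggregate = scaled_mean)
print(result)
```

Since sum(x) / length(x) is the mean, `function(x) mean(x) / 6` would be an equivalent way to write the aggregate.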