I have the following data.table:
'data.frame': 66977 obs. of 16 variables:
$ SUBS : int
$ CITY : Factor w/ 18 levels
$ VALUE_SEG : Factor w/ 7 levels
$ region : Factor w/ 5 levels
$ SUM.DATA_PPU_REV_DEC. : num
$ SUM.DATA_BUNDLE_REV_DEC. : int
$ SUM.DATA_USAGE_TOTAL_KB_DEC. : num
$ SUM.THIS_MONTH_REV_DEC. : num
$ SUM.VOICE_ONNET_DURATION_DEC.: num
$ SUM.VOICE_ONNET_REV_DEC. : num
$ SUM.VOICE_OFFNET_REV_DEC. : num
$ SUM.SMS_ONNET_REV_DEC. : num
$ SUM.SMS_OFFNET_REV_DEC. : int
$ SUM.RECHARGE_DEC. : int
$ STATUS_DEC : Factor w/ 5 levels
$ TYPE_DEC_2 : Factor w/ 6 levels
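I want to group it by two factor variables, say VALUE_SEG and region, get the sum of the numeric columns, and for each factor variable create new columns holding the number of observations. I tried aggregate, ddply and others, and got various kinds of errors :( Thanks in advance.
Answer 0 (score: 3)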
Here is an approach using data.table.
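A minimal sketch of how this grouping could look in data.table, not necessarily the code this answer originally contained. The toy table below is an assumption: only the column names come from the question, while the values and the reduced column set are made up for illustration. It sums every numeric column per VALUE_SEG/region group, adds a row count (the hypothetical column name n_obs), and counts the observations per level of one factor, STATUS_DEC.

```r
library(data.table)

# Toy data standing in for the question's table (values are invented)
dt <- data.table(
  VALUE_SEG               = factor(c("A", "A", "B", "B", "A")),
  region                  = factor(c("N", "N", "S", "S", "S")),
  SUM.RECHARGE_DEC.       = c(10L, 20L, 5L, 15L, 7L),
  SUM.THIS_MONTH_REV_DEC. = c(1.5, 2.5, 3, 4, 5),
  STATUS_DEC              = factor(c("active", "idle", "active", "active", "idle"))
)

keys     <- c("VALUE_SEG", "region")
num_cols <- names(dt)[sapply(dt, is.numeric)]

# Per group: number of rows plus the sum of every numeric column
sums <- dt[, c(list(n_obs = .N), lapply(.SD, sum, na.rm = TRUE)),
           by = keys, .SDcols = num_cols]

# Per group: one new column per level of STATUS_DEC holding its count
counts <- dt[, as.list(table(STATUS_DEC)), by = keys]

# Combine both summaries on the two grouping variables
result <- merge(sums, counts, by = keys)
print(result)
```

The same `as.list(table(...))` step can be repeated for TYPE_DEC_2 or any other factor. If two factors share level names, the resulting columns would need renaming before the merge.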
Answer 1 (score: 1)
I suggest using dplyr: handle the numeric and the factor variables separately and summarise each. It could look like this:
library(dplyr)
## For numeric variables: sum every numeric column within each group
summary1 <- data %>%
  group_by(VALUE_SEG, region) %>%
  summarise(across(where(is.numeric), sum), .groups = "drop")
## For factors: count the observations in each group
summary2 <- data %>%
  group_by(VALUE_SEG, region) %>%
  summarise(across(where(is.factor), length), .groups = "drop")
## Then merge the results on both grouping variables
Summary <- merge(summary1, summary2, by = c("VALUE_SEG", "region"))
For more details on using this package, see this link.