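I am building data.frames, using data.table to summarize columns by different time periods (day of week, hour, etc.).
Using by = x, it is obviously easy to get the mean sales per product per day. However, I would also like to include a first row that holds the overall sales mean for each product.
So, for example: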
DayofWeek Sales
Sunday -0.32632766
Sunday -1.39525094
Sunday -0.17669726
Sunday 0.85023421
Sunday 0.86486582
Monday -0.09989301
Monday 0.76727639
Monday -1.67428010
Tuesday 0.07731930
Tuesday -0.49833578
Tuesday -1.30299674
Tuesday 0.15315193
Here is the dput():
structure(list(DayofWeek = structure(c(4L, 4L, 4L, 4L, 4L, 2L,
2L, 2L, 6L, 6L, 6L, 6L, 7L, 7L, 5L, 5L, 5L, 1L, 1L, 3L, 3L, 3L,
3L), .Label = c("Friday", "Monday", "Saturday", "Sunday", "Thursday",
"Tuesday", "Wednesday"), class = "factor"), Sales = c(-0.326327663381262,
-1.39525093919452, -0.176697258416924, 0.850234206155951, 0.864865815846249,
-0.0998930060078245, 0.767276394000856, -1.67428009516407, 0.0773192989619049,
-0.49833577988136, -1.30299673837641, 0.153151927466779, -0.166978329772809,
-0.365253835027482, -0.59213504129638, -0.637757052094623, 0.296006778141631,
-0.561833927961962, 0.279092660752442, 1.0474353590513, 1.72519764838123,
0.343084207813727, 2.00191818865667)), .Names = c("DayofWeek",
"Sales"), row.names = c(NA, -23L), class = "data.frame")
I can do this:
library(data.table)
mysample.dt <- as.data.table(sample)
mysales.day <- mysample.dt[, list(MeanSales = mean(Sales)), by = DayofWeek]
and get this:
DayofWeek MeanSales
Sunday -0.03663517
Monday -0.33563224
Tuesday -0.39271532
Wednesday -0.26611608
Thursday -0.31129511
Friday -0.14137063
Saturday 1.27940885
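I could then run the above without by = x to produce an overall mean, and then combine the two data.frames.
But is there a way to do this within my original call?
So that the output is: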
DayofWeek MeanSales
Overall 0.02642795
Sunday -0.03663517
Monday -0.33563224
Tuesday -0.39271532
Wednesday -0.26611608
Thursday -0.31129511
Friday -0.14137063
Saturday 1.27940885
without having to create it in two steps?
Answer 0 (score: 1)
I am not sure whether this counts as a one-step solution.
rbind(mysample.dt[, list(DayofWeek = "Overall", MeanSales = mean(Sales))],
      mysample.dt[, list(MeanSales = mean(Sales)), by = DayofWeek])
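As a possible alternative to the rbind() call, data.table's grouping-set helpers can compute the per-day means and the grand total in a single call. The following is only a minimal sketch, assuming a data.table version that provides rollup(). The toy data is the first twelve rows shown in the question (rounded), and the names toy.dt and res are illustrative, not from the original post. rollup() marks the total row with NA in the grouping column, so DayofWeek is kept as character here so that the NA can be relabelled "Overall".

```r
library(data.table)

# Toy data (assumption): the first 12 rows shown in the question, rounded
toy.dt <- data.table(
  DayofWeek = rep(c("Sunday", "Monday", "Tuesday"), times = c(5, 3, 4)),
  Sales = c(-0.3263, -1.3953, -0.1767, 0.8502, 0.8649,
            -0.0999, 0.7673, -1.6743,
            0.0773, -0.4983, -1.3030, 0.1532)
)

# Per-day means plus one grand-total row (DayofWeek is NA on that row)
res <- rollup(toy.dt, j = list(MeanSales = mean(Sales)), by = "DayofWeek")

# Label the total row and move it to the top, keeping the days in their original order
res[is.na(DayofWeek), DayofWeek := "Overall"]
res <- res[order(DayofWeek != "Overall")]
print(res)
```

The relabelling and reordering are still separate statements, so whether this counts as "one step" is debatable. The aggregation itself, however, is written only once.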