Adding an "Overall" row that ignores by = x when creating a data.frame with data.table

Time: 2014-03-25 15:35:37

Tags: r dataframe data.table

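I am creating data.frames, using data.table to summarize columns over different time periods (day of week, time of day, etc.).

With by = x, it is obviously easy to get the mean sales for each day of the week. However, I would also like to include a first row containing the overall mean sales.

So, for example, with data like this (first 12 rows shown):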

   DayofWeek    Sales
    Sunday  -0.32632766
    Sunday  -1.39525094
    Sunday  -0.17669726
    Sunday  0.85023421
    Sunday  0.86486582
    Monday  -0.09989301
    Monday  0.76727639
    Monday  -1.67428010
    Tuesday 0.07731930
    Tuesday -0.49833578
    Tuesday -1.30299674
    Tuesday 0.15315193

Here is the dput() output for the full data:

structure(list(DayofWeek = structure(c(4L, 4L, 4L, 4L, 4L, 2L, 
2L, 2L, 6L, 6L, 6L, 6L, 7L, 7L, 5L, 5L, 5L, 1L, 1L, 3L, 3L, 3L, 
3L), .Label = c("Friday", "Monday", "Saturday", "Sunday", "Thursday", 
"Tuesday", "Wednesday"), class = "factor"), Sales = c(-0.326327663381262, 
-1.39525093919452, -0.176697258416924, 0.850234206155951, 0.864865815846249, 
-0.0998930060078245, 0.767276394000856, -1.67428009516407, 0.0773192989619049, 
-0.49833577988136, -1.30299673837641, 0.153151927466779, -0.166978329772809, 
-0.365253835027482, -0.59213504129638, -0.637757052094623, 0.296006778141631, 
-0.561833927961962, 0.279092660752442, 1.0474353590513, 1.72519764838123, 
0.343084207813727, 2.00191818865667)), .Names = c("DayofWeek", 
"Sales"), row.names = c(NA, -23L), class = "data.frame")

I can do this:

library(data.table)

# sample is the data.frame shown in the dput() output above
mysample.dt <- as.data.table(sample)

mysales.day <- mysample.dt[, list(MeanSales = mean(Sales)), by = DayofWeek]

to get this:

    DayofWeek   MeanSales
    Sunday      -0.03663517
    Monday      -0.33563224
    Tuesday     -0.39271532
    Wednesday   -0.26611608
    Thursday    -0.31129511
    Friday      -0.14137063
    Saturday    1.27940885

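I could then run the above without by = x to produce an overall mean, and then combine the two data.frames.

However, is there a way to do this within my original call?

So that the output is: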

DayofWeek   MeanSales
Overall     0.02642795
Sunday      -0.03663517
Monday      -0.33563224
Tuesday     -0.39271532
Wednesday   -0.26611608
Thursday    -0.31129511
Friday      -0.14137063
Saturday    1.27940885

without having to create it in two steps?
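To make the target value concrete, here is a small sketch showing which mean the desired Overall row holds. It rebuilds the 23 rows from the dput() output with DayofWeek as a character column instead of a factor; the names `days`, `sales.dt`, `by.day`, `overall.rows` and `mean.of.means` are chosen here for illustration and do not appear in the question.

```r
library(data.table)

# The question's 23 rows, rebuilt by hand (DayofWeek as character, not factor).
days <- c("Sunday", "Monday", "Tuesday", "Wednesday",
          "Thursday", "Friday", "Saturday")
sales.dt <- data.table(
  DayofWeek = rep(days, times = c(5, 3, 4, 2, 3, 2, 4)),
  Sales = c(-0.326327663381262, -1.39525093919452, -0.176697258416924,
            0.850234206155951, 0.864865815846249, -0.0998930060078245,
            0.767276394000856, -1.67428009516407, 0.0773192989619049,
            -0.49833577988136, -1.30299673837641, 0.153151927466779,
            -0.166978329772809, -0.365253835027482, -0.59213504129638,
            -0.637757052094623, 0.296006778141631, -0.561833927961962,
            0.279092660752442, 1.0474353590513, 1.72519764838123,
            0.343084207813727, 2.00191818865667)
)

# Per-day means, as in the question.
by.day <- sales.dt[, list(MeanSales = mean(Sales)), by = DayofWeek]

# Two candidate "overall" values:
overall.rows  <- sales.dt[, mean(Sales)]    # mean over all 23 rows
mean.of.means <- by.day[, mean(MeanSales)]  # unweighted mean of the 7 daily means
```

The desired Overall value of 0.02642795 corresponds to `overall.rows`. Because the days have different numbers of rows, averaging the daily means gives a different number, so the overall row has to be computed from the raw data rather than from the grouped result.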

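1 Answer:

Answer 0: (score: 1)

I am not sure whether this counts as a one-step solution. It computes the overall mean and the per-day means inline and binds them with rbind(), so no intermediate objects are needed.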

rbind(mysample.dt[, list(DayofWeek = "Overall", MeanSales = mean(Sales))],
      mysample.dt[, list(MeanSales = mean(Sales)), by = DayofWeek])
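
As a possible alternative, recent versions of data.table provide rollup(), which computes the grouped aggregate and a grand total in one call. The sketch below assumes such a version is installed. It rebuilds the question's data with DayofWeek as a character column so the total row can be relabelled; `days`, `sales.dt` and `res` are illustrative names, not part of the original answer.

```r
library(data.table)

# The question's 23 rows, rebuilt by hand (DayofWeek as character, not factor).
days <- c("Sunday", "Monday", "Tuesday", "Wednesday",
          "Thursday", "Friday", "Saturday")
sales.dt <- data.table(
  DayofWeek = rep(days, times = c(5, 3, 4, 2, 3, 2, 4)),
  Sales = c(-0.326327663381262, -1.39525093919452, -0.176697258416924,
            0.850234206155951, 0.864865815846249, -0.0998930060078245,
            0.767276394000856, -1.67428009516407, 0.0773192989619049,
            -0.49833577988136, -1.30299673837641, 0.153151927466779,
            -0.166978329772809, -0.365253835027482, -0.59213504129638,
            -0.637757052094623, 0.296006778141631, -0.561833927961962,
            0.279092660752442, 1.0474353590513, 1.72519764838123,
            0.343084207813727, 2.00191818865667)
)

# rollup() returns the per-day rows plus a grand-total row
# in which the grouping column is NA.
res <- rollup(sales.dt, j = list(MeanSales = mean(Sales)), by = "DayofWeek")

# Label the grand-total row, then move it to the top
# (the remaining rows keep their order).
res[is.na(DayofWeek), DayofWeek := "Overall"]
res <- res[order(DayofWeek != "Overall")]

print(res)
```

This still needs a relabel and a reorder to match the requested layout, so it is not obviously shorter than the rbind() version. It may be more convenient when there are several grouping columns and subtotals are wanted at each level.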