The aggregate function in R does not work on my dataset

Time: 2017-09-14 00:09:19

Tags: r aggregate-functions

Sample dataset

Date   Playerid    Revenue Promo   DayofWeek
01/01/2017  146123  0   B   Sunday
01/01/2017  219378  0   B   Sunday
01/01/2017  198614  0   B   Sunday
02/01/2017  292640  30  A   Monday
02/01/2017  139562  10  A   Monday
02/01/2017  124967  20  A   Monday
02/01/2017  107954  20  A   Monday
03/01/2017  28391   10  B   Tuesday
03/01/2017  184388  21  B   Tuesday
03/01/2017  264222  20  B   Tuesday
03/01/2017  184857  0   B   Tuesday
04/01/2017  79788   40  A   Wednesday

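I want to aggregate the table by DayofWeek, summing the revenue for each day of the week and counting the number of players using Playerid, so that my final output looks like this: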

 
Players Revenue Promo   DayofWeek
    3      0      B       Sunday
    4     80      A       Monday
    4     51      B       Tuesday
    1     40      A       Wednesday

I have been trying to aggregate the dataset above, but none of my attempts have worked. Can you help?

Here is my code.

aggdata <-aggregate(MyData, by=list(DayofWeek,Revenue, Promo, Playerid), 
                    FUN=sum, na.rm=TRUE)

I get the following error:

Error in FUN(X[[i]], ...) : invalid 'type' (character) of argument

2 answers:

Answer 0: (score: 0)

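This happens because aggregate() applies FUN to every column of MyData, including the character columns (Date, Promo and DayofWeek), and sum() cannot add strings. Try summing only Revenue, for example with the formula interface:

aggdata <- aggregate(Revenue ~ DayofWeek + Date + Promo + Playerid,
                     data = MyData, FUN = sum, na.rm = TRUE)

Or, going by what you said, if you want to ignore the date:

aggdata <- aggregate(Revenue ~ DayofWeek + Promo + Playerid, data = MyData, FUN = sum)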
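Note that grouping by Playerid returns one row per player rather than the per-day table asked for in the question. Below is a minimal base-R sketch of that table. It assumes the sample data sits in a plain data frame called MyData (rebuilt here so the snippet runs on its own) and that "number of players" means the number of distinct Playerid values per day. The names revenue, players and result are only illustrative.

```r
# Sample data from the question, rebuilt as a plain data frame (assumption:
# the real MyData has the same column names and types).
MyData <- data.frame(
  Date = rep(c("01/01/2017", "02/01/2017", "03/01/2017", "04/01/2017"),
             times = c(3, 4, 4, 1)),
  Playerid = c(146123, 219378, 198614, 292640, 139562, 124967, 107954,
               28391, 184388, 264222, 184857, 79788),
  Revenue = c(0, 0, 0, 30, 10, 20, 20, 10, 21, 20, 0, 40),
  Promo = rep(c("B", "A", "B", "A"), times = c(3, 4, 4, 1)),
  DayofWeek = rep(c("Sunday", "Monday", "Tuesday", "Wednesday"),
                  times = c(3, 4, 4, 1)),
  stringsAsFactors = FALSE
)

# Total revenue per day and promo.
revenue <- aggregate(Revenue ~ DayofWeek + Promo, data = MyData, FUN = sum)

# Number of distinct players per day and promo.
players <- aggregate(Playerid ~ DayofWeek + Promo, data = MyData,
                     FUN = function(x) length(unique(x)))
names(players)[names(players) == "Playerid"] <- "Players"

# Join the two summaries on the shared grouping columns.
result <- merge(players, revenue, by = c("DayofWeek", "Promo"))
print(result)
```

Only the numeric Revenue column is ever passed to sum() here, so the character columns serve purely as grouping keys.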

Answer 1: (score: 0)

A dplyr approach

library(dplyr)
ans <- df %>%
  group_by(DayofWeek) %>%
  summarise(Promo=unique(Promo), Revenue=sum(Revenue), Playerid=n())

Output

  DayofWeek Promo Revenue Playerid
      <chr> <chr>   <int>    <int>
1    Monday     A      80        4
2    Sunday     B       0        3
3   Tuesday     B      51        4
4 Wednesday     A      40        1
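One caveat about the call above: Promo = unique(Promo) assumes every day has exactly one promo, and n() counts rows rather than distinct players. A possible variant, sketched under the assumption that dplyr 1.0 or later is installed (needed for the .groups argument), groups by Promo as well and counts players with n_distinct(). The name summary_df and the Players column are illustrative choices, not part of the original answer.

```r
library(dplyr)

# Sample data from the question, rebuilt so the snippet is self-contained.
df <- data.frame(
  Date = rep(c("01/01/2017", "02/01/2017", "03/01/2017", "04/01/2017"),
             times = c(3, 4, 4, 1)),
  Playerid = c(146123L, 219378L, 198614L, 292640L, 139562L, 124967L, 107954L,
               28391L, 184388L, 264222L, 184857L, 79788L),
  Revenue = c(0L, 0L, 0L, 30L, 10L, 20L, 20L, 10L, 21L, 20L, 0L, 40L),
  Promo = rep(c("B", "A", "B", "A"), times = c(3, 4, 4, 1)),
  DayofWeek = rep(c("Sunday", "Monday", "Tuesday", "Wednesday"),
                  times = c(3, 4, 4, 1)),
  stringsAsFactors = FALSE
)

summary_df <- df %>%
  group_by(DayofWeek, Promo) %>%              # one group per day/promo pair
  summarise(Players = n_distinct(Playerid),   # distinct players, not rows
            Revenue = sum(Revenue),
            .groups = "drop")                 # return an ungrouped tibble
print(summary_df)
```

On the sample data each player appears only once per day, so the counts match those from n(); the two differ only when a Playerid repeats within a day.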

Data

df <- structure(list(Date = c("01/01/2017", "01/01/2017", "01/01/2017", 
"02/01/2017", "02/01/2017", "02/01/2017", "02/01/2017", "03/01/2017", 
"03/01/2017", "03/01/2017", "03/01/2017", "04/01/2017"), Playerid = c(146123L, 
219378L, 198614L, 292640L, 139562L, 124967L, 107954L, 28391L, 
184388L, 264222L, 184857L, 79788L), Revenue = c(0L, 0L, 0L, 30L, 
10L, 20L, 20L, 10L, 21L, 20L, 0L, 40L), Promo = c("B", "B", "B", 
"A", "A", "A", "A", "B", "B", "B", "B", "A"), DayofWeek = c("Sunday", 
"Sunday", "Sunday", "Monday", "Monday", "Monday", "Monday", "Tuesday", 
"Tuesday", "Tuesday", "Tuesday", "Wednesday")), .Names = c("Date", 
"Playerid", "Revenue", "Promo", "DayofWeek"), row.names = c(NA, 
-12L), class = c("data.table", "data.frame"))