Sample dataset
Date        Playerid  Revenue  Promo  DayofWeek
01/01/2017  146123    0        B      Sunday
01/01/2017  219378    0        B      Sunday
01/01/2017  198614    0        B      Sunday
02/01/2017  292640    30       A      Monday
02/01/2017  139562    10       A      Monday
02/01/2017  124967    20       A      Monday
02/01/2017  107954    20       A      Monday
03/01/2017  28391     10       B      Tuesday
03/01/2017  184388    21       B      Tuesday
03/01/2017  264222    20       B      Tuesday
03/01/2017  184857    0        B      Tuesday
04/01/2017  79788     40       A      Wednesday
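I want to aggregate the table by DayofWeek, summing the revenue for each day of the week and counting the number of players using Playerid, so that my final output looks like this: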
Players  Revenue  Promo  DayofWeek
3        0        B      Sunday
4        80       A      Monday
4        51       B      Tuesday
1        40       A      Wednesday
I have been trying to aggregate the dataset above, but none of my attempts has worked. Can you help?
Here is my code.
aggdata <-aggregate(MyData, by=list(DayofWeek,Revenue, Promo, Playerid),
FUN=sum, na.rm=TRUE)
I get the following error:
Error in FUN(X[[i]], ...) : invalid 'type' (character) of argument
Answer 0 (score: 0)
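This happens because aggregate() applies FUN to every column of MyData, not only to Revenue, so sum ends up being called on character columns such as Date and fails on the strings. Pass only the Revenue column as the data to sum, and take the grouping columns from MyData:
aggdata <- aggregate(MyData["Revenue"],
                     by = MyData[c("DayofWeek", "Date", "Promo", "Playerid")],
                     FUN = sum, na.rm = TRUE)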
Or, going by what you said, if you want to leave the date out:
aggdata <- aggregate(. ~ DayofWeek + Promo + Playerid,
                     data = subset(MyData, select = -Date), FUN = sum)
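Because Playerid is one of the grouping columns in the calls above, they return one row per player rather than one row per day. A minimal base R sketch of the output requested in the question might look like the following. It assumes MyData is a plain data.frame holding the 12 sample rows, that each day has a single Promo value, and that "number of players" means distinct Playerid values; the names revenue, players and result are only illustrative.

```r
# The 12 sample rows from the question, rebuilt as a plain data.frame
MyData <- data.frame(
  Date = rep(c("01/01/2017", "02/01/2017", "03/01/2017", "04/01/2017"),
             times = c(3, 4, 4, 1)),
  Playerid = c(146123, 219378, 198614, 292640, 139562, 124967,
               107954, 28391, 184388, 264222, 184857, 79788),
  Revenue = c(0, 0, 0, 30, 10, 20, 20, 10, 21, 20, 0, 40),
  Promo = rep(c("B", "A", "B", "A"), times = c(3, 4, 4, 1)),
  DayofWeek = rep(c("Sunday", "Monday", "Tuesday", "Wednesday"),
                  times = c(3, 4, 4, 1)),
  stringsAsFactors = FALSE
)

# Total revenue per day/promo combination
revenue <- aggregate(Revenue ~ DayofWeek + Promo, data = MyData, FUN = sum)

# Number of distinct players per day/promo combination
players <- aggregate(Playerid ~ DayofWeek + Promo, data = MyData,
                     FUN = function(ids) length(unique(ids)))
names(players)[names(players) == "Playerid"] <- "Players"

# Join the two summaries on the grouping columns and order the columns
# as in the desired output
result <- merge(players, revenue, by = c("DayofWeek", "Promo"))
result <- result[, c("Players", "Revenue", "Promo", "DayofWeek")]
print(result)
```

Note that merge() sorts by the key columns, so the days come back in alphabetical order rather than calendar order.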
Answer 1 (score: 0)
A dplyr approach
library(dplyr)
ans <- df %>%
group_by(DayofWeek) %>%
summarise(Promo=unique(Promo), Revenue=sum(Revenue), Playerid=n())
Output
  DayofWeek Promo Revenue Playerid
      <chr> <chr>   <int>    <int>
1    Monday     A      80        4
2    Sunday     B       0        3
3   Tuesday     B      51        4
4 Wednesday     A      40        1
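The summarise() call above relies on each day having exactly one Promo value and on each player appearing at most once per day. If either assumption might not hold in the real data, a slightly more defensive variant could group by Promo as well and count distinct players. This is only a sketch: it assumes dplyr 1.0.0 or later (for the .groups argument), rebuilds the sample rows as a plain data.frame, and uses the column name Players from the question's desired output.

```r
library(dplyr)

# The 12 sample rows, rebuilt as a plain data.frame
df <- data.frame(
  Date = rep(c("01/01/2017", "02/01/2017", "03/01/2017", "04/01/2017"),
             times = c(3, 4, 4, 1)),
  Playerid = c(146123L, 219378L, 198614L, 292640L, 139562L, 124967L,
               107954L, 28391L, 184388L, 264222L, 184857L, 79788L),
  Revenue = c(0L, 0L, 0L, 30L, 10L, 20L, 20L, 10L, 21L, 20L, 0L, 40L),
  Promo = rep(c("B", "A", "B", "A"), times = c(3, 4, 4, 1)),
  DayofWeek = rep(c("Sunday", "Monday", "Tuesday", "Wednesday"),
                  times = c(3, 4, 4, 1)),
  stringsAsFactors = FALSE
)

ans <- df %>%
  group_by(DayofWeek, Promo) %>%            # one group per day/promo pair
  summarise(
    Players = n_distinct(Playerid),         # distinct players, not row count
    Revenue = sum(Revenue),                 # total revenue in the group
    .groups = "drop"                        # return an ungrouped tibble
  )
print(ans)
```

On the sample data this gives the same four rows as the original answer; the two versions would only differ if a day carried more than one promo or a player appeared twice on the same day.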
Data
df <- structure(list(Date = c("01/01/2017", "01/01/2017", "01/01/2017",
"02/01/2017", "02/01/2017", "02/01/2017", "02/01/2017", "03/01/2017",
"03/01/2017", "03/01/2017", "03/01/2017", "04/01/2017"), Playerid = c(146123L,
219378L, 198614L, 292640L, 139562L, 124967L, 107954L, 28391L,
184388L, 264222L, 184857L, 79788L), Revenue = c(0L, 0L, 0L, 30L,
10L, 20L, 20L, 10L, 21L, 20L, 0L, 40L), Promo = c("B", "B", "B",
"A", "A", "A", "A", "B", "B", "B", "B", "A"), DayofWeek = c("Sunday",
"Sunday", "Sunday", "Monday", "Monday", "Monday", "Monday", "Tuesday",
"Tuesday", "Tuesday", "Tuesday", "Wednesday")), .Names = c("Date",
"Playerid", "Revenue", "Promo", "DayofWeek"), row.names = c(NA,
-12L), class = c("data.table", "data.frame"))