Aggregating data in R, running into some trouble

Time: 2017-05-10 20:09:44

Tags: r

My data is in the following format -

Group - Individual - Meals purchased - Money spent - Date

1 - Joe - 3 - 25 - Tuesday

1 - Jane - 2 - 40 - Tuesday

1 - Joe - 4 - 50 - Sunday

2 - Sam - 3 - 60 - Sunday

2 - Sally - 3 - 30 - Tuesday

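What I want to do is collapse the data by group, so that I can see the number of meals purchased and the amount of money the whole group spent on a given date.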

I am using the following code in R -

Newdata <- aggregate (data, by  = list (data$Group, data$date), FUN=sum)

Unfortunately, this does not work.

3 Answers:

Answer 0: (score: 1)

This should work:

Newdata <- aggregate (data$meals, by  = list (data$Group, data$date), FUN=sum)

For a data.table solution, try:

library(data.table)
setDT(data)
# := adds all_meals as a new column by reference; the rows are not collapsed
data[, all_meals := sum(meals), by = list(Group, date)]

For multiple columns, I think you can do this:

Newdata <- aggregate (cbind(data$meals, data$money), by  = list(data$Group, data$date), FUN=sum)

Or:

setDT(data)
data[, lapply(.SD, sum), by = list(Group, date), .SDcols = c("meals", "money")]

Given that you didn't actually provide us with any data, I can't be 100% sure it will work.
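For anyone who wants to try the base R approach on concrete rows, here is a minimal sketch. It assumes the column names `Group`, `individual`, `meals`, `money` and `date` (the question does not state them exactly), and it swaps in the formula interface of `aggregate` in place of the `by =` form so that the summed columns keep their names. The original call most likely fails because passing the whole `data` frame asks `sum` to run on the non-numeric columns as well.

```r
# Sample rows reconstructed from the question; column names are assumed.
data <- data.frame(
  Group      = c(1L, 1L, 1L, 2L, 2L),
  individual = c("Joe", "Jane", "Joe", "Sam", "Sally"),
  meals      = c(3L, 2L, 4L, 3L, 3L),
  money      = c(25L, 40L, 50L, 60L, 30L),
  date       = c("Tuesday", "Tuesday", "Sunday", "Sunday", "Tuesday"),
  stringsAsFactors = FALSE
)

# Sum only the numeric columns, giving one row per Group/date combination.
Newdata <- aggregate(cbind(meals, money) ~ Group + date, data = data, FUN = sum)
print(Newdata)
```

The result has four rows, one for each Group/date pair that occurs in the data, with `meals` and `money` holding the group totals.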

Answer 1: (score: 0)

Try the following:

Newdata <- aggregate (data[,3], by  = list (data$Group, data$date), FUN=sum)
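
Selecting by position with `data[,3]` works only while `meals` stays the third column, and the result comes back with generic column names (`Group.1`, `Group.2`, `x`). A possible variant, sketched here under the assumption that the columns are named as in the sample rows, selects the column by name and labels the grouping variables:

```r
# Sample rows reconstructed from the question; column names are assumed.
data <- data.frame(
  Group      = c(1L, 1L, 1L, 2L, 2L),
  individual = c("Joe", "Jane", "Joe", "Sam", "Sally"),
  meals      = c(3L, 2L, 4L, 3L, 3L),
  money      = c(25L, 40L, 50L, 60L, 30L),
  date       = c("Tuesday", "Tuesday", "Sunday", "Sunday", "Tuesday"),
  stringsAsFactors = FALSE
)

# data["meals"] keeps the column name; the named list labels the group columns.
Newdata <- aggregate(data["meals"],
                     by = list(Group = data$Group, date = data$date),
                     FUN = sum)
print(Newdata)
```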

Answer 2: (score: 0)

Here is a dplyr solution that also sums both columns:

library(dplyr)

data <- structure(list(Group = c(1L, 1L, 1L, 2L, 2L), individual = c("Joe",  "Jane", "Joe", "Sam", "Sally"), meals = c(3L, 2L, 4L, 3L, 3L),      money = c(25L, 40L, 50L, 60L, 30L), date = c("Tuesday", "Tuesday",      "Sunday", "Sunday", "Tuesday")), .Names = c("Group", "individual",  "meals", "money", "date"), class = "data.frame", row.names = c(NA,  -5L))

data %>%
    group_by(Group, date) %>%
    mutate(all_meals = sum(meals), tot_cost = sum(money)) %>%
    ungroup

##  # A tibble: 5 × 7
##   Group individual meals money    date all_meals tot_cost
##   <int>      <chr> <int> <int>   <chr>     <int>    <int>
## 1     1        Joe     3    25 Tuesday         5       65
## 2     1       Jane     2    40 Tuesday         5       65
## 3     1        Joe     4    50  Sunday         4       50
## 4     2        Sam     3    60  Sunday         3       60
## 5     2      Sally     3    30 Tuesday         3       30
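
Note that `mutate` keeps all five rows and repeats the group totals on each one. If the goal is to collapse to one row per group and date, as the question asks, a sketch using `summarise` in place of `mutate` could look like the following. The `.groups = "drop"` argument assumes dplyr 1.0.0 or later, and the data frame is rebuilt here only so that the block runs on its own.

```r
library(dplyr)

# Same sample rows as above, rebuilt so this block is self-contained.
data <- data.frame(
  Group      = c(1L, 1L, 1L, 2L, 2L),
  individual = c("Joe", "Jane", "Joe", "Sam", "Sally"),
  meals      = c(3L, 2L, 4L, 3L, 3L),
  money      = c(25L, 40L, 50L, 60L, 30L),
  date       = c("Tuesday", "Tuesday", "Sunday", "Sunday", "Tuesday"),
  stringsAsFactors = FALSE
)

# summarise() returns one row per Group/date pair instead of one per input row.
collapsed <- data %>%
  group_by(Group, date) %>%
  summarise(all_meals = sum(meals), tot_cost = sum(money), .groups = "drop")
print(collapsed)
```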