Question

My dataset has several factor levels ("a", "b", "c"), each with corresponding price and cost values.

dat <- data.frame(
 ProductCode = c("a", "a", "b", "b", "c", "c"), 
 Price = c(24, 37, 78, 45, 20, 34),
 Cost = c(10,15,45,25,10,17)
)

I am looking for the sum of Price and Cost for each ProductCode.

by.code <- group_by(dat, code)
by.code <- summarise(by.code, 
                        SumPrice = sum(Price),
                        SumCost = sum(Cost))

This code doesn't work: it sums all the values in each column instead of summing them separately for each group.

  SumPrice SumCost
1      238     122

Thanks in advance for your help.

Answer 1

This is not dplyr, but if you don't mind using the sqldf or data.table packages, this answer is for you:

library(sqldf)
sqldf("select ProductCode, sum(Price) as PriceSum, sum(Cost) as CostSum from dat group by ProductCode")

ProductCode PriceSum CostSum
       a       61      25
       b      123      70
       c       54      27

Or, using the data.table package:

library(data.table)
MM<-data.table(dat)
MM[, list(sum(Price),sum(Cost)), by = ProductCode]

   ProductCode  V1 V2
1:           a  61 25
2:           b 123 70
3:           c  54 27
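
The V1 and V2 column names above are data.table's defaults for unnamed results. A minimal sketch of the same aggregation with named output columns, assuming the data from the question (the names PriceSum and CostSum are just borrowed from the sqldf query for illustration):

```r
library(data.table)

# Same data as in the question
dat <- data.frame(
  ProductCode = c("a", "a", "b", "b", "c", "c"),
  Price = c(24, 37, 78, 45, 20, 34),
  Cost = c(10, 15, 45, 25, 10, 17)
)

MM <- as.data.table(dat)

# Naming the list elements gives named result columns instead of V1, V2
res <- MM[, .(PriceSum = sum(Price), CostSum = sum(Cost)), by = ProductCode]
print(res)
```

Here .() is data.table's shorthand for list(), so this is the same call as above with names attached.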

Answer 2

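Your code works fine. There is just one typo: your code groups by code, but the column is named ProductCode. Name the column code and your code will work. I just tried it, and R gives the correct output. Here is the code: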

library(dplyr)
dat <- data.frame(
 code = c("a", "a", "b", "b", "c", "c"), 
 Price = c(24, 37, 78, 45, 20, 34),
 Cost = c(10,15,45,25,10,17)
)
dat
by.code <- group_by(dat, code)
by.code <- summarise(by.code, 
                        SumPrice = sum(Price),
                        SumCost = sum(Cost))
by.code
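
As an alternative to renaming the column, the grouping can refer to ProductCode directly. This is only a sketch, assuming the data from the question. Calling summarise as dplyr::summarise is a precaution in case another loaded package (plyr, for example) masks it; whether that is what happened in the question is an assumption, not something the question confirms.

```r
library(dplyr)

# Same data as in the question, with the original column name kept
dat <- data.frame(
  ProductCode = c("a", "a", "b", "b", "c", "c"),
  Price = c(24, 37, 78, 45, 20, 34),
  Cost = c(10, 15, 45, 25, 10, 17)
)

# Group by the column that actually exists, then sum within each group
by_code <- dat %>%
  group_by(ProductCode) %>%
  dplyr::summarise(SumPrice = sum(Price), SumCost = sum(Cost))

print(by_code)
```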

Answer 3

We can use aggregate from base R:

aggregate(.~ProductCode, dat, sum)
#   ProductCode Price Cost
#1           a    61   25
#2           b   123   70
#3           c    54   27
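
The . on the left-hand side of the formula stands for every other column in dat, so it only works while all of them are numeric. If the data frame had further columns, a variant that names the columns explicitly might look like this (a sketch, assuming the data from the question):

```r
# Same data as in the question
dat <- data.frame(
  ProductCode = c("a", "a", "b", "b", "c", "c"),
  Price = c(24, 37, 78, 45, 20, 34),
  Cost = c(10, 15, 45, 25, 10, 17)
)

# cbind() on the left-hand side selects exactly the columns to sum
res <- aggregate(cbind(Price, Cost) ~ ProductCode, data = dat, FUN = sum)
print(res)
```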

R dplyr - sum values for different factor levels
