Question

PPG  Product  Week        Sales
P1   A        01/01/2018  50
P1   B        01/01/2018  40
P1   B        01/02/2018  30
P1   A        01/02/2018  80
P2   A        01/01/2018  100
P2   B        01/02/2018  70

I am trying to build a summary per PPG: within each PPG, I want the sales of the product with the highest overall sales, as shown below:

PPG   Max Product Sales
 P1      130 (This is sum of product A for ppg p1 across weeks)
 P2      100 (This is sum of product A for ppg p2 across weeks)

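I tried to do this in dplyr with top_n(1, sum(sales)), but it failed. How can we solve this? Can we also extend it to find the top n products by sales across weeks, so that we can check whether the 80-20 rule holds? Any ideas are welcome.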

Answer 1

Here are solutions using dplyr and data.table:

dplyr

First, group the data by PPG and Product and sum Sales within each group; then group by PPG and keep only the maximum:

library(dplyr)

my_data %>% 
  # total sales per product within each PPG
  group_by(PPG, Product) %>% 
  summarise("Max Product Sales" = sum(Sales)) %>% 
  # largest product total within each PPG
  group_by(PPG) %>% 
  summarise("Max Product Sales" = max(`Max Product Sales`))

Output:

# A tibble: 2 x 2
  PPG   `Max Product Sales`
  <chr>               <dbl>
1 P1                    130
2 P2                    100

data.table

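The same two-step aggregation, which returns one row per PPG:

library(data.table)
setDT(my_data)  # convert my_data to a data.table by reference

# first [ ]: total per (PPG, Product); second [ ]: maximum total per PPG
my_data[, .(`Max Product Sales` = sum(Sales)), by = .(PPG, Product)][, .(`Max Product Sales` = max(`Max Product Sales`)), by = PPG]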
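The question also asks which product holds the maximum and how to extend this to the top n products. The following is a minimal sketch of one way to do that, not part of the original answer. It assumes dplyr 1.0.0 or later (for slice_max() and the .groups argument), rebuilds the question's data as my_data, and uses two illustrative names that do not appear in the original: n_top and Total. A likely reason the top_n(1, sum(sales)) attempt failed is that the sums have to be computed per PPG and Product before any ranking can happen.

```r
library(dplyr)

# Sample data copied from the question
my_data <- data.frame(
  PPG     = c("P1", "P1", "P1", "P1", "P2", "P2"),
  Product = c("A", "B", "B", "A", "A", "B"),
  Week    = c("01/01/2018", "01/01/2018", "01/02/2018",
              "01/02/2018", "01/01/2018", "01/02/2018"),
  Sales   = c(50, 40, 30, 80, 100, 70)
)

n_top <- 1  # hypothetical parameter: how many products to keep per PPG

top_products <- my_data %>%
  # total sales per product within each PPG
  group_by(PPG, Product) %>%
  summarise(Total = sum(Sales), .groups = "drop") %>%
  # keep the n_top products with the largest totals in each PPG
  group_by(PPG) %>%
  slice_max(Total, n = n_top, with_ties = FALSE) %>%
  ungroup()

print(top_products)
```

Setting n_top to a larger value keeps more products per PPG. with_ties = FALSE is a design choice here so that each PPG returns at most n_top rows even when two products have equal totals.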

Answer 2

You did not provide any reproducible data, so let's read your text into df:

df <- read.table(text = "PPG Product Week Sales
P1 A 01/01/2018 50
P1 B 01/01/2018 40
P1 B 01/02/2018 30
P1 A 01/02/2018 80
P2 A 01/01/2018 100
P2 B 01/02/2018 70", header = TRUE)

We use data.table to get the total sales within each PPG x Product group:

data.table::setDT(df)[, .(maxSales = sum(Sales)), by = c("PPG", "Product")]

The result is:

   PPG Product maxSales
1:  P1       A      130
2:  P1       B       70
3:  P2       A      100
4:  P2       B       70
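The table above holds the per-product totals but stops short of the per-PPG maximum and the 80-20 check the question asks about. The sketch below shows one possible continuation and is not part of the original answer. It rebuilds the data directly as a data.table, and the names share, cumShare, best and pareto, as well as the 0.8 threshold, are assumptions introduced for illustration.

```r
library(data.table)

# Sample data copied from the question (Week omitted, it is not needed here)
df <- data.table(
  PPG     = c("P1", "P1", "P1", "P1", "P2", "P2"),
  Product = c("A", "B", "B", "A", "A", "B"),
  Sales   = c(50, 40, 30, 80, 100, 70)
)

# Total sales per product within each PPG, as in the answer
totals <- df[, .(maxSales = sum(Sales)), by = .(PPG, Product)]

# Sort products within each PPG from highest to lowest total
setorder(totals, PPG, -maxSales)

# Share of PPG sales per product, and running share down the ranking
totals[, share := maxSales / sum(maxSales), by = PPG]
totals[, cumShare := cumsum(share), by = PPG]

# Best-selling product per PPG: first row of each sorted group
best <- totals[, .SD[1], by = PPG]

# Products needed to reach 80% of each PPG's sales:
# keep a product if the share accumulated before it is still below 0.8
pareto <- totals[cumShare - share < 0.8]

print(best)
print(pareto)
```

With only two products per PPG, every product is needed to reach 80%, so pareto keeps all four rows. On a larger catalogue the same filter would show how few products account for most of the sales.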

Finding summary of sub groups within groups in R

2 answers: