Question

I have data in the following format:

custno  TrainingType    TrainingDate    1   2   3   4   5   6
100     Presentation    2013-11-26    29.85  49.75  146.70  122.70  59.70   29.85
100     Presentation    2014-02-25    122.70 49.75  39.80   109.45  218.90  89.55
100     Training        2012-10-08    0.00   0.00   9.95    0.00    0.00    0.00
100     Training        2013-03-06    0.00   9.95   44.95   29.85   137.50  59.70

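This is just sample data; I have this data for thousands of customers, identified by custno. The data in columns 1 through 6 represents monthly spend for months 1 through 6. I would like to isolate the top 100 spending customers. In other words, I want the 100 customers who spent the most across all months.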

Here is the result of dput(head(df)):

Would anyone happen to know a sensible way to do this?

Any help is greatly appreciated.

Answer 1

I think this is what you want?

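library(dplyr)
library(tidyr)

# Reshape to long format: columns 4 to 9 (months 1 through 6) become
# one row per customer record and month
tidydf <- gather(yourdata, month, spent, 4:9)

# Total spend per customer across all months, largest first
spendsum <- tidydf %>%
              group_by(custno) %>%
              summarise(
                totalspent = sum(spent)) %>%
              arrange(desc(totalspent))

# Keep the first 100 rows, i.e. the 100 highest-spending customers
top100 <- head(spendsum, 100)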
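
For comparison, the same per-customer ranking can be sketched in base R, with no extra packages. This is only an illustration built on the four sample rows from the question, so it returns a single customer; the names month_cols, rowtotal, totals and top100 are made up for this sketch and do not come from the question.

```r
# Sample rows copied from the question. check.names = FALSE keeps the
# month columns named "1" through "6" instead of "X1" through "X6".
df <- read.table(text = "
custno TrainingType TrainingDate 1 2 3 4 5 6
100 Presentation 2013-11-26 29.85 49.75 146.70 122.70 59.70 29.85
100 Presentation 2014-02-25 122.70 49.75 39.80 109.45 218.90 89.55
100 Training 2012-10-08 0.00 0.00 9.95 0.00 0.00 0.00
100 Training 2013-03-06 0.00 9.95 44.95 29.85 137.50 59.70
", header = TRUE, check.names = FALSE)

# Names of the six monthly spend columns (assumed to be "1" to "6")
month_cols <- as.character(1:6)

# Spend per row, summed over the six months
df$rowtotal <- rowSums(df[, month_cols])

# Total spend per customer across all of that customer's rows
totals <- aggregate(rowtotal ~ custno, data = df, FUN = sum)
names(totals)[2] <- "totalspent"

# Sort from highest to lowest spend and keep at most 100 customers
totals <- totals[order(totals$totalspent, decreasing = TRUE), ]
top100 <- head(totals, 100)
print(top100)
```

On the full data set, head(totals, 100) would return the 100 customers with the largest totals. Customers tied at the cutoff are kept or dropped according to their sort position, so a different rule may be needed if ties matter.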
