I have a data frame containing information about sales branches, customers, and sales.
branch <- c("Chicago","Chicago","Chicago","Chicago","Chicago","Chicago","LA","LA","LA","LA","LA","LA","LA","Tampa","Tampa","Tampa","Tampa","Tampa","Tampa","Tampa","Tampa")
customer <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21)
sales <- c(33816,24534,47735,1467,39389,30659,21074,20195,45165,37606,38967,41681,47465,3061,23412,22993,34738,19408,11637,36234,23809)
data <- data.frame(branch, customer, sales)
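What I need to do is iterate over each branch, take each customer in that branch, and divide the customer's sales by the branch total. I need this to see how much each customer contributes to the total sales of their branch. For example, for customer 1, I want to compute 33816 / 177600 and store the value in a new column (177600 is the total for the Chicago branch).
I tried writing a function that walks through each row in a for loop, but I don't know how to do this at the branch level. Any guidance is appreciated.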
Answer 0 (score: 0)
We can use dplyr::group_by and dplyr::mutate to compute each branch's total sales and divide each customer's sales by it.
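A minimal sketch of that approach, assuming the dplyr package is installed. The column name customer_contribution and the result variable are illustrative choices, not something this answer specifies:

```r
library(dplyr)

# Rebuild the data frame from the question
branch <- c(rep("Chicago", 6), rep("LA", 7), rep("Tampa", 8))
customer <- 1:21
sales <- c(33816, 24534, 47735, 1467, 39389, 30659,
           21074, 20195, 45165, 37606, 38967, 41681, 47465,
           3061, 23412, 22993, 34738, 19408, 11637, 36234, 23809)
data <- data.frame(branch, customer, sales)

# Within each branch, divide every row's sales by the branch total.
# sum(sales) is evaluated per group because of group_by().
result <- data %>%
  group_by(branch) %>%
  mutate(customer_contribution = sales / sum(sales)) %>%
  ungroup()

result
```

Because mutate keeps one row per input row, no join back to the original data is needed; ungroup() just drops the grouping so later operations are not silently computed per branch.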
Answer 1 (score: 0)
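Consider base R's ave, which computes the aggregate inline for the new column. This version also handles the case where the same customer has multiple records within the same branch: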
data$customer_contribution <- ave(data$sales, data$customer, FUN=sum) /
ave(data$sales, data$branch, FUN=sum)
data
# branch customer sales customer_contribution
# 1 Chicago 1 33816 0.190405405
# 2 Chicago 2 24534 0.138141892
# 3 Chicago 3 47735 0.268778153
# 4 Chicago 4 1467 0.008260135
# 5 Chicago 5 39389 0.221784910
# 6 Chicago 6 30659 0.172629505
# 7 LA 7 21074 0.083576241
# 8 LA 8 20195 0.080090263
# 9 LA 9 45165 0.179117441
# 10 LA 10 37606 0.149139610
# 11 LA 11 38967 0.154537126
# 12 LA 12 41681 0.165300433
# 13 LA 13 47465 0.188238887
# 14 Tampa 14 3061 0.017462291
# 15 Tampa 15 23412 0.133560003
# 16 Tampa 16 22993 0.131169705
# 17 Tampa 17 34738 0.198172193
# 18 Tampa 18 19408 0.110718116
# 19 Tampa 19 11637 0.066386372
# 20 Tampa 20 36234 0.206706524
# 21 Tampa 21 23809 0.135824795
Or, less verbosely:
data$customer_contribution <- with(data, ave(sales, customer, FUN=sum) /
ave(sales, branch, FUN=sum))
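One caveat: the numerator above groups by customer alone, which is fine for the sample data because every customer ID belongs to a single branch. If IDs could repeat across branches, a possible variant is to pass both grouping columns to ave. The sketch below uses a small made-up data frame (toy, with hypothetical values) purely to illustrate that case:

```r
# Hypothetical data: customer 1 has two rows in branch A,
# and the same ID also appears in branch B.
toy <- data.frame(
  branch   = c("A", "A", "A", "B"),
  customer = c(1, 1, 2, 1),
  sales    = c(10, 30, 60, 50)
)

# Numerator: sales summed per (branch, customer) pair.
# Denominator: sales summed per branch.
toy$share <- with(toy, ave(sales, branch, customer, FUN = sum) /
                       ave(sales, branch, FUN = sum))

toy
```

Grouping the numerator by customer only would add branch B's 50 to customer 1's total in branch A, so the share would no longer be relative to a single branch.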