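This is a tough problem for me. If I had enough reputation to offer a bounty, I would!
I'm looking to balance account territories for sales reps. I've broken the process down by hand for one territory, but I don't know how to do it for every territory.
In this example there are 1,000 accounts across 4 regions, each region is split into 2 leagues, and each account has an individual owner, although some accounts are unowned. Each account has a random value between 1,000 and 100,000.
Reproducible example:
Account list: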
library(dplyr)

set.seed(1)
Accounts <- paste0("Acc", 1:1000)
Region <- c("NorthEast", "SouthEast", "MidWest", "West")
League <- sample(c("Majors", "Minors"), 1000, replace = TRUE)
AccValue <- sample(1000:100000, 1000, replace = TRUE)
Owner <- sample(c("Chad", NA, "Jimmy", "Adrian", NA, NA, "Steph", "Matt", "Jared", "Eric"), 1000, replace = TRUE)
AccDF <- data.frame(Accounts, Region, League, AccValue, Owner)
AccDF$Accounts <- as.character(AccDF$Accounts)
AccDF$Region <- as.character(AccDF$Region)
AccDF$League <- as.character(AccDF$League)
AccDF$Owner <- as.character(AccDF$Owner)
Territory ownership summary:
Summary <- AccDF %>%
group_by(Region, League, Owner) %>%
summarise(Count = n(),
TotalValue = sum(AccValue))
Leagues by region:
Summary2 <- AccDF %>%
group_by(Region, League) %>%
summarise(Count = n(),
TotalValue = sum(AccValue),
AccountsPerRep = round(Count / 7, 0),
ValuePerRep = TotalValue / 7)
That's all the starting data. I want to perform the following process on each grouping in the Summary2 table.
West Minors example:
Total West Minors accounts: 120
#break out into owned and unowned
WestMinorsOwned <- AccDF %>%
filter(Region == "West",
League == "Minors",
!is.na(Owner))
WestMinorsUnowned <- AccDF %>%
filter(Region == "West",
League == "Minors",
is.na(Owner))
#unassign accounts until threshold is hit
New.WestMinors <- WestMinorsOwned %>%
mutate(r = runif(n())) %>%
arrange(r) %>%
group_by(Owner) %>%
mutate(NewOwner = replace(Owner, cumsum(AccValue) > 600000 | row_number() > 14, NA)) %>%
ungroup(Owner) %>%
mutate(Owner = NewOwner) %>%
select(-r, -NewOwner)
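To see what the `replace()` step above does in isolation, here is a minimal sketch on made-up data. The toy data frame and the caps (a value cap of 100 and a count cap of 2) are assumptions chosen for illustration; they are not values from the question.

```r
library(dplyr)

# Hypothetical toy data: Chad has three accounts, Jimmy has one large one
toy <- data.frame(
  Owner    = c("Chad", "Chad", "Chad", "Jimmy"),
  AccValue = c(40, 50, 30, 500),
  stringsAsFactors = FALSE
)

# Within each owner, blank out the owner once the running value
# passes 100 or the account count passes 2 (illustrative caps)
capped <- toy %>%
  group_by(Owner) %>%
  mutate(NewOwner = replace(Owner,
                            cumsum(AccValue) > 100 | row_number() > 2,
                            NA)) %>%
  ungroup()

print(capped)
```

Chad keeps his first two accounts and loses the third. Jimmy's only account already exceeds the value cap, so he ends up with none; an owner in that position would also be missing from a summary built only from owned accounts, which may be worth checking for in real data.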
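After updating the owners, we bind the pieces together: the accounts that were unowned to begin with and the ones that were just unassigned form the pool of assignable West Minors accounts, sorted from most to least valuable.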
AssignableWestMinors <- bind_rows(filter(AccDF, Region == "West" & League == "Minors" & is.na(Owner)),
filter(New.WestMinors, is.na(Owner))) %>%
arrange(desc(AccValue))
#check work
OwnerSummary <- New.WestMinors %>%
filter(!is.na(Owner)) %>%
group_by(Region, League, Owner) %>%
summarise(Count = n(), TotalValue = sum(AccValue))
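No one has more than 14 accounts or more than $600,000 in account value, so we're in a good place to start reassigning the unowned accounts and try to balance everyone out. The for loop below finds the owner in OwnerSummary with the least $$ assigned and gives that owner the most valuable remaining account, then works through the rest of the accounts the same way, trying to even out each owner's share.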
#Balance Unassigned
for (i in seq_len(nrow(AssignableWestMinors))) {
  # owner with the smallest running total gets the next account
  idx <- which.min(OwnerSummary$TotalValue)
  OwnerSummary$TotalValue[idx] <- OwnerSummary$TotalValue[idx] + AssignableWestMinors$AccValue[i]
  OwnerSummary$Count[idx] <- OwnerSummary$Count[idx] + 1
  AssignableWestMinors$Owner[i] <- as.character(OwnerSummary$Owner[idx])
}
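The loop is a greedy heuristic: with the accounts sorted by descending value, each one goes to whichever owner currently has the lowest total. A minimal sketch on made-up numbers shows the mechanics; the owner totals and account values here are hypothetical.

```r
# Hypothetical starting totals per owner and unowned account values
totals   <- c(Chad = 100, Jimmy = 40, Steph = 70)
unowned  <- sort(c(10, 60, 30, 20), decreasing = TRUE)
assigned <- character(length(unowned))

for (i in seq_along(unowned)) {
  idx <- which.min(totals)               # owner with the smallest running total
  totals[idx] <- totals[idx] + unowned[i]
  assigned[i] <- names(totals)[idx]      # record who received account i
}

print(assigned)
print(totals)
```

Note that `which.min()` returns the first minimum on ties, so the order of owners in the summary decides who gets the account when two totals are equal.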
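Now we just bind the previously owned and the newly assigned accounts together, which gives us the finished, balanced West Minors territory.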
WestMinors.Final <- bind_rows(filter(New.WestMinors, !is.na(Owner)), AssignableWestMinors)
WM.Summary <- WestMinors.Final %>%
group_by(Region, League, Owner) %>%
summarise(Count = n(),
TotalValue = sum(AccValue))
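Everyone has a similar number of accounts, and the total $$ per owner is within a reasonable range.
Now I'm trying to do this for every grouping of the original 4 regions and 2 leagues: run it 8 times and then stitch the results back together. Each subgroup has a different $$ threshold and a different account count. How can I split the original account base into the 8 pieces, apply all of this to each piece, and then recombine them?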
Answer 0: (score: 2)
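You should take advantage of ?dplyr::do to perform the split-apply-combine operation you want over the Region-League subsets. First, wrap your logic in a function so that it can operate on a data frame, dta, that represents a subsetted version of the main data frame, AccDF.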
reAssign <- function(dta) {
other_acct <- dta %>%
filter(!is.na(Owner)) %>%
mutate(r = runif(n())) %>%
arrange(r) %>%
group_by(Owner) %>%
mutate(NewOwner = replace(Owner, cumsum(AccValue) > 600000 | row_number() > 14, NA)) %>%
ungroup(Owner) %>%
mutate(Owner = NewOwner) %>%
select(-r, -NewOwner)
assignable_acct <- other_acct %>%
filter(is.na(Owner)) %>%
bind_rows( filter(dta, is.na(Owner)) ) %>%
arrange(desc(AccValue))
acct_summary <- other_acct %>%
filter(!is.na(Owner)) %>%
group_by(Owner) %>%
summarise(Count = n(), TotalValue = sum(AccValue))
# I have a feeling there's a much better way of doing this, but oh well...
for (i in seq_len(nrow(assignable_acct))) {
idx <- which.min(acct_summary$TotalValue)
acct_summary$TotalValue[idx] <- acct_summary$TotalValue[idx] + assignable_acct$AccValue[i]
acct_summary$Count[idx] <- acct_summary$Count[idx] + 1
assignable_acct$Owner[i] <- as.character(acct_summary$Owner[idx])
}
final <- other_acct %>%
filter(!is.na(Owner)) %>%
bind_rows(assignable_acct)
return(final)
}
Then apply it to AccDF, grouped by Region and League.
new_master <- AccDF %>%
group_by(Region, League) %>%
do( reAssign(.) ) %>%
ungroup()
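The question mentions that each subgroup has its own $$ threshold and account count, while reAssign hard-codes 600000 and 14. One possible way to handle that, sketched here under assumptions, is to derive the caps from each group's own data. The helper `groupCaps`, the `demo` data set, and the rule "equal share of the group's value and count per rep" are all hypothetical choices for illustration, not something the question specifies.

```r
library(dplyr)

set.seed(1)
# Small hypothetical data set in the same shape as AccDF
demo <- data.frame(
  Region   = rep(c("West", "MidWest"), each = 50),
  League   = rep(c("Majors", "Minors"), times = 50),
  AccValue = sample(1000:100000, 100, replace = TRUE),
  Owner    = sample(c("Chad", "Jimmy", NA), 100, replace = TRUE),
  stringsAsFactors = FALSE
)

# Assumption: cap each owner at an equal share of the group's value and count
groupCaps <- function(dta) {
  n_reps <- n_distinct(dta$Owner, na.rm = TRUE)   # owners present in this group
  data.frame(ValueCap = sum(dta$AccValue) / n_reps,
             CountCap = ceiling(nrow(dta) / n_reps))
}

# One row of caps per Region-League group
caps <- demo %>%
  group_by(Region, League) %>%
  do(groupCaps(.)) %>%
  ungroup()

print(caps)
```

With something like this in place, the two literals inside reAssign could be swapped for values computed from dta in the same way, so that each group is capped relative to its own size.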
Check to make sure it did its job...
new_master %>%
group_by(Region, League, Owner) %>%
summarise(Count = n(),
TotalValue = sum(AccValue)) %>%
as.data.frame()
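To judge the balance at a glance rather than reading every row, the per-owner totals can be reduced to a spread per group. This is a minimal sketch on a hypothetical summary table; the numbers and the choice of a max/min ratio as the measure are assumptions made for illustration.

```r
library(dplyr)

# Hypothetical per-owner summary in the shape produced by the check above
owner_summary <- data.frame(
  Region     = "West",
  League     = "Minors",
  Owner      = c("Chad", "Jimmy", "Steph"),
  TotalValue = c(520000, 500000, 480000),
  stringsAsFactors = FALSE
)

# A ratio close to 1 means the owners in that group are closely balanced
spread <- owner_summary %>%
  group_by(Region, League) %>%
  summarise(MinValue = min(TotalValue),
            MaxValue = max(TotalValue),
            Ratio    = MaxValue / MinValue) %>%
  ungroup()

print(spread)
```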