I have a demand table that looks like this:
library(data.table)
set.seed(1)
DTd <- data.table(loc="L1", product="P1",
                  cust=c("C1","C2","C3"),
                  period=c("per1","per2","per3","per4"),
                  qty=runif(12,min=0,max=100),
                  key=c("loc","product","cust","period"))
DTd[]
# loc product cust period qty
#1: L1 P1 C1 per1 12.97134
#2: L1 P1 C1 per2 65.37663
#3: L1 P1 C1 per3 34.21633
#4: L1 P1 C1 per4 24.23550
#5: L1 P1 C2 per1 85.68853
#6: L1 P1 C2 per2 98.22407
#7: L1 P1 C2 per3 92.24086
#8: L1 P1 C2 per4 70.62672
#9: L1 P1 C3 per1 62.12432
#10: L1 P1 C3 per2 84.08788
#11: L1 P1 C3 per3 82.67184
#12: L1 P1 C3 per4 53.63538
The supply table looks like this:
DTs <- data.table(loc="L1", product="P1",
                  period=c("per1","per2","per3","per4"),
                  qty=runif(4,min=0,max=200),
                  key=c("loc","product","period"))
DTs[]
# loc product period qty
#1: L1 P1 per1 9.23293
#2: L1 P1 per2 74.03622
#3: L1 P1 per3 133.54770
#4: L1 P1 per4 123.43913
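I need to allocate the supply to the corresponding demand in priority order, and add an "alloc" column holding the allocated quantity to the demand table. For the purposes of this example, assume the priority goes to the smallest demand first.
This is the result I am looking for.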
#loc product cust period qty alloc
#1: L1 P1 C1 per1 12.97134 9.232930
#2: L1 P1 C1 per2 65.37663 65.376625
#3: L1 P1 C1 per3 34.21633 34.216329
#4: L1 P1 C1 per4 24.23550 24.235499
#5: L1 P1 C2 per1 85.68853 0.000000
#6: L1 P1 C2 per2 98.22407 0.000000
#7: L1 P1 C2 per3 92.24086 16.659531
#8: L1 P1 C2 per4 70.62672 45.568249
#9: L1 P1 C3 per1 62.12432 0.000000
#10: L1 P1 C3 per2 84.08788 8.659591
#11: L1 P1 C3 per3 82.67184 82.671841
#12: L1 P1 C3 per4 53.63538 53.635379
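I don't see a way to do this efficiently with data.table functionality. I seem to be reduced to looping through the rows and updating them one at a time with set. This is the code I am using for this case.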
#set key on demand to match supply and order by the qty (for prioritising)
setkey(DTd, loc, product, period, qty)
#add a column for the allocated quantity
DTd[, alloc := 0]
#loop through the rows of the supply, using the row number
for (s in DTs[, .I]) {
  key <- DTs[s, .(loc, product, period)]
  suppqty <- DTs[s, qty]
  #loop through the corresponding demand rows, by row number
  for (d in DTd[key, which=TRUE]) {
    if (suppqty == 0) break
    #allocate the demand on this row, capped at the remaining supply
    allocqty <- min(DTd[d, qty], suppqty)
    #update the alloc qty on this row
    set(DTd, i = d, j = "alloc", value = allocqty)
    #reduce the amount outstanding
    suppqty <- suppqty - allocqty
  }
}
#restore the original keys
setkey(DTd, loc, product, cust, period)
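Any suggestions for a better way to implement any part of this would be highly appreciated. (In practice the tables are very large and the prioritisation rules can be quite complex, but in that case I would run one pass first to determine the priority and then use it in the allocation pass.)
Answer 0 (score: 3)
You can do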
setnames(DTs, "qty", "suppqty")
setnames(DTd, "qty", "demqty")
setorder(DTd, loc, product, period, demqty) # put your priority column last here
DTd[DTs, alloc := {
  resid_supply = shift(pmax(suppqty - cumsum(demqty), 0), fill=suppqty[1L])
  pmin(demqty, resid_supply)
}, by=.EACHI, on=c("loc", "product", "period")]
The result is
loc product cust period demqty alloc
1: L1 P1 C2 per1 20.168193 20.168193
2: L1 P1 C1 per1 26.550866 26.550866
3: L1 P1 C3 per1 62.911404 62.911404
4: L1 P1 C1 per2 6.178627 6.178627
5: L1 P1 C2 per2 37.212390 37.212390
6: L1 P1 C3 per2 89.838968 33.429727
7: L1 P1 C2 per3 20.597457 20.597457
8: L1 P1 C3 per3 57.285336 57.285336
9: L1 P1 C1 per3 94.467527 76.085490
10: L1 P1 C3 per4 17.655675 17.655675
11: L1 P1 C2 per4 66.079779 66.079779
12: L1 P1 C1 per4 90.820779 15.804394
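To see why the expression inside the braces works, it may help to run it on plain vectors for a single (loc, product, period) group. The sketch below uses made-up numbers (a supply of 60 against demands of 10, 30 and 50, already sorted by priority) and base R only, with `c(suppqty, head(x, -1))` standing in for data.table's `shift(x, fill = suppqty)`.

```r
# Hypothetical numbers for one group, sorted smallest demand first.
demqty  <- c(10, 30, 50)
suppqty <- 60

# Supply left over after each demand row has been served in turn,
# floored at zero once the supply is exhausted.
resid_after <- pmax(suppqty - cumsum(demqty), 0)    # 50 20 0

# Supply available *before* each row: lag by one and start from the full supply.
# Base R stand-in for shift(resid_after, fill = suppqty).
resid_before <- c(suppqty, head(resid_after, -1))   # 60 50 20

# Each row receives its demand, capped at what is still available.
alloc <- pmin(demqty, resid_before)
alloc
# [1] 10 30 20
```

The first two demands are met in full and the third gets only the remaining 20, so the allocations add up to the supply of 60.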
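As one of the package authors, Arun, describes in this SO post, you generally no longer need to set keys before a merge:
Therefore in most cases, there shouldn't be a need to set keys anymore. We recommend using on= wherever possible, unless setting key has a dramatic improvement in performance that you'd like to exploit.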
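As a small illustration of that advice, here is a sketch of an update join that uses on= on two unkeyed tables. The table names and values (`demand`, `supply`) are made up for the example and are not the tables from the question.

```r
library(data.table)

# Toy tables with hypothetical values; neither has a key set.
demand <- data.table(period = c("per1", "per1", "per2"),
                     cust   = c("C1", "C2", "C1"),
                     demqty = c(5, 8, 3))
supply <- data.table(period  = c("per1", "per2"),
                     suppqty = c(10, 2))

# Update join: copy the matching supply quantity onto each demand row.
# The i. prefix refers to the column from the table in the i argument (supply).
demand[supply, suppqty := i.suppqty, on = "period"]

key(demand)   # still NULL: the join neither needed nor set a key
demand
```

The row order of `demand` is left untouched, which is convenient when the order already encodes the priority.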
For a similar calculation (buying at the lowest price first), you can see my other answer.