I have written a script in R that simulates the flow of stock into and out of a warehouse:
set.seed(10)
# Create data frame
df1 <- data.frame(date = seq(1, 20),
                  # Stock in to warehouse on date
                  # (10 values, recycled by data.frame() to fill 20 rows)
                  stockIn = round(10 + 10 * runif(10), 0),
                  # Stock out of warehouse on date (also recycled)
                  stockOut = round(10 + 10 * runif(10), 0))
# The initial inventory level of the warehouse on date 1
initBalance <- 20
# Create a column of NAs which holds the end of day stock level
df1$endStockBalance <- NA
# Loop through each day
for (i in 1:nrow(df1)) {
  if (i == 1) {
    # If it's the first day, put initBalance into endStockBalance
    df1[i, 4] <- initBalance
  } else {
    # For other days, take the maximum of the previous day's inventory plus
    # the difference between stock in and stock out, and 0
    # (we can't have negative stock levels)
    df1[i, 4] <- max(df1[i - 1, 4] + df1[i, 2] - df1[i, 3], 0)
  }
}
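This works with the for loop, but I am wondering whether there is a more elegant way of doing this via vectorisation: the loop is fine for smaller lists but will be slow for larger ones.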
I considered using lag from dplyr, but the step-wise nature of the script means it does not work.
Answer 0 (score: 3)
You can basically replace the loop with
cumsum(c(initBalance, df1$stockIn[-1] - df1$stockOut[-1]))
#[1] 20 17 20 21 18 16 18 18 20 16 14 11 14 15 12 10 12 12 14 10
which is the same as the endStockBalance you get after running the for loop:
identical(df1$endStockBalance,
cumsum(c(initBalance, df1$stockIn[-1] - df1$stockOut[-1])))
#[1] TRUE
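Why this works: when the floor at 0 never kicks in, each day's balance is the previous balance plus that day's net flow, so the whole series is just the initial balance followed by a running sum of net flows. A minimal sketch with made-up numbers (init, stock_in and stock_out are illustrative names and values, not the data from the question):

```r
# Illustrative values only; not the data from the question
init      <- 20
stock_in  <- c(0, 5, 3, 8)   # day 1 flows are ignored, as in the loop
stock_out <- c(0, 2, 7, 4)

# Loop version: each balance depends on the previous one
loop_bal <- numeric(length(stock_in))
loop_bal[1] <- init
for (i in 2:length(stock_in)) {
  loop_bal[i] <- loop_bal[i - 1] + stock_in[i] - stock_out[i]
}

# Vectorised version: initial balance, then net flows, summed cumulatively
vec_bal <- cumsum(c(init, stock_in[-1] - stock_out[-1]))

loop_bal  # 20 23 19 23
vec_bal   # 20 23 19 23
```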
If you want negative values replaced with 0, you can wrap the result in pmax (this floors the final values only):
pmax(cumsum(c(initBalance, df1$stockIn[-1] - df1$stockOut[-1])), 0)
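One caveat, as far as I can tell: pmax applied after cumsum only floors the reported values, whereas the original loop floors the balance at every step and carries the floored value forward. The two agree as long as the running balance never drops below 0 (which is the case for the data above), but they can differ otherwise. Below is a sketch using base R's Reduce with accumulate = TRUE to keep the step-by-step floor; start, net and the toy numbers are made up for illustration:

```r
# Toy numbers chosen so the balance would go negative; not from the question
start <- 5
net   <- c(-10, 3, 4)   # net flow (stock in minus stock out) for days 2..4

# Floor applied only at the end: the deficit is still carried forward
floored_end <- pmax(cumsum(c(start, net)), 0)
floored_end   # 5 0 0 2

# Floor applied at every step, like the original loop
floored_step <- Reduce(function(bal, d) max(bal + d, 0), net, start,
                       accumulate = TRUE)
floored_step  # 5 0 3 7
```

For the data in the question, the same call would take df1$stockIn[-1] - df1$stockOut[-1] as the net flows and initBalance as the starting value. Note that Reduce still iterates internally, so it is more compact than the for loop but not necessarily faster.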