Optimizing an R loop that passes a data.table to a recursive function

Time: 2018-05-29 20:37:47

Tags: r for-loop optimization data.table

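I am trying to run a simulation that performs n_tests runs over n_years years for n_products products, in order to estimate the increase in demand and the resulting increase in the number of pallets stored (assuming a linear relationship between demand and stored product). To make things more interesting, demand comes from 2 different regions (A and B), but the products are stored in a single warehouse.

What I have so far works, but it is slow: 10 years, 200 tests and 25,000 products take 10 seconds to run.

Setup: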

library(data.table)

n_products <- 25000
n_years <- 10
n_tests <- 200

pct_error <- 2
A_fcst <- runif(n_years, min = 1, max = 8)
B_fcst <- runif(n_years, min = 3, max = 6)

Populating the initial DT and matrices:

yearly_demand_A <- matrix(0, n_years, n_tests)
yearly_demand_B <- matrix(0, n_years, n_tests)

for (i in 1:n_years){
  yearly_demand_A[i,] <- rnorm(n_tests, A_fcst[i], pct_error*sqrt(i))
  yearly_demand_B[i,] <- rnorm(n_tests, B_fcst[i], pct_error*sqrt(i))
}

yearly_pallets <- matrix(0, n_years, n_tests)

demand_x_pallets <- data.table(prod_code = 1:n_products, stock_qty = as.integer(runif(n_products,1,100)), pallet_qty = as.integer(runif(n_products,10,30)), demand_A = runif(n_products,1,40), demand_B = runif(n_products,1,40))
demand_x_pallets[,pallets := ceiling(stock_qty/pallet_qty)]
demand_x_pallets[,demand := demand_A + demand_B]

for (i in 1:n_tests){
  yearly_pallets[1:n_years,i] <- number_of_pallets(yearly_demand_A[1:n_years,i], yearly_demand_B[1:n_years,i], demand_x_pallets)
}
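
As an aside, the row-by-row loop that fills yearly_demand_A and yearly_demand_B can probably be collapsed into a single rnorm call, because rnorm recycles its mean and sd arguments and matrix() fills column by column. The helper name simulate_yearly_demand below is hypothetical, and this is only a sketch: the random draws are taken in a different order than in the loop, so the same seed gives different (but identically distributed) numbers. This step is unlikely to be the bottleneck.

```r
# Hypothetical helper: one column per test, one row per year.
# mean and sd are recycled every n_years values, which lines up with
# the rows because matrix() fills column-major.
simulate_yearly_demand <- function(fcst, n_tests, pct_error) {
  n_years <- length(fcst)
  matrix(
    rnorm(n_years * n_tests,
          mean = fcst,
          sd   = pct_error * sqrt(seq_len(n_years))),
    nrow = n_years, ncol = n_tests
  )
}

# Usage with made-up inputs
demo_fcst   <- c(2, 4, 6)
demo_demand <- simulate_yearly_demand(demo_fcst, n_tests = 5, pct_error = 2)
dim(demo_demand)
```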

The function itself (it has to be defined before the loop above is run):

number_of_pallets <- function(fcst_A,fcst_B,d_x_p,year=0){
  pallets <- vector("double",n_years)
  new_profile <- copy(d_x_p)    #without a copy, the same DT is modified by reference and the pallet counts compound
  if (year == 0){               #if function called without year argument call it recursively
    for(i in 1:(n_years)){
      new_profile <- number_of_pallets(fcst_A[[i]],fcst_B[[i]],new_profile,i)
      pallets[i] <- new_profile[,sum(pallets)]
    }
  }
  else{                         #calculate demand and pallet count for each product each year
    d_x_p[,demand_A := demand_A * (100+fcst_A) / 100]
    d_x_p[,demand_B := demand_B * (100+fcst_B) / 100]
    d_x_p[,new_Dmnd := demand_A + demand_B]
    d_x_p[,Dmnd_change := ifelse(demand==0,1,new_Dmnd/demand)]
    d_x_p[,stock_qty := stock_qty * Dmnd_change]
    d_x_p[,pallets := ceiling(stock_qty/pallet_qty)]
    d_x_p[,demand := new_Dmnd]
    return(d_x_p)
  }
  return(pallets)
}

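Initially I thought that copying the DT might be what makes it slow, but removing that line from the function made no difference, other than making it stop working correctly. This is the best I have come up with after several miserable failures, and I am now completely stuck.

Any pointers on how to approach this differently would be greatly appreciated.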
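
One possible direction, offered as a sketch rather than a tested replacement: the per-year update looks like it telescopes. Each year demand_A and demand_B are scaled by a single factor per region, and stock_qty is scaled by new_Dmnd/demand, so after t years stock_qty should equal the starting stock times (demand after t years) / (starting demand). That would allow all years of one test to be computed at once with cumprod and outer, with no recursion and no copy. The function name pallets_vectorized and its argument layout are hypothetical. It assumes every product starts with strictly positive demand (true for the setup above), so the ifelse(demand==0, ...) guard never fires. Results could differ from the loop version by a pallet in rare cases where floating-point rounding lands exactly on a pallet boundary.

```r
# Sketch: pallet totals for every year of one test, without recursion.
# Assumption: demand_A + demand_B > 0 for every product.
pallets_vectorized <- function(fcst_A, fcst_B, stock_qty, pallet_qty,
                               demand_A, demand_B) {
  # Cumulative growth factor per region after each year
  growth_A <- cumprod(1 + fcst_A / 100)
  growth_B <- cumprod(1 + fcst_B / 100)

  demand_0 <- demand_A + demand_B
  # Fractional pallets per product per unit of regional growth factor
  w_A <- stock_qty * demand_A / (demand_0 * pallet_qty)
  w_B <- stock_qty * demand_B / (demand_0 * pallet_qty)

  # n_products x n_years matrix of fractional pallets, rounded up per product
  frac <- outer(w_A, growth_A) + outer(w_B, growth_B)
  colSums(ceiling(frac))
}

# Usage with made-up, smaller inputs
set.seed(42)
n_products <- 1000
n_years    <- 10
stock_qty  <- as.integer(runif(n_products, 1, 100))
pallet_qty <- as.integer(runif(n_products, 10, 30))
demand_A   <- runif(n_products, 1, 40)
demand_B   <- runif(n_products, 1, 40)
fcst_A     <- runif(n_years, 1, 8)
fcst_B     <- runif(n_years, 3, 6)

pallets_by_year <- pallets_vectorized(fcst_A, fcst_B, stock_qty,
                                      pallet_qty, demand_A, demand_B)
```

In the original setup this would be called once per test, passing yearly_demand_A[, i] and yearly_demand_B[, i] together with the stock_qty, pallet_qty, demand_A and demand_B columns of demand_x_pallets, and writing the result into yearly_pallets[, i]. I would expect this to be faster than the recursive version, but I have not benchmarked it.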

0 answers:

No answers