R - parApply not able to evaluate function/find arguments with optim or optimx

Asked: 2013-08-06 18:35:22

Tags: r parallel-processing mathematical-optimization

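I am trying to maximize a function that I wrote, once for each row of a data frame. It works fine when I use apply, but it fails when I switch to parApply. I have not figured out why.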

Here is the function (it estimates a semivariance):

VB04 <- function(x) {
  # the argument is a vector of 2 parameters, the first is x, the second lambda

  #### I first define the function little f
  ff <- function(z) {
    ifelse (z <= x[2]*strike, return(x[1]), return(x[1]*(strike - z)/(strike*(1-x[2]))))
  }
  #### I then estimate the expected payoff of the contract
  require(pracma)
  require(np)
  profit <- quadgk(function(y) {
    # estimate the density
    # Here since I have estimated the weather index overall, I will look at the entire distribution
    density.pt <- npudens(bws = npudensbw(dat = as.data.frame(data.stations$weather_ind[
                            which(data.stations$census_fips == census & data.stations$year < year_ext)]),
                                          ckertype="epanechnikov", ckerorder=4),
                          tdat = as.data.frame(data.stations$weather_ind[
                            which(data.stations$year < year_ext)]),
                          edat = as.data.frame(y))
    # return the value of the expected profit
    return(ff(y)*density.pt$dens)
  }, a = 0, b = strike)

  ## I now create a function that estimates the max
  # I do this county by county, to get the best contract in each case.
  # Only the density is estimated in common.
  # first element of the max argument
  max.arg <- sapply(-data.stations$yield[which(data.stations$census_fips == census
                                               & data.stations$year < year_ext)] -
                      sapply(data.stations$weather_ind[which(data.stations$census_fips == census
                                                             & data.stations$year < year_ext)], ff),
                    function(x) x + yield_avg + profit[[1]])
  # add a second column of zeroes
  max.arg <- cbind(max.arg, 0)
  # Take the max
  max.arg <- apply(max.arg, 1, max)
  # Return the final value, the sum of squares
  return(sum(max.arg^2))
}

I want to apply it to each row of a data frame. Here are the first rows:

test[1:10,]
   census_fips yield_avg   strike
1        17143  161.8571 161.8571
2        17201  139.4286 139.4286
3        18003  147.4857 147.4857
4        18103  150.1571 150.1571
5        18105  137.8000 137.8000
6        18157  157.8714 157.8714
7        18163  149.5857 149.5857
8        19013  168.4286 168.4286
9        19033  163.9286 163.9286
10       19045  161.2286 161.2286

The optimization inside parApply looks like this:

library(foreach)
library(doParallel)
cl <- makeCluster(3) # My computer has 4 cores
registerDoParallel(cl)

clusterExport(cl=cl, varlist=c("VB04"))

tempres <- parApply(cl=cl, X=test, MARGIN=1, FUN=function(x) {
  strike <- x[3] #prepare the parameters
  yield_avg <- x[2]
  census <- x[1]
  require(optimx)
  minopt <- optimx(par=c(1,0.5), fn = VB04, lower=c(0,0), 
               upper=c(Inf,1), method="L-BFGS-B")
  return(cbind(minopt$fvalues[[1]],minopt$par[[1]]))
})

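With optimx I get the error: "Cannot evaluate function at initial parameters". When I run the optimization on any single row by itself, it works fine. It also works with apply. When I try optim instead of optimx, I get a different error: an "object not found" error.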

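I would really appreciate any help. I am not sure whether the problem is that the arguments are not being passed along (even though they are defined inside the parApply call) or something else. I have not been able to find a way to fix it.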

Thanks,

Edit: I forgot to include the code that creates the cluster and exports the function to it. I have added it above.

1 Answer:

Answer 0: (score: 1)

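One problem with your code is that variables such as "strike", "yield_avg" and "census" are not in the scope of VB04, because they are local variables of the worker function. R looks up a function's free variables in the environment where the function was defined, not in the function that calls it, so a VB04 defined at top level and exported to the workers never sees them. You could fix that by defining VB04 inside the worker function. That solves the scoping problem, and you also no longer have to export VB04.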

Here is a ridiculously stripped-down version of your code to demonstrate:

library(parallel)
cl <- makePSOCKcluster(3)
test <- matrix(1:4, 2)
tempres <- parApply(cl, test, 1, function(x) {
  VB04 <- function() {
    strike * yield_avg
  }
  strike <- x[1]
  yield_avg <- x[2]
  VB04()
})
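
The scoping rule behind this fix can also be seen without a cluster. The following is only a minimal sketch: the names outer_fn, worker and worker_nested are hypothetical, and it assumes no global variable called strike exists in the session.

```r
# Defined at top level: its free variable `strike` is looked up in the
# global environment, not in whichever function happens to call it.
outer_fn <- function() strike * 2

worker <- function(x) {
  strike <- x   # local to worker(), so outer_fn() cannot see it
  outer_fn()
}

# Defined inside the worker: `strike` is found in the worker's own frame.
worker_nested <- function(x) {
  inner_fn <- function() strike * 2
  strike <- x
  inner_fn()
}

# Assumes there is no global `strike`; otherwise worker() would silently use it.
failed <- tryCatch(worker(3), error = function(e) conditionMessage(e))
ok <- worker_nested(3)

print(failed)  # an error message mentioning 'strike'
print(ok)
```

The first call fails for the same reason the exported VB04 fails on the workers, while the nested definition succeeds.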

optim and optimx provide a way to pass additional arguments to the function, which may be a better solution.
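
As a rough sketch of that second approach, the example below passes the per-row values through the `...` argument of optim, which forwards them to the objective. The objective toy_objective and the matrix rows are made up for illustration and merely stand in for VB04 and the real data; in the real code, VB04 would need strike, yield_avg and census added as explicit arguments.

```r
library(parallel)

# Hypothetical stand-in for VB04: everything it needs arrives as an argument,
# so it does not depend on variables from the calling function.
toy_objective <- function(p, strike, yield_avg) {
  (p[1] - strike / 100)^2 + (p[2] - yield_avg / 1000)^2
}

# Made-up input: one row per optimization problem.
rows <- cbind(strike = c(150, 120), yield_avg = c(250, 400))

cl <- makePSOCKcluster(2)
res <- parApply(cl, rows, 1, function(x, objective) {
  # Named arguments that optim does not recognise are forwarded to `fn`.
  fit <- optim(par = c(1, 0.5), fn = objective,
               strike = x[[1]], yield_avg = x[[2]],
               method = "L-BFGS-B", lower = c(0, 0), upper = c(Inf, 1))
  c(value = fit$value, fit$par)
}, objective = toy_objective)  # extra parApply arguments are passed to the worker function
stopCluster(cl)

print(res)  # one column per input row: objective value, then the two parameters
```

Because the objective is handed to the worker function as an argument, nothing has to be exported with clusterExport in this sketch. Any data the real objective reads from the global environment would still have to be made available on the workers.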