Error when fitting a simulation model to data with optim in R

Time: 2015-04-25 23:37:23

Tags: r optimization

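Good afternoon: I have 30 years of population data for guanacos and a matrix simulation model. I want to estimate the best values for 12 of the model parameters in R using optim, by minimizing the residual sum of squares (Sum(obs-pred)^2). I wrote a function containing the model. The model works fine: if I call the function with fixed parameter values, I get the expected results. When I call optim with a set of initial parameters and the function, I get the following error message: "Argument a13 is absent, with no default value; In addition: Lost warning messages: In if (SSQ == 0) { : the condition has length > 1 and only the first element will be used. Called from: top level" (freely translated from Spanish).

Please find the R code below in three sections: (1) the declaration of the function "guafit", (2) a standalone run of the model that calls "guafit", and (3) the call to "optim" with its starting parameter values.

Thank you, Jorge Rabinovich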

# Section (1):
# Clean all
    rm(list=ls())

#####################################################################
####################### Function "guafit"  ##########################
#####################################################################

    guafit <- function(SSQ,a13,a21,a32,a33,ad1,ad2,ad3,ad4,bd1,bd2,bd3,bd4) {

# Create the field population (a 30-years time series)
# ====================================================
    tfield <- c(12334,10670,19078,11219,11771,12323,13027,14094,14604,17775,20774,16410,17626,21445,21111,20777,28978,27809,28935,38841,38363,32273,43128,58597,52456,33125,61334,60488,44773,56973)

# Assign values to the vector of the initial population
# =====================================================
    G <- matrix(c(0,0,0),ncol=3, nrow=1)
    G[1]= 1542
    G[2]= 740
    G[3]= 3885

# Load the matrices with their initial values for all 30 time units (years)
# =========================================================================
    if (SSQ == 0) {
    a<-array(0,dim=c(3,3,30))
    for(i in 1:29) {
    a[1,3,i]= a13
    a[2,1,i]= a21
    a[3,2,i]= a32
    a[3,3,i]= a33
            }
        }
# Initialize some  variables
# ==========================
    tmod<-array(1,dim=c(1,30)); tmod <- as.numeric(tmod)
    densprop<-array(1,dim=c(1,30)); densprop <- as.numeric(densprop)
    FdeltaFe<-array(1,dim=c(1,30)); FdeltaFe <- as.numeric(FdeltaFe)
    FdeltaSc<-array(1,dim=c(1,30)); FdeltaSc <- as.numeric(FdeltaSc)
    FdeltaSj<-array(1,dim=c(1,30)); FdeltaSj <- as.numeric(FdeltaSj)
    FdeltaSa<-array(1,dim=c(1,30)); FdeltaSa <- as.numeric(FdeltaSa)

# N0 is the initial population vector
# It is multiplied by 2 to represent both sexes
# ===============================================
# Transfer guanacos (G) as a vector with the three age classes
    N0 <- G
    tmod[1] <- (N0[1]+N0[2]+N0[3]) * 2

# Declaration of the initial simulation conditions
# ================================================
# ng is the number of female individuals per age class (dim 3x30)
# tmod is the total (both sexes) population (sum of the three age classes * 2)
    ng <- matrix( 0, nrow= 3, ncol=30)
    ng[,1] <- N0

# We assume a constant carrying capacity (K= 60000 individuals)
    carcap= 60000

# Start simulation for 30 years
    for(i in 1:29) {
#Set up the density-dependent coefficients

    densprop[i] <- tmod[i] / carcap

# Calculate the density-dependent factors
    FdeltaFe[i]= 1/(1+exp((densprop[i]-ad1)*bd1))
    FdeltaSc[i]= 1/(1+exp((densprop[i]-ad2)*bd2))
    FdeltaSj[i]= 1/(1+exp((densprop[i]-ad3)*bd3))
    FdeltaSa[i]= 1/(1+exp((densprop[i]-ad4)*bd4))

# Apply the density-dependent factors to each coefficient in its own age class

    a[1,3,i]= a[1,3,i] * FdeltaFe[i]
    a[2,1,i]= a[2,1,i] * FdeltaSc[i]
    a[3,2,i]= a[3,2,i] * FdeltaSj[i]
    a[3,3,i]= a[3,3,i] * FdeltaSa[i]

# Project the total population with the matrix operation

    ng[,i+1] <- a[,,i]%*%ng[,i]
    tmod[i+1] <- (ng[1,i+1] + ng[2,i+1] + ng[3,i+1]) * 2

# End of the 30-years simulation loop
            }

# Calculate the residual sum of squares (SSQ)       
            SSQ = sum((tmod - tfield)^2)
               return(list(outm=tmod, outc=tfield, elSSQ=SSQ, matrices=a,     losgua=G, losguaxe=ng))

# End of function guafit
        }
#################################################################################

# Section (2):

# Initialize conditions and parameters before calling function guafit

    SSQ <- 0

# Initialize the 8 density-dependent coefficients (2 coefficients per age class)
# ==============================================================================
    ad1= 1.195680167
    ad2= 1.127219245
    ad3= 1.113739384
    ad4= 1.320456815
    bd1= 10.21559509
    bd2= 9.80201883
    bd3= 9.760834107
    bd4= 10.59390027

# Assign initial values to the transition matrix coefficients
# ============================================================
    a21= 0.6
    a32=0.8
    a33=0.9
    a13=0.37

# Initialization of conditions is finished, we can  call function guafit
# As a test, we call function guafit only once to check for results
    myresults <- guafit(SSQ,a13,a21,a32,a33,ad1, ad2, ad3, ad4, bd1, bd2, bd3, bd4)
# We save the results of interest of function guafit with new variables 
    restmod <- myresults$outm
    tfield <- myresults$outc
    SSQvalue <- myresults$elSSQ
    lasmatrices <- myresults$matrices
    reslosgua <- myresults$losgua
    reslosguaxe <- myresults$losguaxe
    SSQvalue
# Graph the results
    axisx <- c(1982:2011)
    plot(axisx,tfield)
    lines(axisx,restmod)

#################################################################################

# Section (3):

# I try the optim function 

# First creating the initial parameter values variable to pass as an argument

    startparam <- c(SSQ, a13,a21,a32,a33,ad1, ad2, ad3, ad4, bd1, bd2, bd3, bd4)
    optim(startparam, guafit)
# and I got error message mentioned above.

# I also tried calling as:
    optim(par=startparam, fn=guafit) 
# and I got the same error message

# I also tried calling optim but passing the values directly as a list of values:
    startparam <- c(SSQ=0, a13=0.37, a21=0.6, a32=0.8, a33=0.9, ad1=1.1, ad2=1.1, ad3=1.1, ad4=1.1, bd1=10, bd2=10, bd3=10, bd4=10)
    optim(startparam, guafit)
    optim(par=startparam, fn=guafit) 

# and I got the same error message

2 Answers:

Answer 0 (score: 0)

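I got the code to run, but I am not sure it returns the correct parameter estimates. I tried two different approaches.

With the first approach, I did the following:

  1. Moved all data and initialization outside the function.
  2. Removed the constants from the optim statement.
  3. Assumed ad1, ad2, ad3 and ad4 are constants rather than parameters to be estimated. I guess you could say I treated ad1, ad2, ad3 and ad4 as fixed offsets.

The first approach returned estimates of bd1, bd2, bd3 and bd4 that closely match your initial values.

With the second approach, I did the following:

  1. Moved all data and initialization outside the function.
  2. Removed the constants from the optim statement.
  3. Assumed ad1, ad2, ad3 and ad4 are parameters to be estimated.

The second approach returned estimates of all eight parameters (ad1, ad2, ad3, ad4, bd1, bd2, bd3 and bd4), but I do not think the estimates are correct. Note that two columns of the Hessian are all 0. SEs could not be estimated with the second approach, and the estimates of bd1, bd2, bd3 and bd4 do not match the initial values well.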
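For context, optim calls fn with the whole parameter vector as its first argument and expects a single number back. Here is a minimal sketch of that calling convention on a made-up straight-line dataset; xdat, ydat and ssq are illustrative names and are not part of the guanaco model.

```r
# Made-up data for illustration only: a noisy straight line
set.seed(1)
xdat <- 1:20
ydat <- 3 + 0.5 * xdat + rnorm(20, sd = 0.2)

# fn receives ONE numeric vector; fixed inputs travel through optim's `...`
ssq <- function(par, xdat, ydat) {
  intercept <- par[1]            # unpack the parameters by position
  slope     <- par[2]
  pred <- intercept + slope * xdat
  sum((ydat - pred)^2)           # a single number, as optim requires
}

fit <- optim(par = c(0, 0), fn = ssq, xdat = xdat, ydat = ydat,
             method = "BFGS")
fit$par
```

With the original thirteen-argument guafit, the whole startparam vector arrives as SSQ and a13 is left unset, which is consistent with both messages quoted in the question.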

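The R code for both approaches is below. Check the code to see whether either approach does what you want. If treating ad1, ad2, ad3 and ad4 as parameters is essential, the model code may need to be modified.

Here is the R code for the first approach: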

      set.seed(2345)
      
      guafit <- function(param) {
      
                bd1 <- param[1]
                bd2 <- param[2]
                bd3 <- param[3]
                bd4 <- param[4]
      
      # Start simulation for 30 years
          for(i in 1:29) {
      
      # Set up the density-dependent coefficients
          densprop[i] <- tmod[i] / carcap
      
      # Calculate the density-dependent factors
          FdeltaFe[i]= 1/(1+exp((densprop[i]-ad1)*bd1))
          FdeltaSc[i]= 1/(1+exp((densprop[i]-ad2)*bd2))
          FdeltaSj[i]= 1/(1+exp((densprop[i]-ad3)*bd3))
          FdeltaSa[i]= 1/(1+exp((densprop[i]-ad4)*bd4))
      
      # Apply density-dependent factors to each coefficient in its own age class
          a[1,3,i]= a[1,3,i] * FdeltaFe[i]
          a[2,1,i]= a[2,1,i] * FdeltaSc[i]
          a[3,2,i]= a[3,2,i] * FdeltaSj[i]
          a[3,3,i]= a[3,3,i] * FdeltaSa[i]
      
      # Project the total population with the matrix operation
          ng[,i+1]  <- a[,,i]%*%ng[,i]
          tmod[i+1] <- (ng[1,i+1] + ng[2,i+1] + ng[3,i+1]) * 2
      
      # End of the 30-years simulation loop
                  }
      
      # Calculate the residual sum of squares (SSQ)       
                  SSQ = sum((tmod - tfield)^2)
      
      # End of function guafit
              }
      
      # Create the field population (a 30-years time series)
      # ====================================================
          tfield <- c(12334,10670,19078,11219,11771,12323,13027,14094,14604,17775,
                      20774,16410,17626,21445,21111,20777,28978,27809,28935,38841,
                      38363,32273,43128,58597,52456,33125,61334,60488,44773,56973)
      
      # Initialize conditions and parameters before calling function guafit
          SSQ <- 0
      
      # Assign initial values to the transition matrix coefficients
      # ============================================================
          a21 = 0.6
          a32 = 0.8
          a33 = 0.9
          a13 = 0.37
      
      # Assign values to the vector of the initial population
      # =====================================================
          G <- matrix(c(0,0,0),ncol=3, nrow=1)
          G[1] = 1542
          G[2] = 740
          G[3] = 3885
      
      # Load the matrices with their initial values for all 30 time units (years)
      # =========================================================================
      
          a <- array(0,dim=c(3,3,30))
      
          for(i in 1:29) {
          a[1,3,i]= a13
          a[2,1,i]= a21
          a[3,2,i]= a32
          a[3,3,i]= a33
                  }
      
      # Initialize some  variables
      # ==========================
          tmod<-array(1,dim=c(1,30)); tmod <- as.numeric(tmod)
      
          densprop <- array(1,dim=c(1,30)) 
          densprop <- as.numeric(densprop)
      
          FdeltaFe <- array(1,dim=c(1,30)) 
          FdeltaFe <- as.numeric(FdeltaFe)
      
          FdeltaSc <- array(1,dim=c(1,30))
          FdeltaSc <- as.numeric(FdeltaSc)
      
          FdeltaSj <- array(1,dim=c(1,30))
          FdeltaSj <- as.numeric(FdeltaSj)
      
          FdeltaSa <- array(1,dim=c(1,30))
          FdeltaSa <- as.numeric(FdeltaSa)
      
          ng <- matrix( 0, nrow= 3, ncol=30)
      
      # We assume a constant carrying capacity (K= 60000 individuals)
          carcap= 60000
      
      # N0 is the initial population vector
      # It is multiplied by 2 to represent both sexes
      # ===============================================
      # Transfer guanacos (G) as a vector with the three age classes
      
          N0 <- G
      
          tmod[1] <- (N0[1]+N0[2]+N0[3]) * 2
      
      # Declaration of the initial simulation conditions
      # ================================================
      # ng is the number of female individuals per age class (dim 3x30)
      # tmod is total (both sexes) population (sum of the three age classes * 2)
      
          ng[,1] <- N0
      
      
      ad1 <- 1.195680167
      ad2 <- 1.127219245
      ad3 <- 1.113739384
      ad4 <- 1.320456815
      
       fit <- optim ( c(10.21559509,
                         9.80201883,
                         9.760834107,
                        10.59390027), fn=guafit, method='BFGS', hessian=TRUE)
       fit
      

Here is the output of the first approach:

      $par
      [1] 11.315899 11.886347 11.912239  9.675885
      
      $value
      [1] 1177381086
      
      $counts
      function gradient 
           150       17 
      
      $convergence
      [1] 0
      
      $message
      NULL
      
      $hessian
               [,1]      [,2]      [,3]      [,4]
      [1,] 417100.8  306371.2  326483.2  941186.8
      [2,] 306371.2  516636.4  370602.5 1061540.5
      [3,] 326483.2  370602.5  577929.7 1135695.9
      [4,] 941186.8 1061540.5 1135695.9 3720506.4
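
The answer does not say how its standard errors were obtained. One common recipe for a least-squares objective is to invert the Hessian that optim returns and scale it by the residual variance. The sketch below shows this on a made-up straight-line fit; xdat, ydat and ssq are illustrative names, and the scaling assumes independent errors with constant variance.

```r
# Made-up data for illustration only
set.seed(42)
xdat <- seq(0, 10, length.out = 50)
ydat <- 2 + 1.5 * xdat + rnorm(50, sd = 1)

# Residual sum of squares for a straight line
ssq <- function(par) sum((ydat - (par[1] + par[2] * xdat))^2)

fit <- optim(c(0, 0), ssq, method = "BFGS", hessian = TRUE)

n <- length(ydat)
p <- length(fit$par)
sigma2 <- fit$value / (n - p)                 # residual variance estimate

# The Hessian of an SSQ is about 2 * t(J) %*% J, hence the factor 2
vcov_hat <- 2 * sigma2 * solve(fit$hessian)
se <- sqrt(diag(vcov_hat))
cbind(estimate = fit$par, se = se)
```

A Hessian with all-zero columns is singular, so solve() fails on it; that would explain why no SEs were available for the eight-parameter fit.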
      

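Here are the four parameter estimates and their standard errors:

                 mle          se
      [1,] 11.315899 0.002439888
      [2,] 11.886347 0.002240669
      [3,] 11.912239 0.002153876
      [4,]  9.675885 0.001042035

Here is the R code for the second approach, which estimates eight parameters. SEs could not be estimated with the second approach: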

      set.seed(2345)
      
      guafit <- function(param) {
      
                ad1 <- param[1]
                ad2 <- param[2]
                ad3 <- param[3]
                ad4 <- param[4]
                bd1 <- param[5]
                bd2 <- param[6]
                bd3 <- param[7]
                bd4 <- param[8]
      
      # Start simulation for 30 years
          for(i in 1:29) {
      
      # Set up the density-dependent coefficients
          densprop[i] <- tmod[i] / carcap
      
      # Calculate the density-dependent factors
          FdeltaFe[i]= 1/(1+exp((densprop[i]-ad1)*bd1))
          FdeltaSc[i]= 1/(1+exp((densprop[i]-ad2)*bd2))
          FdeltaSj[i]= 1/(1+exp((densprop[i]-ad3)*bd3))
          FdeltaSa[i]= 1/(1+exp((densprop[i]-ad4)*bd4))
      
      # Apply the density-dependent factors to each coefficient in its own age class
          a[1,3,i]= a[1,3,i] * FdeltaFe[i]
          a[2,1,i]= a[2,1,i] * FdeltaSc[i]
          a[3,2,i]= a[3,2,i] * FdeltaSj[i]
          a[3,3,i]= a[3,3,i] * FdeltaSa[i]
      
      # Project the total population with the matrix operation
      
          ng[,i+1]  <- a[,,i]%*%ng[,i]
          tmod[i+1] <- (ng[1,i+1] + ng[2,i+1] + ng[3,i+1]) * 2
      
      # End of the 30-years simulation loop
                  }
      
      # Calculate the residual sum of squares (SSQ)       
                  SSQ = sum((tmod - tfield)^2)
      
      # End of function guafit
              }
      
      # Create the field population (a 30-years time series)
      # ====================================================
          tfield <- c(12334,10670,19078,11219,11771,12323,13027,14094,14604,17775,
                      20774,16410,17626,21445,21111,20777,28978,27809,28935,38841,
                      38363,32273,43128,58597,52456,33125,61334,60488,44773,56973)
      
      # Initialize conditions and parameters before calling function guafit
          SSQ <- 0
      
      # Assign initial values to the transition matrix coefficients
      # ============================================================
          a21 = 0.6
          a32 = 0.8
          a33 = 0.9
          a13 = 0.37
      
      # Assign values to the vector of the initial population
      # =====================================================
          G <- matrix(c(0,0,0),ncol=3, nrow=1)
          G[1] = 1542
          G[2] = 740
          G[3] = 3885
      
      # Load the matrices with their initial values for all 30 time units (years)
      # =========================================================================
      
          a <- array(0,dim=c(3,3,30))
      
          for(i in 1:29) {
          a[1,3,i]= a13
          a[2,1,i]= a21
          a[3,2,i]= a32
          a[3,3,i]= a33
                  }
      
      # Initialize some  variables
      # ==========================
          tmod<-array(1,dim=c(1,30)); tmod <- as.numeric(tmod)
      
          densprop <- array(1,dim=c(1,30)) 
          densprop <- as.numeric(densprop)
      
          FdeltaFe <- array(1,dim=c(1,30)) 
          FdeltaFe <- as.numeric(FdeltaFe)
      
          FdeltaSc <- array(1,dim=c(1,30))
          FdeltaSc <- as.numeric(FdeltaSc)
      
          FdeltaSj <- array(1,dim=c(1,30))
          FdeltaSj <- as.numeric(FdeltaSj)
      
          FdeltaSa <- array(1,dim=c(1,30))
          FdeltaSa <- as.numeric(FdeltaSa)
      
          ng <- matrix( 0, nrow= 3, ncol=30)
      
      # We assume a constant carrying capacity (K= 60000 individuals)
          carcap= 60000
      
      # N0 is the initial population vector
      # It is multiplied by 2 to represent both sexes
      # ===============================================
      # Transfer guanacos (G) as a vector with the three age classes
      
          N0 <- G
      
          tmod[1] <- (N0[1]+N0[2]+N0[3]) * 2
      
      # Declaration of the initial simulation conditions
      # ================================================
      # ng is the number of female individuals per age class (dim 3x30)
      # tmod is the total (both sexes) population (sum of the three age classes * 2)
      
          ng[,1] <- N0
      
       fit <- optim ( c( 1.195680167,
                         1.127219245,
                         1.113739384,
                         1.320456815,
                        10.21559509,
                         9.80201883,
                         9.760834107,
                        10.59390027), fn=guafit, method='BFGS', hessian=TRUE)
      
       fit
      

Answer 1 (score: 0)

Monday, April 27, 2015

Dear Mark:

Many thanks for the work and thought you put into my problem.

Some comments and additions:

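(1) All 8 parameters ad1, ad2, ad3, ad4, bd1, bd2, bd3 and bd4 must be estimated (as you did in your second approach).

(2) In addition, the 30-period 3x3 matrix [a(3x3x30)] has four elements (a13, a21, a32 and a33) that must also be estimated by the optimization.

(3) That is why I passed the matrix a[,,i] as a parameter; perhaps the problem went away when you removed a[,,i] as an argument in the call to optim.

(4) Maybe this is where my original problem started; I remember reading somewhere that optim has trouble with parameters that have dimensions. Do you think my error message is related to having a[,,i] in the parameter list?

(5) I know I am being a bit ambitious: I am asking optim to minimize my SSQ (the residual sum of squares between the field and model population vectors) by varying a total of 128 parameters (4x30 = 120 for the a[,,i] matrices, plus the eight additional parameters mentioned in (1)).

(6) If the problem is caused by a[,,i] in the parameter list, do you think it would go away if I replaced the matrix a[,,i] in the parameter list with 120 individual values when calling optim?

(7) Another important question: I need the model population vector (tmod) as an output variable from optim once the fit is complete. I inserted the command "return(restmod = tmod)" into your script (approach #1) just before the SSQ is calculated, and I get an error message saying "objective function in optim evaluates to length 30 not 1". Did I make a mistake in requesting this output?

(8) Finally, one more very important question: I forgot to mention that my parameters must be constrained. The constraints are very simple: all non-zero coefficients of the a[,,i] matrices (i.e., a[1,3,i] = a13, a[2,1,i] = a21, a[3,2,i] = a32, a[3,3,i] = a33) must be <= 1, except a13, which must be <= 0.5 (the other 8 parameters are unconstrained). How do you pass the necessary constraints to optim?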

Again, many thanks,

Jorge
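
On points (7) and (8), here is a minimal sketch on a made-up two-parameter growth model, not the guanaco matrices. It shows how box constraints can be passed to optim through method = "L-BFGS-B" with lower and upper, and how the fitted trajectory can be recovered by re-running the simulation at fit$par rather than returning it from the objective. simulate_total, objective and all parameter values are hypothetical.

```r
# Hypothetical toy model: growth damped by a logistic density factor
simulate_total <- function(par, n0 = 100, carcap = 1000, years = 20) {
  growth <- par[1]   # bounded parameter (plays the role of a13, a21, ...)
  steep  <- par[2]   # unbounded parameter (plays the role of bd1..bd4)
  total <- numeric(years)
  total[1] <- n0
  for (i in seq_len(years - 1)) {
    damp <- 1 / (1 + exp((total[i] / carcap - 1) * steep))
    total[i + 1] <- total[i] * (1 + growth * damp)
  }
  total
}

# Synthetic "observations" generated from known parameter values
set.seed(7)
observed <- simulate_total(c(0.4, 5)) * exp(rnorm(20, sd = 0.05))

# The objective returns ONE number; it must not return the trajectory
objective <- function(par) sum((simulate_total(par) - observed)^2)

# Box constraints: 0 <= growth <= 0.5, steep left unconstrained
fit <- optim(par = c(0.3, 4), fn = objective, method = "L-BFGS-B",
             lower = c(0, -Inf), upper = c(0.5, Inf))

# Model population vector at the optimum, computed after the fit
fitted_total <- simulate_total(fit$par)
```

The same pattern extends to longer parameter vectors: lower and upper take one entry per element of par, with -Inf or Inf for the unconstrained ones.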