Writing my own MLE command in R is causing issues

Asked: 2013-06-18 23:28:01

Tags: r optimization max stata

I am just getting into writing MLE commands in R that work and look similar to native R functions. In this attempt I am trying to do a simple MLE with

y = b0 + x * b1 + u

u~N(0,sd = s0 + z * s1)
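For reference, the log-likelihood implied by this model can be written out as below. This is only a sketch of the standard normal log-likelihood: the observation index i and the symbol sigma_i are notation introduced here, not part of the original post.

```latex
\ell(b_0, b_1, s_0, s_1)
  = \sum_{i=1}^{n} \left[ -\tfrac{1}{2}\log(2\pi) - \log \sigma_i
      - \frac{(y_i - b_0 - b_1 x_i)^2}{2\sigma_i^2} \right],
\qquad \sigma_i = s_0 + s_1 z_i > 0 .
```

The requirement that every sigma_i be positive is what makes the optimization fragile: any parameter guess with s_0 + s_1 z_i <= 0 for some i leaves the likelihood undefined.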

However, I am having difficulty coding even such a simple command. I have written a similar command in Stata in a handful of lines.

Here is the code I have written so far in R.

  normalreg <- function (beta, sigma=NULL, data, beta0=NULL, sigma0=NULL,
                         con1 = T, con2 = T) {

    # If a formula for sigma is not specified 
    #  assume it is the same as the formula for the beta.
    if (is.null(sigma)) sigma=beta

    # Grab the call expression
    mf <- match.call(expand.dots = FALSE)

    # Find the position of each argument
    m <- match(c("beta", "sigma", "data", "subset", "weights", "na.action", 
                 "offset"), names(mf), 0L)

    # Adjust names of mf
    mf <- mf[c(1L, m)]

    # Since I have two formulas I will call them both formula
    names(mf)[2:3] <- "formula"

    # Drop unused levels
    mf$drop.unused.levels <- TRUE

    # Split mf into one model.frame call per formula
    data1 <- mf[-3]  # keep the beta formula, drop the sigma formula
    data2 <- mf[-2]  # keep the sigma formula, drop the beta formula

    # Replace the function being called with model.frame in both calls
    data1[[1L]] <- data2[[1L]] <- as.name("model.frame")

    data1 <- as.matrix(eval(data1, parent.frame()))
    data2 <- as.matrix(eval(data2, parent.frame()))

    y     <- data1[,1]
    data1 <- data1[,-1]
    if (con1) data1 <- cbind(data1,1)  # constant is appended as the last column
    data2 <- unlist(data2[,-1])
    if (con2) data2 <- cbind(data2,1)  # constant is appended as the last column

    data1 <- as.matrix(data1) # Ensure our data is read as matrix
    data2 <- as.matrix(data2) # Ensure our data is read as matrix

    if (!is.null(beta0)) if (length(beta0)!=ncol(data1))
      stop("Length of beta0 must equal the number of independent variables (plus the constant, if any) in the first equation")

    if (!is.null(sigma0)) if (length(sigma0)!=ncol(data2)) 
      stop("Length of sigma0 must equal the number of independent variables (plus the constant, if any) in the second equation")

    # Set initial parameter estimates
    if (is.null(beta0))  beta0   <- rep(1, ncol(data1))
    if (is.null(sigma0)) sigma0 <- rep(1, ncol(data2))

    # Define the objective: the negative log-likelihood, which optim() minimizes
    normMLE <- function(est, data1, data2, y) {
      # Split the parameter vector into mean and sd coefficients (column vectors)
      data1est <- matrix(est[1:ncol(data1)], ncol=1)
      data2est <- matrix(est[(ncol(data1)+1):(ncol(data1)+ncol(data2))], ncol=1)

      # Log density of each residual (dnorm, not the CDF pnorm).
      # dnorm() returns NaN with a warning wherever the implied sd is negative.
      llk <- dnorm(y-data1%*%data1est, 
                   sd=data2%*%data2est, log=TRUE)
      -sum(llk) 
    }

    results <- optim(c(beta0,sigma0), normMLE, hessian=T,
                     data1=data1, data2=data2, y=y)

    results
  }


  x <-rnorm(10000)
  z<-x^2
  y <-x*2 + rnorm(10000, sd=2+z*2) + 10

  normalreg(y~x, y~z)

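At this point the biggest problem is finding an optimization routine that does not fail when some values return NA because the standard deviation goes negative. Any suggestions? Sorry for the amount of code.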

Francis

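1 Answer:

Answer 0 (score: 2)

I check whether any of the standard deviations are less than or equal to 0 and, if so, return a likelihood of 0 (a log-likelihood of -Inf). That seems to work for me. You can figure out the details of wrapping it into your function.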

#y=b0 + x*b1 + u
#u~N(0,sd=s0 + z*s1)

ll <- function(par, x, z, y){
    b0 <- par[1]
    b1 <- par[2]
    s0 <- par[3]
    s1 <- par[4]
    sds <- s0 + z*s1
    if(any(sds <= 0)){
        return(log(0))
    }

    preds <- b0 + x*b1

    sum(dnorm(y, preds, sds, log = TRUE))
}

n <- 100
b0 <- 10
b1 <- 2
s0 <- 2
s1 <- 2
x <- rnorm(n)
z <- x^2
y <- b0 + b1*x + rnorm(n, sd = s0 + s1*z)

optim(c(1,1,1,1), ll, x=x, z=z,y=y, control = list(fnscale = -1))

With that said, it probably isn't a bad idea to parameterize the standard deviation in such a way that it cannot go negative...
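
One way to act on that suggestion, as a minimal sketch: model the log of the standard deviation as linear in z, so that sd = exp(g0 + g1*z) can never be negative. Note that this is a different model from sd = s0 + z*s1 in the question, and the names `ll_log`, `g0` and `g1`, as well as the simulated data, are assumptions made for illustration only.

```r
# Sketch: log-link parameterization of the standard deviation.
# par = c(b0, b1, g0, g1), where g0 and g1 are hypothetical coefficients
# on the log-sd scale (not the s0, s1 of the original model).
ll_log <- function(par, x, z, y) {
  preds <- par[1] + x * par[2]
  sds   <- exp(par[3] + z * par[4])  # exp() never returns a negative value
  sum(dnorm(y, preds, sds, log = TRUE))
}

set.seed(1)
n <- 500
x <- rnorm(n)
z <- x^2
# Simulate from the log-link model itself (assumed true values)
y <- 10 + 2 * x + rnorm(n, sd = exp(0.5 + 0.3 * z))

# fnscale = -1 makes optim() maximize the log-likelihood
fit <- optim(c(0, 0, 0, 0), ll_log, x = x, z = z, y = y,
             control = list(fnscale = -1, maxit = 5000))
fit$par
```

Because no parameter value can produce a negative standard deviation, the explicit `any(sds <= 0)` guard is no longer needed, at the cost of interpreting the variance coefficients on the log scale.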