
Time: 2015-09-14 14:39:04

Tags: r


x <- rgamma(100,shape=5,rate=5)


min <- function(data, par) {
    with(data, 1/par[2]*sum(x)-(par[1]-1)*sum(log(x))+n*(log(gamma(par[1]))+par[1]*log(par[2])))
}

mle <- optim(par=c(0,1),min,method='BFGS', hessian=TRUE)

AV <- 1 / mle$hessian


Error in eval(substitute(expr), data, enclos = parent.frame()) : 
    numeric 'envir' arg not of length one 




AV <- (fitdistr(x, "gamma", start=list(shape=1, rate=1))$sd)^2


2 answers:

Answer 0 (score: 5)

OK, let's take this one step at a time.

First of all, your expression for the minus log-likelihood does not match the way you generated the data: it is written in terms of the shape and scale parameters, whereas you sampled the data using shape and rate. It is simpler to stay consistent and use shape and rate throughout. See https://en.wikipedia.org/wiki/Gamma_distribution
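For reference, the two log-likelihoods can be written out as follows, with a for the shape, b for the rate, and θ = 1/b for the scale (θ is a symbol introduced here for illustration only). The expression in the question is the negative of the second form, so its par[2] would estimate the scale rather than the rate passed to rgamma.

```latex
\begin{aligned}
\ell(a, b) &= (a - 1)\sum_{i=1}^{n} \log x_i \;-\; b \sum_{i=1}^{n} x_i \;+\; n\,a \log b \;-\; n \log \Gamma(a) \\
\ell(a, \theta) &= (a - 1)\sum_{i=1}^{n} \log x_i \;-\; \frac{1}{\theta} \sum_{i=1}^{n} x_i \;-\; n\,a \log \theta \;-\; n \log \Gamma(a)
\end{aligned}
```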

Second, the function passed to optim is specified incorrectly: the documentation states that it must take, as its "first argument the vector of parameters over which minimization is to take place". In the code below, I left the data as a global variable that the function to be minimized accesses.
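If you would rather not rely on a global variable, one possible alternative is to give the function a second argument and let optim forward the data to it. In the question, optim handed the parameter vector to the argument named data, so with() received a numeric vector, which is what the error message reports. The sketch below is only an illustration: the names nll and data and the seed are assumptions, and lgamma(a) stands in for log(gamma(a)).

```r
set.seed(42)  # hypothetical seed, only so the run is reproducible
x <- rgamma(100, shape = 5, rate = 5)

# The parameter vector comes first; the data arrive as an extra argument
nll <- function(params, data) {
    a <- params[1]     # shape
    b <- params[2]     # rate
    n <- length(data)  # sample size
    -((a - 1) * sum(log(data)) - b * sum(data) + n * a * log(b) - n * lgamma(a))
}

# 'data' is not an argument of optim, so optim passes it on to nll
fit <- optim(par = c(1, 1), fn = nll, data = x,
             method = "L-BFGS-B",
             lower = c(1e-10, 1e-10), upper = c(100, 100))

fit$par  # estimates of shape and rate
```

Any argument that optim does not recognise is handed to the objective function through its `...` mechanism, which is why `data = x` reaches nll.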

Third, in such a scenario it is advisable to enforce the constraints on the two parameters being estimated, otherwise the fitting may fail in some cases, depending on the sampled data.

Finally, the calculation of the asymptotic variance is incorrect: you need to invert the Fisher information matrix as a matrix, not take element-wise reciprocals; see, e.g., https://stats.stackexchange.com/questions/68080/basic-question-about-fisher-information-matrix-and-relationship-to-hessian-and-s
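To see how much the two calculations differ, here is a small sketch that uses the expected Fisher information of the gamma distribution in the shape and rate parameterization, evaluated at illustrative values (the true parameters used for sampling, not fitted ones). The name info is an assumption made for this example.

```r
# Expected Fisher information for n gamma draws with shape a and rate b
n <- 100
a <- 5
b <- 5
info <- n * matrix(c(trigamma(a), -1 / b,
                     -1 / b,      a / b^2),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(c("shape", "rate"), c("shape", "rate")))

1 / info           # element-wise reciprocal: not a covariance matrix
solve(info)        # matrix inverse: asymptotic variance-covariance matrix
diag(solve(info))  # asymptotic variances of shape and rate
```

Because the shape and rate estimates are strongly correlated, the off-diagonal terms matter, and the element-wise reciprocal understates the variances considerably.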

Here is the code:

# Data
x <- rgamma(100, shape = 5, rate = 5)

# Minus log likelihood function
minus_log_L_fun <- function(params) {
    a <- params[1] # shape
    b <- params[2] # rate (1 / scale)

    n <- length(x) # sample size

    log_L <- (a - 1) * sum(log(x)) - b * sum(x) + n * a * log(b) - n * log(gamma(a))    
    return (-log_L)
}

# Impose constraints on the estimates of the shape and rate parameters: both being strictly positive
# Use algorithm 'L-BFGS-B', since it allows for box constraints
mle <- optim(par = c(1, 1), minus_log_L_fun, method = "L-BFGS-B", lower = c(1e-10, 1e-10), upper = c(100, 100), hessian = TRUE)

# Retrieve the point estimates
shape_fit <- mle$par[1]
rate_fit <- mle$par[2]

# Observed Fisher information matrix: the Hessian of the minus log-likelihood at the optimum
I <- mle$hessian
# Obtain the asymptotic variances (need to invert the FIM)
var_Theta <- diag(solve(I))

cat(sprintf("Point estimates: shape = %g, rate = %g\n", shape_fit, rate_fit))
cat(sprintf("Asymptotic ML variances: shape = %g, rate = %g\n", var_Theta[1], var_Theta[2]))

producing, for a particular run:

Point estimates: shape = 4.25661, rate = 4.08384
Asymptotic ML variances: shape = 0.336318, rate = 0.34875

Using the MASS package to confirm:

library(MASS)

res <- fitdistr(x, "gamma", start = list(shape = 1, rate = 1))
MASS_PE <- res$estimate
MASS_AV <- (res$sd)^2

cat("From the MASS package:\n")
cat("Point estimates:\n")
print(MASS_PE)
cat("Asymptotic variances:\n")
print(MASS_AV)

leads to:

From the MASS package:
Point estimates:
   shape     rate 
4.256613 4.083836 
Asymptotic variances:
    shape      rate 
0.3363179 0.3487502

Answer 1 (score: 2)


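library(bbmle)

x <- rgamma(100,shape=5,rate=5)
m1 <- mle2(x~dgamma(shape,rate=rate),
           start=list(shape=2,rate=2), ## anything reasonable
           data=data.frame(x)  ## data must be specified as a data frame
)

## refit with lower bounds on both parameters
## (the end of this call is missing from this copy; method="L-BFGS-B",
##  optim's box-constrained method, is an assumption)
m2 <- update(m1,lower=c(0,0),
             method="L-BFGS-B")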

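Now we can easily get the point estimates and the asymptotic variance-covariance matrix:

coef(m2)  ## point estimates
vcov(m2)  ## asymptotic variance-covariance matrix
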
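  • stats4::mle, of which bbmle::mle2 is an extension, should also work for this problem (mle2 has a few extra bells and whistles and is a bit more robust), although you would have to define the negative log-likelihood function as something like:
 nLL <- function(shape,rate) { -sum(dgamma(x, shape=shape, rate=rate, log=TRUE)) }
  • In general, it is a good idea to use the built-in distribution functions (e.g. dgamma()) when they are applicable to your problem; they are well tested, robust, and make your code easier to read.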
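
As a rough illustration of the two points above, the following sketch fits the same model with stats4::mle and a negative log-likelihood built on dgamma(). The seed, the starting values and the small positive lower bounds are assumptions chosen for this example, not part of the original answer.

```r
library(stats4)  # ships with R

set.seed(101)  # hypothetical seed, only so the run is reproducible
x <- rgamma(100, shape = 5, rate = 5)

# Negative log-likelihood written with the built-in gamma density
nLL <- function(shape, rate) {
    -sum(dgamma(x, shape = shape, rate = rate, log = TRUE))
}

# Box constraints keep both parameters strictly positive
fit <- mle(nLL, start = list(shape = 2, rate = 2),
           method = "L-BFGS-B",
           lower = c(shape = 1e-10, rate = 1e-10))

coef(fit)        # point estimates
vcov(fit)        # asymptotic variance-covariance matrix
diag(vcov(fit))  # asymptotic variances
```

Here vcov() returns the inverse of the Hessian at the optimum, so no manual call to solve() is needed.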