Question

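I am having trouble with a data series: it contains some zero values, so some distributions cannot be fitted to it. I tried the functions fitdist and fitdistr, but neither worked. Here is my data: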

When I try to fit a distribution, for example the Weibull, this is the message I get:

> fw=fitdist(Precp,"weibull")
[1] "Error in optim(par = vstart, fn = fnobj, fix.arg = fix.arg, obs = data,  : \n  non-finite value supplied by optim\n"
attr(,"class")
[1] "try-error"
attr(,"condition")
<simpleError in optim(par = vstart, fn = fnobj, fix.arg = fix.arg, obs = data,     ddistnam = ddistname, hessian = TRUE, method = meth, lower = lower,     upper = upper, ...): non-finite value supplied by optim>
Error in fitdist(Precp, "weibull") : 
  the function mle failed to estimate the parameters, 
                with the error code 100

The same thing happens when I try the gamma distribution. Any idea what is going on?

Answer 1

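If you want to fit an extreme value distribution, such as the Weibull, you can try the evd package. Its fgev function fits a generalized extreme value (GEV) distribution by maximum likelihood: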

library(evd)
> fit <- fgev(dat$Precp)
> fit

Call: fgev(x = dat$Precp) 
Deviance: 2159.363 

Estimates
     loc     scale     shape  
151.9567  137.6544   -0.1518  

Standard Errors
     loc     scale     shape  
12.41071   9.24535   0.07124  

Optimization Information
  Convergence: successful 
  Function Evaluations: 27 
  Gradient Evaluations: 15

If you are not interested in a parametric distribution, consider the density function, which computes a kernel density estimate.
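As a rough illustration of the kernel density route, here is a minimal sketch in base R. The vector `precp` is simulated and only stands in for `dat$Precp`, and `kde_fun` is a helper name made up for this example.

```r
# Hypothetical stand-in for dat$Precp: simulated values, not the asker's data.
set.seed(1)
precp <- c(rep(0, 20), rgamma(180, shape = 2, scale = 80))

# Gaussian kernel density estimate. from = 0 only restricts the grid on which
# the estimate is evaluated; it does not correct the boundary bias near zero.
kde <- density(precp, from = 0)

# Wrap the estimate in a function so it can be evaluated at arbitrary points
# (rule = 2 returns the nearest grid value outside the evaluated range).
kde_fun <- approxfun(kde$x, kde$y, rule = 2)
kde_fun(c(50, 150, 300))
```

The bandwidth is left at the default here; the `bw` and `adjust` arguments of density control how smooth the curve is.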

Since your data seem to contain many small values, you could also consider a mixture of two distributions. The flexmix package can do that for you.

library(evd)      # fgev, dgev
library(flexmix)  # flexmix, FLXMRglm, parameters, prior

hist(dat$Precp, prob=TRUE, col="gray", ylim=c(0,0.0042), breaks=seq(0,700, by=50),
    xlab="", ylab="", main="")
legend("topright", 
    legend=c("density", "fgev", "flexmix"), 
    fill=c("darkgreen", "blue", "darkred")
)
xval <- seq(from=0, to=max(dat$Precp), length.out=200)

# density
fit1 <- density(dat$Precp)
lines(fit1, col="darkgreen", lwd=2)

# generalized extreme value distribution
fit2 <- fgev(dat$Precp)
param2 <- fit2$estimate
loc <- param2[["loc"]]
scal <- param2[["scale"]]
shape <- param2[["shape"]]
lines(xval, dgev(xval, loc=loc, scale=scal, shape=shape), col="blue", lwd=2)

# mixture of two Gamma distributions
# http://r.789695.n4.nabble.com/Gamma-mixture-models-with-flexmix-tt3328929.html#none
fit3 <- flexmix(Precp~1, data=subset(dat, Precp>0), k=2, 
    model = list(FLXMRglm(family = "Gamma"), FLXMRglm(family = "Gamma"))
)
param3 <- parameters(fit3)[[1]] # don't know why this is a list
interc <- param3[1,]
shape <- param3[2,]
lambda <- prior(fit3)
yval <- lambda[[1]]*dgamma(xval, shape=shape[[1]], rate=interc[[1]]*shape[[1]]) + 
        lambda[[2]]*dgamma(xval, shape=shape[[2]], rate=interc[[2]]*shape[[2]])
lines(xval, yval, col="darkred", lwd=2)
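On the original error: one likely reason fitdist and fitdistr fail is that the Weibull log-density is not finite at zero (the same holds for the gamma when its shape is not 1), so exact zeros break the likelihood. A possible workaround, sketched below under that assumption, is a two-part model: estimate the proportion of zeros separately and fit the distribution to the positive values only, as the flexmix call above already does with subset(dat, Precp>0). The data are simulated, and `p_zero` and `fit_pos` are names chosen for this example.

```r
library(MASS)  # fitdistr

# Hypothetical stand-in for dat$Precp: simulated values, not the asker's data.
set.seed(1)
precp <- c(rep(0, 20), rweibull(180, shape = 1.3, scale = 200))

# With shape > 1 the Weibull density at zero is 0, so its log is -Inf.
is.finite(dweibull(0, shape = 1.3, scale = 200, log = TRUE))

# Two-part alternative: the share of exact zeros, plus a Weibull fitted by
# maximum likelihood to the strictly positive observations.
p_zero  <- mean(precp == 0)
fit_pos <- fitdistr(precp[precp > 0], "weibull")

p_zero
fit_pos$estimate
```

Whether treating zeros as a separate point mass is appropriate depends on what a zero means in the data (for example, a dry period versus a missing reading).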

[Figure: histogram of Precp with the density, fgev, and flexmix curves overlaid]
