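I want to find the Weibull shape and scale parameters of a truncated distribution using R's fitdistr function (MLE). I am using this sample of tree diameter data (the minimum value is 2.8):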
data<-c(42.7,18.8,30.0,20.3,32.5,18.8,16.0,42.9,18.8,17.3,21.1,23.4,15.0,16.8,15.2,15.0,14.7,17.3,20.1,18.3,16.0,15.7,21.3,
19.1,17.3,17.0,17.3,17.5,21.6,15.7,12.7,13.2,3.6,3.6,3.6,3.6,3.6,3.6,3.6,3.6,3.6,3.6,3.6,3.6,2.8,2.8,2.8,2.8,2.8,
2.8,2.8,2.8,2.8,2.8,2.8,2.8,12.2,12.2,12.2,12.2,12.2,12.2,12.2,12.2,12.2,12.2,12.2,12.2,4.3,4.3,4.3,4.3,4.3,4.3,
4.3,4.3,4.3,4.3,4.3,4.3,2.8,2.8,2.8,2.8,2.8,2.8,2.8,2.8,2.8,2.8,2.8,2.8,5.6,5.6,5.6,5.6,5.6,5.6,5.6,5.6,5.6,5.6,
5.6,5.6,18.0,16.3,34.8,17.5,6.1,6.1,6.1,6.1,6.1,6.1,6.1,6.1,6.1,6.1,6.1,6.1,6.3,6.3,6.3,6.3,6.3,6.3,6.3,6.3,6.3,
6.3,6.3,6.3,9.4,9.4,9.4,9.4,9.4,9.4,9.4,9.4,9.4,9.4,9.4,9.4,3.8,3.8,3.8,3.8,3.8,3.8,3.8,3.8,3.8,3.8,3.8,3.8)
library(MASS)
wb<-fitdistr(data,'weibull',lower=0.1) # MLE for weibull parameter determination
wb
Result:
     shape        scale
  1.36605920   9.97356797
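Given the distribution of the data, I would expect a monotonically decreasing curve (i.e. shape < 1). However, these results indicate shape > 1, because fitdistr does not take into account that the data are truncated. Elsewhere, the following has been suggested: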
ltwei<-function(x,shape,scale=1,log=FALSE){dweibull(x,shape,scale,log)/pweibull(1,shape,scale,lower=FALSE) }
ltweifit<-fitdistr(data,ltwei,start=list(shape=1,scale=10))
ltweifit
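But this gives an even larger Weibull shape value. How can I obtain shape and scale parameters for a distribution that takes the left truncation of the data into account? Many thanks in advance.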
Answer 0 (score: 1)
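This is a modification of the suggestion you were given, because I thought its denominator was incorrect. I tried changing the denominator to 1-pweibull(trunc, ...) to see whether that would produce a better fit: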
> ptwei <- function(x, shape, scale , log = FALSE)
+ pweibull(x, shape, scale, log)/(1-
+ pweibull(2.8, shape, scale, lower=FALSE))
> dtwei <- function(x, shape, scale , log = FALSE)
+ dweibull(x, shape, scale, log)/(1-
+ pweibull(2.8, shape, scale, lower=FALSE))
> fitdist(data, dtwei, start=list(shape=1.2, scale=4))
Fitting of the distribution ' twei ' by maximum likelihood
Parameters:
estimate Std. Error
shape 0.4555414 0.02660275
scale 0.8751571 0.12236256
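I now realize that Professor Ripley was using the lower argument to achieve what I was attempting above, so the original code works as long as ltwei calls pweibull with lower = FALSE.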
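The two ways of writing the upper-tail denominator can be compared directly. This is a small check only; the shape and scale values below are arbitrary illustrative assumptions, not estimates from the tree data.

```r
# Illustrative parameter values only; not fitted to the tree data
shape <- 1.5
scale <- 10
trunc_pt <- 2.8

# Probability above the truncation point, written two ways
upper_a <- 1 - pweibull(trunc_pt, shape, scale)
upper_b <- pweibull(trunc_pt, shape, scale, lower.tail = FALSE)

# Combining "1 -" with lower.tail = FALSE gives the probability BELOW the point
lower <- 1 - pweibull(trunc_pt, shape, scale, lower.tail = FALSE)

print(c(upper_a = upper_a, upper_b = upper_b, lower = lower))
```

Note that the session above combines `1 -` with `lower=FALSE`, which corresponds to the third quantity (the probability below 2.8) rather than the upper tail.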
This follows the advice that Professor Brian Ripley posted on R-help on 7 October 2008:
ltwei <- function(x, shape, scale = 1, log = FALSE)
dweibull(x, shape, scale, log)/
pweibull(2.8, shape, scale, lower=FALSE)
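This normalizes the Weibull density by dividing by the upper-tail probability at the truncation point. (With lower = FALSE, the pweibull function returns the density integrated from the truncation point to Inf.)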
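Written out, the left-truncated density is f(x)/S(a) for x >= a, so its logarithm is log f(x) - log S(a). Dividing by the tail probability is therefore only valid on the natural scale, not on the log scale. As far as I can tell, fitdistr calls a user-supplied density with log = TRUE when that function has a log argument; treat that as an assumption and check it against the fitdistr source. The sketch below handles both scales and checks that the density integrates to 1 above the truncation point. The name `dltweibull` and its `a` argument are hypothetical, and the parameter values are illustrative.

```r
# Hypothetical helper: Weibull density left-truncated at `a`
dltweibull <- function(x, shape, scale = 1, a = 2.8, log = FALSE) {
  # log f(x) - log S(a), where S(a) = P(X > a)
  logd <- dweibull(x, shape, scale, log = TRUE) -
    pweibull(a, shape, scale, lower.tail = FALSE, log.p = TRUE)
  logd[x < a] <- -Inf            # no mass below the truncation point
  if (log) logd else exp(logd)
}

# Illustrative parameter values only: total mass above the truncation point
total <- integrate(dltweibull, lower = 2.8, upper = Inf,
                   shape = 1.5, scale = 10)$value
print(total)

# The two scales agree: log(density) matches the log = TRUE result
x <- c(3, 5, 10, 20)
print(all.equal(log(dltweibull(x, 1.5, 10)),
                dltweibull(x, 1.5, 10, log = TRUE)))
```

A density written this way can be passed to a fitting routine whether it evaluates the likelihood on the natural or the log scale.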
library(MASS)
wb<-fitdistr(data,ltwei,start=list(shape=1,scale=1) )
There were 50 or more warnings (use warnings() to see the first 50)
> wb
shape scale
1.81253163 12.72912552
( 0.07199877) ( 0.54855731)
> warnings()[1:5]
Warning messages:
1: In dweibull(x, shape, scale, log) : NaNs produced
2: In pweibull(2.8, shape, scale, lower = FALSE) : NaNs produced
3: In dweibull(x, shape, scale, log) : NaNs produced
4: In pweibull(2.8, shape, scale, lower = FALSE) : NaNs produced
5: In dweibull(x, shape, scale, log) : NaNs produced
> library(MASS)
> wb<-fitdistr(data,ltwei,start=list(shape=1.1,scale=10) )
> wb
shape scale
1.81253113 12.72912870
( 0.07199877) ( 0.54855782)
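You can see that some starting values generate warnings, but the algorithm still succeeds. If you start from values closer to the "true values", no warnings appear. I am afraid your expectations have not been met. There is sometimes confusion about the parameterization of the Weibull distribution, because there are different conventions for how it is handled. This is from Terry Therneau's help page for survival::survreg: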
# There are multiple ways to parameterize a Weibull distribution. The survreg
# function imbeds it in a general location-scale family, which is a
# different parameterization than the rweibull function, and often leads
# to confusion.
# survreg's scale = 1/(rweibull shape)
# survreg's intercept = log(rweibull scale)
# For the log-likelihood all parameterizations lead to the same value.
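The conversion quoted above can be tried on simulated data. This is a minimal sketch, assuming the survival package is installed; the sample size, seed, and parameter values are arbitrary choices for illustration.

```r
library(survival)

set.seed(1)
# Simulated data with known rweibull-style parameters (illustrative values)
y <- rweibull(2000, shape = 1.5, scale = 10)

# Surv(y) with no status argument treats every observation as an event
fit <- survreg(Surv(y) ~ 1, dist = "weibull")

# Convert survreg's location-scale output back to rweibull's shape and scale
shape_hat <- 1 / fit$scale
scale_hat <- exp(unname(coef(fit)[1]))
print(c(shape = shape_hat, scale = scale_hat))
```

The converted values should land close to the shape and scale used in the simulation, which is a quick way to confirm which convention a given fitting function reports.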
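Since you seem dissatisfied with the results from fitdistr, I also ran fitdist from the 'fitdistrplus' package and got the same answer. I still think you need to check your interpretation of the parameters, and possibly also test whether these data really follow a Weibull distribution: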
dtwei <- function(x, shape, scale , log = FALSE)
dweibull(x, shape, scale, log)/
pweibull(2.8, shape, scale, lower=FALSE)
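# Truncated CDF: (F(x) - F(2.8)) / S(2.8), and 0 below the truncation point.
# The log argument is kept from the original signature but is not used.
ptwei <- function(x, shape, scale , log = FALSE)
pmax(pweibull(x, shape, scale) - pweibull(2.8, shape, scale), 0)/
pweibull(2.8, shape, scale, lower=FALSE)
library(fitdistrplus) # provides fitdist
fitdist(data, dtwei, start=list(shape=1.2, scale=4))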
#----------------
Fitting of the distribution ' twei ' by maximum likelihood
Parameters:
estimate Std. Error
shape 1.812706 0.07199987
scale 12.728087 0.54839034
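One way to check whether a truncated fit behaves as intended is to try it on simulated data where the answer is known. The sketch below is not run on the tree data: it simulates Weibull values with assumed "true" parameters, discards everything below 2.8, and minimizes the left-truncated negative log-likelihood directly with optim. The function name `nll` and all numeric settings are illustrative assumptions.

```r
set.seed(42)                      # for reproducibility
true_shape <- 0.8                 # assumed "true" values for the simulation
true_scale <- 8
a <- 2.8                          # truncation point

# Simulate, then discard everything below the truncation point
x_all <- rweibull(5000, true_shape, true_scale)
x <- x_all[x_all >= a]

# Negative log-likelihood of the left-truncated Weibull.
# Parameters are optimized on the log scale so they stay positive.
nll <- function(par, x, a) {
  shape <- exp(par[1])
  scale <- exp(par[2])
  -sum(dweibull(x, shape, scale, log = TRUE) -
       pweibull(a, shape, scale, lower.tail = FALSE, log.p = TRUE))
}

fit <- optim(c(0, log(mean(x))), nll, x = x, a = a)
est <- exp(fit$par)
names(est) <- c("shape", "scale")
print(est)

# Naive fit that ignores truncation (a = 0), for comparison
naive <- optim(c(0, log(mean(x))), nll, x = x, a = 0)
naive_est <- exp(naive$par)
names(naive_est) <- c("shape", "scale")
print(naive_est)
```

The same `nll` could be passed to optim with `x = data` for the tree diameters; I have not run that here, so no estimates are quoted.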