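What is the difference between the nlminb and optim functions in R? Which one should I try first? Which one is better, faster, or more accurate? Which one should I trust?
Below are the full data and the likelihood function I want to maximize. With the same arguments, the two functions give completely different answers. Which one should I trust?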
data <- c(59446625, 184078130, 26296875, 130769230, 110981380, 26113266,
32641583, 32641583, 39169899, 78339798, 58618421, 130263160,
195394740, 194341800, 64780600, 58302540, 45346420, 19434180,
343337180, 19359663, 806652610, 19359663, 204620060, 19183131,
25365486, 12682743, 18910112, 63033708, 25156951, 44024664, 503139010,
50313901, 31259287, 12485163, 62425816, 81153561, 149600000,
49866667, 37400000, 105966670, 261800000, 87266667, 31120562,
74689349, 99585799, 124482250, 41079142, 454360210, 74469027,
62057522, 105497790, 105497790, 99292035, 31028761, 161349560,
37125000, 123750000, 92812500, 30937500, 92676211, 763879940,
42871179, 2082314400, 30622271, 24497817, 73493450, 54960087,
36613488, 42500000, 12142857, 90613783, 66450108, 736992100,
30204594, 331774190, 361935480, 72387097, 916903230, 9.35e+08,
66259843, 60236220, 36012839, 258092010, 684243940, 264094150,
30010699, 48017118, 35935943, 47914591, 179679720, 23957295,
18514194000, 23821656, 83375796, 1905732500, 118688290, 29672073,
17803244, 41540903, 29630282, 53334507, 503714790, 41511628,
770930230, 35406732, 147528050, 206539270, 88207547, 599811320,
117200560, 2051009700, 146500700, 41020195, 75968750, 70125000,
99343750, 52593750, 46685160, 192576280, 17506935, 174826870,
256412740, 116551250, 64103186, 145689060, 553618420, 17482687,
17482687, 52448061, 52303177, 34868785, 23245856, 34868785, 28997243,
40596141, 23197795, 34653397, 69306795, 404289640, 432870370,
115116280, 201453490, 14389535000, 920930230, 149651160, 521993870,
85750679, 280118890, 28544776, 28544776, 79925373, 856343280,
11410169, 22820339, 11410169, 39800676, 102344590, 51172297,
45363881, 107739220, 56704852, 90727763, 22681941, 39533557,
90362416, 67771812, 112953020, 45181208, 33795181, 253294310,
22515050, 56287625, 197006690, 22515050, 28106212, 67454910,
123667330, 151773550, 263143710, 263143710, 27994012, 94488111,
422417440, 166743730, 38778802, 66477946, 105256750, 94177090,
138495720, 88637261, 470885450, 1255062400, 398081470, 99520368,
204569650, 22115637, 60818003, 110578190, 165540980, 27590164,
82770492, 110360660, 77252459, 412769780, 33021583, 961243470,
38449739, 2299479500, 82124268, 43828125, 191748050, 263140070,
654015540, 98102332, 430560230, 119902850, 801169040, 86920594,
70622983, 118901730, 226994220, 86473988, 86473988, 150748560,
328416510, 75374280, 322413790, 59109195, 161206900, 145086210,
403017240, 42961072, 26850670, 37590938, 75181876, 144716560,
117917200, 42878981, 21398601, 171188810, 10665399, 1706463900,
159980990, 69106128, 180739100, 143528110, 90369551, 26529004,
21223203, 47752207, 47752207, 31834805, 291819040, 42313011,
105451130, 42180451, 316353380, 199856250, 42075000, 42075000,
115706250, 168089890, 36769663, 236376400, 147078650, 57780899,
84097439, 99865709, 115633980, 83992514, 183504670, 62915888,
78644860, 83731343, 125597010, 171840970, 41684211, 46952883,
208292080, 181917850, 77964793, 88360099, 36383570, 103953060,
36383570, 72632552, 31128237, 207521580, 155353850, 315886150,
31013514, 217094590, 46520270, 1390439200, 196418920, 98209459,
671959460, 221990800, 180690180, 51625767, 309754600, 149714720,
216828220, 413006130, 92812500, 77249082, 66949204, 154498160,
370795590, 267469440, 113160150, 3039893000, 112884150, 174457320,
205369130, 773381010, 588998780, 169017040, 307303710, 5.1e+07,
192400720, 303790610, 75947653, 394927800, 151895310, 1503763500,
197463900, 86074007, 86074007, 227842960, 71897112, 192400720,
55694946, 136295740, 166583680, 161535690, 206967610, 30215440,
60430880, 1964669400, 100059450, 1.65e+08)
f.like <- function(p, x)
{
e <- p[3] * p[1] * p[2]^p[1] * x^(p[3] - 1) * (p[2] + x^p[3])^(-(p[1] + 1))
-sum(log(e))
}
> nlminb(c(1, 2, 3), f.like, x = data)
$par
[1] 7.458577e+01 1.386598e+04 2.881954e-01
$objective
[1] -Inf
$convergence
[1] 0
$iterations
[1] 39
$evaluations
function gradient
65 140
$message
[1] "X-convergence (3)"
There were 12 warnings (use warnings() to see them)
> optim(c(1, 2, 3), f.like, x = data)
$par
[1] 58.2277186 2902.3301013 0.2398371
$value
[1] 7218.866
$counts
function gradient
502 NA
$convergence
[1] 1
$message
NULL
There were 15 warnings (use warnings() to see them)
Answer 0 (score: 1)
For numerical stability, it is better to work with logs inside the expression:
loglik <- function(p, x){
e <- log(p[3]) + log(p[1]) + log(p[2])*p[1] +
log(x)*(p[3] - 1) + log(p[2] + x^p[3])*(-(p[1] + 1))
-sum(e)
}
all.equal(f.like(p=c(1,2,3), x=data),
loglik(p=c(1,2,3), x=data))
[1] TRUE
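To illustrate why the log form helps, here is a minimal sketch with made-up values (`x_demo` and `p_demo` are hypothetical and are not taken from the question). In the product form the factor `p[2]^p[1]` can overflow to `Inf` while the last factor underflows to 0, so the result is no longer finite. The log form adds terms of moderate size instead.

```r
# Product form from the question.
f.like <- function(p, x) {
  e <- p[3] * p[1] * p[2]^p[1] * x^(p[3] - 1) * (p[2] + x^p[3])^(-(p[1] + 1))
  -sum(log(e))
}

# Log form: the same negative log-likelihood, written as a sum of logs.
loglik <- function(p, x) {
  e <- log(p[3]) + log(p[1]) + log(p[2]) * p[1] +
    log(x) * (p[3] - 1) + log(p[2] + x^p[3]) * (-(p[1] + 1))
  -sum(e)
}

# Hypothetical values, chosen only to trigger overflow in the product form.
x_demo <- c(1e7, 1e8, 1e9)
p_demo <- c(100, 1e4, 0.3)

1e4^100                   # Inf: exceeds the largest representable double
f.like(p_demo, x_demo)    # not finite in the product form
loglik(p_demo, x_demo)    # finite in the log form
```

The `-Inf` objective that `nlminb()` reported in the question is consistent with this kind of overflow: at the returned parameters, `p[2]^p[1]` is roughly `13866^74.6`, which is beyond the double range.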
All parameters p must be positive. Otherwise the likelihood returns -Inf or NaN. During optimization this should be enforced by specifying lower bounds. I have had good experiences with method="L-BFGS-B" of optim():
optim(par=c(1, 2, 3), fn=loglik, x=data,
method="L-BFGS-B", lower=rep(.01, 3))
$par
[1] 3.293925e+00 3.889408e+05 6.305405e-01
$value
[1] 6928.781
$counts
function gradient
133 133
$convergence
[1] 0
$message
[1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"
We can compare this with nlminb():
nlminb(start=c(1, 2, 3), objective=loglik,
x=data, lower=rep(.01, 3))
$par
[1] 2.653579e+03 6.117484e+05 2.989276e-01
$objective
[1] 7090.686
$convergence
[1] 1
$iterations
[1] 64
$evaluations
function gradient
102 247
$message
[1] "singular convergence (7)"
Both methods find a minimum close to the boundary of the domain, which can cause convergence problems. Note that nlminb() stops with convergence code 1 here, while optim() reports code 0 and a lower objective value.
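One way to decide which result to trust is to run both optimizers on data where the true parameters are known, then compare the convergence codes and objective values. The following is only a sketch under stated assumptions: the sample is simulated, not the data from the question, and the names `a_true`, `b_true`, `c_true`, `fit_optim` and `fit_nlminb` are hypothetical. The simulation inverts the CDF F(x) = 1 - (b / (b + x^c))^a, which is the CDF implied by the density used in the likelihood.

```r
# Negative log-likelihood in log form, as in the answer.
loglik <- function(p, x) {
  e <- log(p[3]) + log(p[1]) + log(p[2]) * p[1] +
    log(x) * (p[3] - 1) + log(p[2] + x^p[3]) * (-(p[1] + 1))
  -sum(e)
}

# Simulated sample with known parameters (assumption: illustration only).
set.seed(1)
a_true <- 2; b_true <- 5; c_true <- 1.5
u <- runif(500)
x <- (b_true * ((1 - u)^(-1 / a_true) - 1))^(1 / c_true)

# Same start values and lower bounds for both optimizers.
start <- c(1, 2, 3)
lower <- rep(0.01, 3)
fit_optim  <- optim(par = start, fn = loglik, x = x,
                    method = "L-BFGS-B", lower = lower)
fit_nlminb <- nlminb(start = start, objective = loglik, x = x,
                     lower = lower)

# Side-by-side summary of objective values and convergence codes.
data.frame(
  method      = c("optim L-BFGS-B", "nlminb"),
  objective   = c(fit_optim$value, fit_nlminb$objective),
  convergence = c(fit_optim$convergence, fit_nlminb$convergence)
)
```

A convergence code of 0 means the optimizer reports success. Among fits that report success, the one with the smaller objective is the better minimum, and comparing the estimates with the known parameters shows how far each fit is from the truth.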