I'm trying to fit my data to a negative binomial model. The data are just an array of counts:

all_hits = [0, 4000, 200, ...]

If I plot the data as a histogram and overlay a negative binomial pmf with parameters picked by eye, I get the following:
import matplotlib.pyplot as plt
import scipy.stats as ss
import scipy.optimize as so
import numpy as np
plt.plot(range(0,30000), ss.nbinom.pmf(range(0,30000), n=3, p=1.0/300, loc=0), 'g-')
bins = plt.hist(all_hits, 100, density=True, alpha=0.8)  # `normed` was removed in newer matplotlib
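Since the n=3, p=1/300 guess above was picked by eye, a less hand-tuned starting point could come from method-of-moments estimates, using mean = n(1-p)/p and var = n(1-p)/p^2. A sketch (the helper `nbinom_moments` is my own, demonstrated on a synthetic sample because all_hits isn't reproduced here):

```python
import numpy as np

def nbinom_moments(data):
    """Method-of-moments estimates (n, p) for a negative binomial.

    From mean = n(1-p)/p and var = n(1-p)/p**2 it follows that
    p = mean/var and n = mean**2 / (var - mean).  Requires var > mean
    (overdispersion); otherwise the NB model is a poor fit anyway.
    """
    data = np.asarray(data)
    mean, var = data.mean(), data.var()
    if var <= mean:
        raise ValueError("data are not overdispersed; NB is inappropriate")
    p = mean / var
    n = mean * mean / (var - mean)
    return n, p

# sanity check on a synthetic sample drawn with known parameters
rng = np.random.default_rng(0)
sample = rng.negative_binomial(3, 1.0 / 300, size=100_000)
n_hat, p_hat = nbinom_moments(sample)
```

With 100k draws the estimates land close to the true n=3, p=1/300, so this could at least seed the optimizer.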
However, when I try to estimate n, p, and loc with scipy.optimize.fmin, I get an infinite negative log-likelihood:
# function adapted from http://stackoverflow.com/questions/23816522/fitting-negative-binomial-in-python
def likelihood_f(params, x, neg=-1):
    n, p, loc = params           # Python 3: no tuple unpacking in the signature
    n = np.round(n)              # by definition it should be an integer
    loc = np.round(loc)
    return neg * np.log(ss.nbinom.pmf(x, n, p, loc)).sum()

xopt, fopt, iterations, funcalls, warn = so.fmin(
    likelihood_f, (3, 1.0 / 300, 0), args=(all_hits, -1),
    full_output=True, disp=False)
print('optimal solution: r=%f, p=%f, loc=%f' % tuple(xopt))
print('log likelihood = %f' % fopt)
print('ran %d iterations with %d function calls' % (iterations, funcalls))
print('warning code %d' % warn)
optimal solution: r=3.000000, p=0.003333, loc=0.000000
log likelihood = inf
ran 121 iterations with 604 function calls
warning code 1
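I suspect (though I'm not certain) that the inf comes from `ss.nbinom.pmf` underflowing to exactly 0 for some of the large counts, so `np.log` yields -inf, and that rounding n inside the objective also makes the surface piecewise flat for fmin. A workaround sketch I have in mind, shown on synthetic data since all_hits isn't reproduced here: use `logpmf`, which evaluates in log space, let n be real-valued (scipy permits this), and fix loc at 0:

```python
import numpy as np
import scipy.stats as ss
import scipy.optimize as so

def neg_log_likelihood(params, x):
    """Negative log-likelihood of a negative binomial with loc fixed at 0."""
    n, p = params
    if n <= 0 or not 0 < p < 1:
        return np.inf  # keep the simplex out of invalid parameter space
    # logpmf works in log space, so tiny probabilities do not
    # underflow to 0 before the log is taken
    return -ss.nbinom.logpmf(x, n, p).sum()

# synthetic stand-in for all_hits, drawn with the eyeballed parameters
rng = np.random.default_rng(1)
sample = rng.negative_binomial(3, 1.0 / 300, size=10_000)

res = so.minimize(neg_log_likelihood, x0=(3, 1.0 / 300), args=(sample,),
                  method="Nelder-Mead")
n_hat, p_hat = res.x
print(n_hat, p_hat, res.fun)
```

On this synthetic sample the objective stays finite and the recovered n_hat, p_hat land near the true values, but I haven't verified this behaves the same on my real data.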
Is there another way to estimate the parameters of a negative binomial distribution? Is there a problem with my likelihood function, or with my initial guess, that keeps the estimates from ever moving away from the starting values?