Numpy lomax mean function returns "inf" instead of a value

Time: 2018-12-11 21:56:21

Tags: python python-3.x numpy scipy

After following this tutorial, I created the following churn.py file:

import numpy as np
import scipy as sp
import scipy.optimize  # explicit import so sp.optimize is available regardless of SciPy version
import scipy.stats as stats

#duration of alive subscriptions
censored = np.array([419,513, ... ,316,14])
#duration of completed subscriptions
uncensored = np.array([389,123,340, ... ,56,31])

# Negative log-likelihood for right-censored data: completed subscriptions
# contribute the log-density, still-alive ones the log-survival function.
def log_likelihood_lomax(args):
    shape, scale = args
    val = stats.lomax.logpdf(uncensored, shape, loc=0, scale=scale).sum() + stats.lomax.logsf(censored, shape, loc=0, scale=scale).sum()
    return -val

res_lomax = sp.optimize.minimize(log_likelihood_lomax, [1, 1], bounds=((0.001, 1000000), (0.001, 1000000)))

print("lomax shape", res_lomax.x[0], ", scale=", res_lomax.x[1])
print("lomax mean", stats.lomax.mean(res_lomax.x[0], scale=res_lomax.x[1]))
print("lomax median", stats.lomax.median(res_lomax.x[0], scale=res_lomax.x[1]))

Note that the ... in the censored and uncensored arrays is there for confidentiality reasons. In the actual script, the arrays contain the real values.

When I run this script with python3 churn.py, I get the following output:

lomax shape 0.36948878639375643 , scale= 1440.4384891101636
lomax mean inf
lomax median 7961.447172364986

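I know for a fact that the returned median value is incorrect.

But most importantly, I don't understand why the lomax mean returns inf.

Is there something wrong with my script?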

1 Answer:

Answer 0 (score: 2)

Your result shows

lomax shape 0.36948878639375643 

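That is, in SciPy's notation, the shape parameter c is 0.36948878639375643 (in the Wikipedia article, c is called α). For c ≤ 1, the mean of the distribution is infinite (i.e. the integral that defines the mean diverges).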
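
To see the threshold at c = 1 directly, here is a small sketch that evaluates the mean and median for a few shape values, using the scale from the question. The extra shape values (1.0, 1.5, 3.0) and the variable names are only illustrative. The closed forms in the comments are the standard Lomax results: the mean is scale / (c - 1) only when c > 1, while the median scale * (2**(1/c) - 1) is finite for every c > 0.

```python
import numpy as np
from scipy import stats

scale = 1440.4384891101636  # fitted scale reported in the question

# The first shape value is the fitted one from the question; the others are
# illustrative values on and above the c = 1 threshold.
for c in (0.36948878639375643, 1.0, 1.5, 3.0):
    mean = stats.lomax.mean(c, scale=scale)
    median = stats.lomax.median(c, scale=scale)
    # Closed forms: the mean exists only for c > 1, the median for all c > 0.
    closed_mean = scale / (c - 1) if c > 1 else np.inf
    closed_median = scale * (2 ** (1 / c) - 1)
    print(f"c={c:.4f}  mean={mean:.2f} (closed form {closed_mean:.2f})  "
          f"median={median:.2f} (closed form {closed_median:.2f})")
```

So an inf mean is what the distribution itself gives for the fitted shape, not a numerical failure in stats.lomax.mean. The median reported in the question also agrees with the closed form for those fitted parameters.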

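You asked "Is there something wrong with my script?" I recommend one important change: after calling minimize, check that res_lomax.success is True before you use the values in res_lomax.x. Like this: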

res_lomax = sp.optimize.minimize(log_likelihood_lomax, [1, 1], bounds=((0.001, 1000000), (0.001, 1000000)))
if res_lomax.success:
    print("lomax shape", res_lomax.x[0], ", scale=", res_lomax.x[1])
    print("lomax mean", stats.lomax.mean(res_lomax.x[0], scale=res_lomax.x[1]))
    print("lomax median", stats.lomax.median(res_lomax.x[0], scale=res_lomax.x[1]))
else:
    print("minimization failed:", res_lomax.message)
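
Putting both points together, here is a self-contained sketch that can be run without the confidential data. Everything about the data is an assumption made for illustration: the synthetic lifetimes, the true parameters, and the fixed cutoff used to decide which subscriptions count as censored. It checks res.success first and then checks the fitted shape before reporting a mean.

```python
import numpy as np
from scipy import optimize, stats

# Hypothetical synthetic data standing in for the elided arrays in the question.
rng = np.random.default_rng(0)
true_shape, true_scale = 2.5, 300.0  # assumed values, not from the question
lifetimes = stats.lomax.rvs(true_shape, scale=true_scale, size=2000, random_state=rng)

cutoff = 400.0  # assumed observation window
uncensored = lifetimes[lifetimes <= cutoff]             # subscriptions that ended
censored = np.full((lifetimes > cutoff).sum(), cutoff)  # still alive at the cutoff

def neg_log_likelihood(args):
    shape, scale = args
    ll = stats.lomax.logpdf(uncensored, shape, loc=0, scale=scale).sum()
    ll += stats.lomax.logsf(censored, shape, loc=0, scale=scale).sum()
    return -ll

res = optimize.minimize(neg_log_likelihood, [1, 1],
                        bounds=((0.001, 1000000), (0.001, 1000000)))

if not res.success:
    print("minimization failed:", res.message)
else:
    shape, scale = res.x
    print("lomax shape", shape, ", scale=", scale)
    print("lomax median", stats.lomax.median(shape, scale=scale))
    if shape > 1:
        print("lomax mean", stats.lomax.mean(shape, scale=scale))
    else:
        # For shape <= 1 the mean is infinite, so report that explicitly.
        print("lomax mean is infinite because shape <= 1")
```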