I may just be being brain-dead, but I have code that defines a lognormal class for manipulation in Python using scipy.stats.
My class looks like this:
    import numpy as np
    from scipy.stats import lognorm

    class LogNorm:
        def __init__(self, mean, sd, offset=0):
            # uses 'real' units - i.e., 100000, 15000 for mean and sd
            self.mean = mean
            self.sd = sd
            self.offset = offset
            self.xvar_mu, self.xvar_sigma = get_base_mu_and_sigma(mean, sd)
            self.mean = np.exp(self.xvar_mu + self.xvar_sigma**2.0 / 2.0)  # reflect that change in the Y
            self.sd = ((np.exp(self.xvar_sigma**2.0) - 1.0) *
                       (np.exp(2.0 * self.xvar_mu + self.xvar_sigma**2.0))) ** 0.5
            self.RV = lognorm(s=self.xvar_sigma, scale=self.mean, loc=self.offset)  # frozen
The idea here is that you pass in the mean and sd, as measured, of the lognormal itself. I record those for posterity (assume offset = 0.0, as per the default). Then a helper function maps those into the mu and sigma of the normal distribution that underlies the lognormal. That function looks like this, in case it's useful:
    import math

    def get_base_mu_and_sigma(mean, sd):
        mu = math.log(mean**2.0 / (sd**2.0 + mean**2.0)**0.5)
        sigma = (math.log(1.0 + sd**2.0 / mean**2.0))**0.5
        return (mu, sigma)
This comes straight from Wikipedia, and seems right (check out the end of the 'Arithmetic Moments' section): https://en.wikipedia.org/wiki/Log-normal_distribution
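Written out, the conversion the helper implements (just restating the Wikipedia formulas, with $m$ and $s$ the arithmetic mean and standard deviation) is

$$\mu = \ln\!\left(\frac{m^2}{\sqrt{s^2 + m^2}}\right), \qquad \sigma = \sqrt{\ln\!\left(1 + \frac{s^2}{m^2}\right)}$$

with the inverse relations

$$m = e^{\mu + \sigma^2/2}, \qquad s = \sqrt{\left(e^{\sigma^2} - 1\right)\, e^{2\mu + \sigma^2}}$$

which are exactly the expressions used to recompute self.mean and self.sd in __init__.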
Then, 'self.RV' becomes a 'frozen' RV, with a bunch of built-in/inherited methods (mean(), median(), var(), etc.) for the lognormal described by mu and sigma.
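For instance (illustrative parameter values, not the ones from my application):

    from scipy.stats import lognorm

    # freeze a lognormal once; the frozen object carries its parameters
    rv = lognorm(s=0.02, loc=0.0, scale=110000.0)
    print(rv.mean(), rv.median(), rv.var() ** 0.5)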
The challenge I have is that when I create such an object and then examine the mean and sd (via the square root of the variance), the numbers don't match. For example, using mean = 110517.09 and sd = 2210.34 (from my application), executing the following code gives inconsistent answers:
    rv1 = LogNorm(110517.09180756475, 2210.341836151173)
    p = rv1.RV.pdf(x)  # x: an array of evaluation points, defined elsewhere
    print("rv1.mean, rv1.sd = " + str(rv1.mean) + " " + str(rv1.sd))
    print("rv1.mean(), rv1.var()**0.5 = " + str(rv1.RV.mean()) + " " + str(rv1.RV.var()**0.5))
gives:
    rv1.mean, rv1.sd = 110517.09180756475 2210.341836151173
    rv1.mean(), rv1.var()**0.5 = 110539.19301602637 2210.783860320406
Any clue what I am doing wrong?
Answer 0 (score: 2)
You are using self.mean as the scale parameter of lognorm, but that is not correct. Quoting from the docstring for scipy.stats.lognorm:

    A common parametrization for a lognormal random variable Y is in terms of
    the mean, mu, and standard deviation, sigma, of the unique normally
    distributed random variable X such that exp(X) = Y. This parametrization
    corresponds to setting s = sigma and scale = exp(mu).

So when you create self.RV by calling lognorm, the scale argument must be np.exp(self.xvar_mu). (This assumes offset is 0.)
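In the class above, that corresponds to a one-line change in __init__ (a sketch of the corrected call, keeping offset as the loc shift as in the original code):

    # scale must be exp(mu) of the underlying normal, not the lognormal's mean
    self.RV = lognorm(s=self.xvar_sigma, scale=np.exp(self.xvar_mu), loc=self.offset)  # frozen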
Here is a simplified example that uses your function get_base_mu_and_sigma to convert the parameters.

First, here is your function, and the sample values that you use:

    In [154]: def get_base_mu_and_sigma(mean, sd) :
         ...:     mu = math.log(mean**2.0 / (sd**2.0 + mean**2.0)**0.5)
         ...:     sigma = (math.log(1.0 + sd**2.0/mean**2.0))**0.5
         ...:     return (mu, sigma)
         ...:

    In [155]: mean = 110517.09180756475

    In [156]: sd = 2210.341836151173

Get the parameters of the underlying normal distribution:

    In [157]: mu, sigma = get_base_mu_and_sigma(mean, sd)

Create the scipy lognorm instance, and verify that the mean and standard deviation of the distribution match mean and sd:
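A sketch of that verification step (assuming loc=0 and the corrected scale=np.exp(mu); the outputs should reproduce the inputs up to floating-point error):

    ln = lognorm(s=sigma, loc=0, scale=np.exp(mu))
    print(ln.mean())  # expected to reproduce `mean` (~110517.09)
    print(ln.std())   # expected to reproduce `sd` (~2210.34)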