Sample vector of points from a truncated exponential distribution with a specific mean

Time: 2018-11-06 11:12:16

Tags: python random scipy statistics distribution

I have created a truncated exponential distribution:

from scipy.stats import truncexpon
truncexp = truncexpon(b = 8)

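Now I want to sample 8 points from this distribution such that their mean is approximately 4. What is the best way to do this without a huge loop that keeps sampling randomly until the mean is close enough?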

2 Answers:

Answer 0: (score: 0)

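The mean is a property of your distribution. As you keep sampling values, the empirical mean gets closer and closer to the analytic mean.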

Scipy can tell you the mean of the truncated exponential:

b = 8
truncexp = truncexpon(b)
truncexp.mean() # 0.99731539839326999

You can sample from the distribution and compute the empirical mean:

import numpy as np

num_samples = 100000
np.mean(truncexp.rvs(num_samples)) # e.g. 0.99465816346645264 (varies from run to run)
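
To make the convergence concrete, here is a minimal sketch that draws samples of increasing size and compares each empirical mean with the analytic one. The sample sizes, the seed, and the helper name `empirical_means` are arbitrary choices for illustration, not part of the original question.

```python
import numpy as np
from scipy.stats import truncexpon

truncexp = truncexpon(b=8)
rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily for repeatability


def empirical_means(sizes, rng):
    """Return {sample size: empirical mean} for one draw of each size."""
    return {n: float(np.mean(truncexp.rvs(size=n, random_state=rng)))
            for n in sizes}


means = empirical_means([8, 100, 10_000, 1_000_000], rng)
for n, m in means.items():
    print(f"n = {n:>9d}: empirical mean = {m:.4f}, "
          f"analytic mean = {truncexp.mean():.4f}")
```

With only 8 points the empirical mean can still be well off the analytic value; the gap shrinks roughly like 1/sqrt(n) as the sample grows.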

The formula for the mean is (second line):

b = np.linspace(0.1, 20, 100)
m = (1 - (b + 1)*np.exp(-b)) / (1 - np.exp(-b))

If you plot this, you can see how the mean behaves for different values of b.

As b -> inf, the mean approaches 1 (with the default loc=0 and scale=1), so you will not find a b for which the mean is 4.
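
As a sanity check on that claim, the sketch below compares the closed-form expression with `truncexpon.mean` over the same grid of b values and confirms that the mean grows with b but stays below 1. The helper name `truncexpon_mean` is made up for this example.

```python
import numpy as np
from scipy.stats import truncexpon


def truncexpon_mean(b):
    """Closed-form mean of truncexpon(b) with loc=0 and scale=1."""
    b = np.asarray(b, dtype=float)
    return (1 - (b + 1) * np.exp(-b)) / (1 - np.exp(-b))


b = np.linspace(0.1, 20, 100)
closed_form = truncexpon_mean(b)

# Evaluate scipy's mean one shape value at a time and compare.
from_scipy = np.array([truncexpon.mean(bi) for bi in b])

print("max abs difference:", np.max(np.abs(closed_form - from_scipy)))
print("largest mean on the grid:", closed_form.max())
```

The largest mean on the grid is just under 1, which is why no choice of b alone can give a mean of 4.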

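If you want to sample from a truncated exponential with mean 4, you can simply scale the samples. This does not give you samples from the original distribution, but then again, the original distribution has an expected value of about 1, so its samples will not average 4 except by extreme chance.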

truncexp.rvs(num_samples) * 4 / truncexp.mean()
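
Multiplying the samples by a constant is the same as setting the `scale` parameter of `truncexpon`, and it also stretches the support. The sketch below, with an arbitrary seed and variable names chosen for illustration, shows that the scaled distribution has mean 4 and that its upper bound moves from 8 to roughly 32.

```python
import numpy as np
from scipy.stats import truncexpon

b = 8
target_mean = 4.0

base = truncexpon(b)
factor = target_mean / base.mean()  # the multiplier applied to the samples

# Same distribution expressed through the scale parameter.
scaled = truncexpon(b, scale=factor)
upper = b * factor  # upper end of the support after scaling

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
samples = base.rvs(size=100_000, random_state=rng) * factor

print("analytic mean of scaled distribution:", scaled.mean())
print("upper end of support:", upper)
print("empirical mean of scaled samples:", samples.mean())
```

If the points must stay inside the original interval [0, 8], plain scaling is therefore not enough; the scale has to be chosen together with the shape.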

Answer 1: (score: 0)

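The truncexpon distribution has three parameters: the shape b, the location loc, and the scale scale. The support of the distribution is [x1, x2], where x1 = loc and x2 = shape*scale + loc. Solving the latter equation for shape gives shape = (x2 - x1)/scale. We will choose the scale parameter so that the mean of the distribution is 4. To do that, we can apply scipy.optimize.fsolve to a function of scale that is zero when truncexpon.mean((x2 - x1)/scale, loc, scale) is 4.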

Here is a short script to demonstrate:

import numpy as np
from scipy.optimize import fsolve
from scipy.stats import truncexpon


def func(scale, desired_mean, x1, x2):
    return truncexpon.mean((x2 - x1)/scale, loc=x1, scale=scale) - desired_mean


x1 = 1
x2 = 9

desired_mean = 4.0

# Numerically solve for the scale parameter of the truncexpon distribution
# with support [x1, x2] for which the expected mean is desired_mean.
scale_guess = 2.0
scale = fsolve(func, scale_guess, args=(desired_mean, x1, x2))[0]

# This is the shape parameter of the desired truncexpon distribution.
shape = (x2 - x1)/scale

print("Expected mean of the distribution is %6.3f" %
      (truncexpon.mean(shape, loc=x1, scale=scale),))
print("Expected standard deviation of the distribution is %6.3f" %
      (truncexpon.std(shape, loc=x1, scale=scale),))

# Generate a sample of size 8, and compute its mean.
sample = truncexpon.rvs(shape, loc=x1, scale=scale, size=8)
print("Mean of the sample of size %d is %6.3f" %
      (len(sample), sample.mean(),))

bigsample = truncexpon.rvs(shape, loc=x1, scale=scale, size=100000)
print("Mean of the sample of size %d is %6.3f" %
      (len(bigsample), bigsample.mean(),))

Typical output:

Expected mean of the distribution is  4.000
Expected standard deviation of the distribution is  2.178
Mean of the sample of size 8 is  4.694
Mean of the sample of size 100000 is  4.002
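
The size-8 sample mean of 4.694 in that output is not unusual. The mean of n independent draws has standard deviation std/sqrt(n), which for n = 8 is about 2.178/2.83, or roughly 0.77. The following sketch repeats the fit and estimates how often a sample of 8 lands within 0.25 of the target; the tolerance of 0.25, the seed, and the number of repetitions are arbitrary assumptions for illustration.

```python
import numpy as np
from scipy.optimize import fsolve
from scipy.stats import truncexpon

x1, x2, desired_mean, n = 1.0, 9.0, 4.0, 8


def func(scale):
    # Zero when the distribution on [x1, x2] has the desired mean.
    return truncexpon.mean((x2 - x1) / scale, loc=x1, scale=scale) - desired_mean


scale = fsolve(func, 2.0)[0]
shape = (x2 - x1) / scale
dist = truncexpon(shape, loc=x1, scale=scale)

# Theoretical standard deviation of the mean of n independent draws.
std_error = dist.std() / np.sqrt(n)

# Simulate many samples of size n and look at the spread of their means.
rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
sample_means = dist.rvs(size=(20_000, n), random_state=rng).mean(axis=1)
within = np.mean(np.abs(sample_means - desired_mean) < 0.25)

print(f"standard error of the mean for n = {n}: {std_error:.3f}")
print(f"simulated spread of the sample means: {sample_means.std():.3f}")
print(f"fraction of samples with mean within 0.25 of target: {within:.3f}")
```

So fixing the distribution's mean at 4 only centers the sample means on 4; any single sample of 8 points will typically miss by a noticeable amount.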