Fitting a normal approximation to a sampling distribution

Asked: 2018-08-07 17:44:40

Tags: python probability probability-theory probability-distribution bernoulli-probability

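I am trying to simulate the "sampling distribution of a sample proportion" in Python. I tried it with a Bernoulli variable, as in the example here.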

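The setup: in a large population of gumballs, the true proportion of yellow balls is 0.6. If we repeatedly draw samples of a fixed size (say 10), take the mean of each sample, and plot those means, we should get an approximately normal distribution.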
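The setup above can be sketched with the standard library alone. This is a minimal illustration, not the code from the question: the `population` list, the seed, and the variable names are assumptions made for the example. It checks that the sample proportions centre on p with a standard deviation close to sqrt(p(1-p)/n).

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

p = 0.6            # true proportion of yellow balls
n = 10             # sample size
n_samples = 2000   # number of repeated samples

# Hypothetical population: 1 = yellow, 0 = any other colour
population = [1] * 6000 + [0] * 4000

# One sample proportion per repeated sample (drawn with replacement)
sample_props = [
    sum(random.choices(population, k=n)) / n
    for _ in range(n_samples)
]

emp_mean = statistics.mean(sample_props)
emp_sd = statistics.pstdev(sample_props)
theo_sd = (p * (1 - p) / n) ** 0.5  # standard error of a sample proportion

print(f"empirical mean {emp_mean:.3f} vs p = {p}")
print(f"empirical sd   {emp_sd:.3f} vs sqrt(p(1-p)/n) = {theo_sd:.3f}")
```

With n = 10 the normal shape is only a rough approximation, since the sample proportion can take just eleven distinct values.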

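I managed to get a normal-looking sampling distribution. However, the continuous normal curve with the same mu and sigma does not fit it at all: it appears scaled up by some factor. I am not sure what causes this; ideally the curve should fit closely. My code and output are below. I tried changing the amplitude and sigma (dividing by sqrt(samplesize)), but nothing helped. Please help.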

Code:

from SDSP import create_bernoulli_population, get_frequency_df
from random import shuffle, choices
from bi_to_nor_demo import get_metrics, bare_minimal_plot
import matplotlib.pyplot as plt


N = 10000  # 10000 balls
p = 0.6    # probability of yellow ball is 0.6, and others (1-0.6)=>0.4
n_pickups = 10       # sample size
n_experiments = 2000  # number of repeated samples (I don't know what this is called)


# STATISTICAL PDF
# choose sample, take mean and add to X_mean_list. Do this for n_experiments times. 
X_hat = []
X_mean_list = []
for each_experiment in range(n_experiments):
    X_hat = choices(population, k=n_pickups)  # choose, say 10 samples from population (with replacement)
    X_mean = sum(X_hat)/len(X_hat)
    X_mean_list.append(X_mean)
stats_df = get_frequency_df(X_mean_list)


# plot both theoretical and statistical outcomes
fig, ax = plt.subplots(1,1, figsize=(5,5))
from SDSP import plot_pdf
mu,var,sigma = get_metrics(stats_df)
plot_pdf(stats_df, ax, n_pickups, mu, sigma, p=mu, bar_width=round(0.5/n_pickups,3),
         title='Sampling Distribution of\n a Sample Proportion')
plt.tight_layout()
plt.show()

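Output:
The red curve is the ill-fitting normal approximation. mu and sigma are computed from the empirical discrete distribution (the small blue bars) and fed into the formula for the normal curve. But the normal curve looks scaled up somehow.
[output image]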
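One possible explanation, offered as a sketch under assumptions: the source of `plot_pdf` is not shown, so this assumes the blue bars are relative frequencies (heights summing to 1) while the red curve is the raw normal density (area integrating to 1). The possible sample proportions are spaced 1/n apart, so a density value has to be multiplied by that spacing before it is comparable to a bar height. The helper functions below are hypothetical and use the exact binomial probabilities in place of the simulated bars.

```python
import math

p, n = 0.6, 10
mu = p
sigma = math.sqrt(p * (1 - p) / n)
width = 1 / n  # spacing between possible sample proportions


def normal_pdf(x, mu, sigma):
    """Normal density at x (a density, not a probability)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def binom_pmf(k, n, p):
    """Exact probability of k yellow balls in a sample of size n."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


rows = []
for k in range(n + 1):
    x = k / n                      # sample proportion
    mass = binom_pmf(k, n, p)      # bar height as a probability
    density = normal_pdf(x, mu, sigma)
    rows.append((x, mass, density, density * width))

for x, mass, density, scaled in rows:
    print(f"{x:.1f}  mass={mass:.3f}  density={density:.3f}  density*width={scaled:.3f}")
```

Under these assumptions the unscaled density is roughly n times taller than the bars, which would match a curve that looks "scaled up". Multiplying the curve by the bar spacing, or plotting the bars as densities (relative frequency divided by spacing), puts both on the same scale.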

Update
Skipping the division when taking the mean fixes the plot, but mu is now scaled. So the problem is still not fully solved. :(

X_mean = sum(X_hat) # removed the division /len(X_hat)

Output after removing the division above (but is removing it necessary?)
[output image]
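A hedged sketch of why dropping the division changes the scale of mu: without it, each value is a count of yellow balls rather than a proportion, so the mean becomes n*p and the standard deviation becomes sqrt(n*p*(1-p)). Counts are spaced 1 apart, so a normal density and a relative-frequency bar happen to be on the same scale there. The `population` list and variable names are assumptions for illustration, not the question's helpers.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

p, n, n_samples = 0.6, 10, 2000
population = [1] * 6000 + [0] * 4000  # hypothetical population, 1 = yellow

# Counts of yellow balls per sample (no division), and the matching proportions
counts = [sum(random.choices(population, k=n)) for _ in range(n_samples)]
props = [c / n for c in counts]

count_mean, count_sd = statistics.mean(counts), statistics.pstdev(counts)
prop_mean, prop_sd = statistics.mean(props), statistics.pstdev(props)

print(f"counts:      mean={count_mean:.3f}  sd={count_sd:.3f}")
print(f"proportions: mean={prop_mean:.3f}  sd={prop_sd:.3f}")
print(f"counts / n:  mean={count_mean / n:.3f}  sd={count_sd / n:.3f}")
```

Dividing the count-scale mean and standard deviation by n recovers the proportion-scale values, so the two views differ only by a change of units.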

0 answers