Bootstrapping with np.random.choice

Asked: 2016-12-05 20:32:51

Tags: python numpy bootstrapping

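I am trying to use bootstrapping to resample the Son column with replacement over 1000 replications (np.random.choice), so that I can compute the mean of each replication. I then want to compare the standard deviation of those means with the standard error of the mean (SEM).

However, I am not getting the bootstrap part right. How can I fix it?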

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns
from scipy import stats

df = pd.read_csv('http://www.math.uah.edu/stat/data/Pearson.txt',
                 delim_whitespace=True)
df.head()
y = df['Son'].values

Replications = np.random.choice(y, 1000, replace = True)
print("Replications: " , Replications)
print("")
Mean = np.mean(Replications)

print("Mean: " , Mean)

sem = stats.sem(y)
print ("The SEM : ", sem)

1 answer:

Answer 0 (score: 2)

You can create 1000 replications, each of length len(df), as follows:

Replications = np.array([np.random.choice(df.Son, len(df), replace = True) for _ in range(1000)])
Mean = np.mean(Replications, axis=1)
print("Mean: " , Mean)
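
To finish the comparison the question asks about, the standard deviation of the bootstrap means can be set next to the analytic SEM. The following is only a minimal sketch under stated assumptions: `y` is synthetic, normally distributed data with arbitrary illustrative parameters standing in for `df['Son'].values`, the names `rng`, `n_reps`, `boot_means` and `boot_se` are introduced here for illustration, and the SEM is computed by hand as `std(ddof=1) / sqrt(n)`, which is what `stats.sem` returns with its default settings.

```python
import numpy as np

# Assumption: synthetic stand-in for df['Son'].values, not the real data.
rng = np.random.default_rng(0)
y = rng.normal(loc=68.7, scale=2.8, size=1078)

n_reps = 1000

# Each row is one bootstrap replicate: len(y) draws from y with replacement.
replications = rng.choice(y, size=(n_reps, len(y)), replace=True)

# One mean per replicate (row).
boot_means = replications.mean(axis=1)

# Bootstrap estimate of the standard error: spread of the replicate means.
boot_se = boot_means.std(ddof=1)

# Analytic standard error of the mean, same formula as stats.sem(y).
sem = y.std(ddof=1) / np.sqrt(len(y))

print("Bootstrap SE of the mean:", boot_se)
print("Analytic SEM:", sem)
```

The two numbers should be close but not identical, because the bootstrap value depends on the random draws. The key difference from the code in the question is that each replicate is a full resample of length len(y), and a mean is taken per replicate, rather than taking one mean over a single draw of 1000 values.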

Thanks!