Calculating a confidence interval - Bootstrap

Time: 2021-06-07 14:39:16

Tags: python-3.x scipy stat

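I am trying to calculate a confidence interval for a list of 1000 numbers and return it as a tuple of two values. However, instead of a tuple of two numbers, I get a tuple of two arrays, each containing 1000 interval bounds. Here is my code: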

import random
import numpy as np
from scipy import stats

def bootstrap(data):
    # Build 1,000 resamples, each drawn with replacement and the same size as the data (16 numbers here)
    randomize = [[random.choice(data) for _ in data] for _ in range(1000)]
    # Take the mean of each resample, giving one list of 1,000 means
    means = [np.mean(sublist) for sublist in randomize]
    # Then I tried to create two variables, each a single number bounding the interval
    ci_left, ci_right = tuple(stats.t.interval(0.95, df=len(means) - 1, loc=means))
    return (ci_left, ci_right)

But my output looks like this:

(array([-1.33077651, -1.30684806, -1.35418851, -1.32454884, -1.31485041,
   -1.28670879, -1.32344893, -1.38127905, -1.35198733, -1.33957749]),array([2.59390641, 2.61783486, 2.57049441, 2.60013409, 2.60983251,
   2.63797414, 2.60123399, 2.54340387, 2.57269559, 2.58510543,
   2.58198925, 2.56551404, 2.57899741, 2.59180679, 2.56566707,]))

An example of the output I want:

(0.607898431, 0.611159753)

Thanks for any kind of help!

1 answer:

Answer 0: (score: 0)

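The problem was that I passed the means variable itself as loc instead of averaging the means (summing them and dividing by len). I also needed to add a scale. Here is the answer: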

import random
import numpy as np
from scipy import stats

def bootstrap(data):
    # Build 1,000 resamples, each drawn with replacement and the same size as the data (16 numbers here)
    randomize = [[random.choice(data) for _ in data] for _ in range(1000)]
    # Take the mean of each resample, giving one list of 1,000 means
    means = [np.mean(sublist) for sublist in randomize]
    # loc is the average of the means and scale is their standard error, so both are single numbers
    ci_left, ci_right = stats.t.interval(0.95, df=len(means) - 1, loc=sum(means) / len(means), scale=stats.sem(means))
    return (ci_left, ci_right)
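
To illustrate why the original call returned arrays, here is a small sketch. The first function compares an array-valued loc with a scalar loc: SciPy broadcasts the distribution parameters, so a sequence passed as loc yields one interval per element. The second function is not part of the answer above. It shows the percentile bootstrap, a different and commonly used way to turn resampled means into an interval, included only for comparison. The function names, the seed parameter, and the sample values are assumptions made for this example.

```python
import numpy as np
from scipy import stats


def t_interval_shapes(means, confidence=0.95):
    """Return the t interval computed with an array loc and with a scalar loc."""
    means = np.asarray(means, dtype=float)
    df = len(means) - 1
    # loc is an array: the call broadcasts and returns one interval per element.
    per_element = stats.t.interval(confidence, df=df, loc=means)
    # loc and scale are scalars: a single (low, high) pair comes back.
    single = stats.t.interval(
        confidence, df=df, loc=means.mean(), scale=stats.sem(means)
    )
    return per_element, single


def percentile_interval(data, n_resamples=1000, confidence=0.95, seed=0):
    """Percentile bootstrap interval for the mean (a swapped-in technique)."""
    rng = np.random.default_rng(seed)  # hypothetical seed, for repeatability
    data = np.asarray(data, dtype=float)
    # Draw n_resamples rows with replacement, each the same size as the data.
    resamples = rng.choice(data, size=(n_resamples, data.size), replace=True)
    means = resamples.mean(axis=1)
    # Cut off equal tails of the distribution of resampled means.
    alpha = (1.0 - confidence) / 2.0
    low, high = np.percentile(means, [100 * alpha, 100 * (1 - alpha)])
    return float(low), float(high)


if __name__ == "__main__":
    sample = [0.2, 0.5, 0.7, 0.9, 1.1, 1.4, 0.3, 0.8]  # made-up values
    per_element, single = t_interval_shapes(sample)
    print(np.shape(per_element[0]))  # one lower bound per input element
    print(single)                    # one pair of numbers
    print(percentile_interval(sample))
```

Note that the two approaches need not agree numerically: the t interval in the answer is built around the standard error of the bootstrap means, while the percentile version reads the bounds directly from their empirical distribution.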