Creating an A/B test sample size calculator from Evan Miller's Simple Sequential A/B Testing blog post

Time: 2018-07-16 05:54:56

Tags: python testing automated-tests rates

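To learn more about choosing sample sizes for A/B tests, I tried to recreate the sample size calculator from Evan Miller's popular blog post (https://www.evanmiller.org/sequential-ab-testing.html). However, there seems to be an error that keeps me from reproducing the sample sizes given in the article. What would you suggest I recheck to find the problem?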

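The error must be in my calculation or in my reading of the problem. Solving the constraint equations gives a very small sample size, about 6. The article suggests that the critical value of the test statistic, which indicates whether the treatment has a higher conversion rate than the control, is a function of the sample size. It then lists two inequalities to solve for a single variable N, the sample size. How can I reproduce the sample size calculation?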

Most of the problem goes away if the test statistic cutoff d_star is not a function of N.

d_star = z * Sqrt(N), where N is the sample size and z is the normal z-value

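However, in the table that appears in the second half of the article, d_star = z * Sqrt(N) relates N and d_star very precisely, which indicates that d_star does vary with N.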
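As a quick check of the relation d_star = z * Sqrt(N), the cutoff can be computed directly for a given N. This is only a minimal sketch, assuming Python 3.8+ and using the standard library's `statistics.NormalDist` in place of `scipy.stats.norm.isf`. The helper name `d_star_for` is hypothetical, and N = 2922 is the value that the code comments further down attribute to the article.

```python
import math
from statistics import NormalDist


def d_star_for(n_total, alpha):
    """Cutoff d_star = ceil(z * sqrt(N)); z is the upper alpha/2 normal quantile.

    Hypothetical helper for illustration, mirroring the rounding used in the
    question's own code (math.ceil of z * sqrt(N)).
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for alpha = 0.05
    return int(math.ceil(z * math.sqrt(n_total)))


print(d_star_for(2922, 0.05))  # → 106
```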

Given alpha and beta constraints: Sum R(p = 1/(1 + delta)) > 1 - Beta, Sum R(p = 1/2) < alpha
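The two sums can be sketched with the standard library alone. This sketch assumes that R is the first-passage probability of a +1/-1 random walk, (d/n) * C(n, (n+d)/2) * p^((n+d)/2) * (1-p)^((n-d)/2), which is the form the Python code below evaluates with `scipy.stats.binom.pmf`. The names `binom_pmf`, `first_passage_sum` and `p_up` are hypothetical. The value d = 106 is ceil(1.96 * Sqrt(2922)), and the up-step probability (1 + delta)/(2 + delta) follows the `1.0 - 1/(2 + script_delta)` expression in that code rather than the article itself.

```python
import math


def binom_pmf(k, n, p):
    """Binomial pmf evaluated in log space so that large n does not underflow."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)


def first_passage_sum(n_total, d, p_up):
    """P(walk with up-probability p_up first reaches +d within n_total steps).

    Requires d >= 1 and 0 < p_up < 1.
    """
    total = 0.0
    # The walk can only be at +d after n steps if n >= d and n has the parity of d,
    # so the loop starts at d and moves in steps of 2.
    for n in range(d, n_total + 1, 2):
        total += (float(d) / n) * binom_pmf((n + d) // 2, n, p_up)
    return total


delta = 0.10  # lift used in the question
alpha_sum = first_passage_sum(2922, 106, 0.5)
power_sum = first_passage_sum(2922, 106, (1 + delta) / (2 + delta))
print("sum at p = 1/2:                    ", alpha_sum)
print("sum at p = (1 + delta)/(2 + delta):", power_sum)
```

The first sum is the one compared against alpha and the second against 1 - beta, where 1 - beta is read as the target power.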

I am attaching my Python 2.7 code and a plot of each constraint equation.

#### Begin Python Code to Calculate Sample Size ####
import random
import scipy.stats
import math
import sys
print(sys.version)

# Functions and helper functions to 
# calculate the sample size.
def calcStatPowerSum(N, script_delta):
    z = scipy.stats.norm.isf(alpha/2.0) #1.96 for alpha = 0.05
    d_star = z*(N**0.5)
    d_star = int(math.ceil(d_star))
    statPowerSum = 0
    for i in range(1, N+1):
        statPowerSum += (float(d_star)/i)*scipy.stats.binom.pmf((i+d_star)//2, i, 1.0-float(1)/(2+script_delta))
        # p and the (1-p) terms are reversed relative to the binomial distribution

    return statPowerSum

def calcCritValueSum(N, script_delta):
    z = scipy.stats.norm.isf(alpha/2.0) #1.96 for alpha = 0.05
    d_star = z*(N**0.5)
    d_star = int(math.ceil(d_star))
    critValueSum = 0
    for i in range(d_star, N+1, 2):
        critValueSum += (float(d_star)/i)*scipy.stats.binom.pmf((i+d_star)//2, i, 0.5)

    return critValueSum

def determineSampleSize(alpha, beta, script_delta):
    z = scipy.stats.norm.isf(alpha/2.0) #1.96 for alpha = 0.05
    d=1
    N=int(math.ceil(z*z))-1
    statPowerSum = 0
    critValueSum = 1
    while (statPowerSum <= 1 - beta or critValueSum >= alpha) and N<3000:
        d+=1
        N=int(math.floor(d*d/z/z))
        statPowerSum = calcStatPowerSum(N,  script_delta)
        critValueSum = calcCritValueSum(N, script_delta)

    return N


alpha = 0.05
beta = 0.8
lift = script_delta = 0.10
sampleSize = determineSampleSize(alpha, beta, script_delta)
print("beta:   ", beta, ", alpha:   ", alpha, ", sampleSize:   ", sampleSize)

##  The article suggests that N=2922 satisfies the constraint equations.
##  But the calculation suggests otherwise.
print("calcCritValueSum: ", calcCritValueSum(2922, script_delta))
print("statPowerSum:     ", calcStatPowerSum(2922, script_delta))

#### End Python Code ####

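The following Mathematica code reproduces the constraint plots as a function of d_star. Copying and pasting the code into https://sandbox.open.wolframcloud.com/ will generate these plots. Generating several plots at once will exceed the free usage limit.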

(*   alpha constraint Plot   *)
z=1.96    (*alpha=0.05*)
Table[Sum[(d/n)*PDF[BinomialDistribution[Floor[d^2/z^2], 1/2], (d+n)/2], {n, d, Floor[d^2/z^2], 2}],{d,1,130}]
ListLinePlot[%, PlotRange->All, AxesLabel->{"d_star","prob"},PlotLabel->"alpha Constraint Plot"]


(*   1-beta constraint Plot   *)
delta=0.10
z=1.96    (*alpha=0.05*)
Table[Sum[(d/n)*PDF[BinomialDistribution[Floor[d^2/z^2], (1+delta)/(2+delta)], (d+n)/2], {n, d, Floor[d^2/z^2], 2}],{d,1,130}]
ListLinePlot[%, PlotRange->All, AxesLabel->{"d_star","prob"},PlotLabel->"1 - beta Constraint Plot"]
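For comparison, the alpha-side curve can also be tabulated in Python without the Wolfram sandbox. This is a rough sketch rather than a translation: the names `binom_pmf` and `constraint_sum` are hypothetical, and, unlike the Mathematica expression above, each term uses a binomial with n trials (as the Python code earlier does) instead of Floor[d^2/z^2] trials.

```python
import math


def binom_pmf(k, n, p):
    """Binomial pmf evaluated in log space so that large n does not underflow."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)


def constraint_sum(d, z, p_up):
    """Sum of (d/n) * pmf((n+d)/2; n, p_up) for n = d, d+2, ..., floor(d^2/z^2)."""
    n_total = int(math.floor(d * d / (z * z)))
    total = 0.0
    for n in range(d, n_total + 1, 2):  # empty when floor(d^2/z^2) < d
        total += (float(d) / n) * binom_pmf((n + d) // 2, n, p_up)
    return total


z = 1.96  # alpha = 0.05, as in the Mathematica code
alpha_curve = [constraint_sum(d, z, 0.5) for d in range(1, 131)]

# Print a few points of the curve; pass the list to a plotting library to draw it.
for d in (10, 50, 106, 130):
    print(d, alpha_curve[d - 1])
```

Calling `constraint_sum(d, z, (1 + delta) / (2 + delta))` with delta = 0.10 gives the corresponding curve for the 1 - beta constraint.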

0 answers:

No answers.