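I am trying to use MCMCpack to obtain the posterior distribution of the difference between two conversion rates, similar to the "A and B Together" section of this PyMC tutorial.
I can get the posteriors for the two sampled rates just fine, but I am struggling with how to sample the delta. Any ideas?
Edit: the true delta (which would be unknown if we had not fabricated the data and wanted to estimate it with MCMC) is the difference between the two rates true_p_a and true_p_b, i.e. 0.01.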
# define true success rates
true_p_a = 0.05
true_p_b = 0.04
# set sample sizes
n_samples_a = 1000
n_samples_b = 1000
# fabricate some data
set.seed(10);
obs_a = rbinom(n=n_samples_a, size=1, prob=true_p_a)
set.seed(1);
obs_b = rbinom(n=n_samples_b, size=1, prob=true_p_b)
# what are the observed conversion rates?
mean(obs_a) #0.056
mean(obs_b) #0.042
# convert to number of successes
successes_a = sum(obs_a) #56
successes_b = sum(obs_b) #42
# calculate the posterior
require(MCMCpack)
simulations = 20000
posterior_a = MCbinomialbeta(successes_a ,n_samples_a, alpha=1, beta=1,mc=simulations)
posterior_b = MCbinomialbeta(successes_b ,n_samples_b, alpha=1, beta=1,mc=simulations)
posterior_delta = ????
posterior_density_a = density(posterior_a)
posterior_density_b = density(posterior_b)
# plot the posteriors
require(ggplot2)
require(scales) # provides percent_format() used below
ggplot() +
geom_area(aes(posterior_density_a$x, posterior_density_a$y), fill="#7ad2f6", alpha=.5) +
geom_vline(aes(xintercept=.05), color="#7ad2f6", linetype="dotted", size=2) +
geom_area(aes(posterior_density_b$x, posterior_density_b$y), fill="#014d64", alpha=.5) +
geom_vline(aes(xintercept=.04), color="#014d64", linetype="dotted", size=2) +
scale_x_continuous(labels=percent_format(), breaks=seq(0,0.1, 0.01))
Answer 0 (score: 2)
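You are only struggling because you have not fully adopted the Bayesian mindset yet. That is perfectly fine; I had many of the same conceptual problems when I started. (This question is old, so you have probably figured it out by now.)
The Bayesian posterior density contains all the available information about the distribution of the model parameters. So, to compute any function of the model's parameters, you simply evaluate that function on each draw from the posterior distribution. You do not need to worry about standard errors and asymptotic inference, because you already have all the information you need.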
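As a minimal sketch of "evaluate the function on each draw": with a Beta(1, 1) prior and binomial data, the conjugate posterior for each rate is Beta(1 + successes, 1 + failures), so the draws can be imitated with base R's `rbeta`. The names `draws_a`, `draws_b`, `delta_draws` and `lift_draws` are illustrative, the seed is arbitrary, and the relative lift is an extra example that is not part of the question.

```r
# Stand-in posterior draws, assuming a Beta(1, 1) prior and the
# success counts from the question (56 and 42 out of 1000).
set.seed(42)  # arbitrary seed, only for reproducibility
n_draws <- 20000
draws_a <- rbeta(n_draws, 1 + 56, 1 + 1000 - 56)
draws_b <- rbeta(n_draws, 1 + 42, 1 + 1000 - 42)

# Any function of the parameters is computed draw by draw.
delta_draws <- draws_a - draws_b      # absolute difference in rates
lift_draws  <- draws_a / draws_b - 1  # relative lift (illustrative extra)

# Empirical summaries of the derived posterior samples.
mean(delta_draws)
sd(delta_draws)
median(lift_draws)
```

The same element-wise arithmetic should apply to the objects returned by `MCbinomialbeta`, since both hold the same number of draws.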
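In this case, because the true difference between the parameters is a constant and you have a lot of data, there is little uncertainty about the delta. The estimated posterior mean is 0.014, with an SD (not an SE) of .009.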
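These figures can be cross-checked without simulation. The sketch below assumes the Beta(1, 1) prior from the question, so each posterior is Beta(1 + successes, 1 + failures), and it assumes the two posteriors are independent, so the variances add. The helper names `beta_mean` and `beta_var` are hypothetical, not part of MCMCpack.

```r
# Closed-form mean and variance of a Beta(a, b) distribution.
beta_mean <- function(a, b) a / (a + b)
beta_var  <- function(a, b) a * b / ((a + b)^2 * (a + b + 1))

# Posterior shape parameters under a Beta(1, 1) prior.
a_A <- 1 + 56; b_A <- 1 + 1000 - 56
a_B <- 1 + 42; b_B <- 1 + 1000 - 42

# Mean of the difference is the difference of the means.
delta_mean <- beta_mean(a_A, b_A) - beta_mean(a_B, b_B)

# With independent posteriors, the variance of the difference
# is the sum of the two variances.
delta_sd <- sqrt(beta_var(a_A, b_A) + beta_var(a_B, b_B))

round(c(mean = delta_mean, sd = delta_sd), 4)
```

The simulated summaries should agree with these values up to Monte Carlo error.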
Here is your code with the analysis completed:
# define true success rates
true_p_a = 0.05
true_p_b = 0.04
# set sample sizes
n_samples_a = 1000
n_samples_b = 1000
# fabricate some data
set.seed(10);
obs_a = rbinom(n=n_samples_a, size=1, prob=true_p_a)
set.seed(1);
obs_b = rbinom(n=n_samples_b, size=1, prob=true_p_b)
# what are the observed conversion rates?
mean(obs_a) #0.056
mean(obs_b) #0.042
# convert to number of successes
successes_a = sum(obs_a) #56
successes_b = sum(obs_b) #42
# calculate the posterior
require(MCMCpack)
simulations = 20000
posterior_a = MCbinomialbeta(successes_a ,n_samples_a, alpha=1, beta=1,mc=simulations)
posterior_b = MCbinomialbeta(successes_b ,n_samples_b, alpha=1, beta=1,mc=simulations)
# Subtract the posterior draws element-wise, then look at empirical summaries and plot the empirical density
posterior_delta = posterior_a - posterior_b
summary(posterior_delta)
require(ggplot2)
ggplot(data.frame(delta=as.numeric(posterior_delta)),aes(x=delta)) + geom_density() + theme_minimal()
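Once the delta draws exist, interval estimates and tail probabilities come from the same sample. The following is only a sketch: `summarise_delta` is a hypothetical helper, and the demo uses `rbeta` stand-in draws (assuming the Beta(1, 1) prior and the counts above) so that it runs without MCMCpack.

```r
# Hypothetical helper: equal-tailed credible interval and the
# posterior probability that the difference is positive.
summarise_delta <- function(delta, level = 0.95) {
  tail_prob <- (1 - level) / 2
  list(
    interval      = quantile(delta, c(tail_prob, 1 - tail_prob)),
    prob_positive = mean(delta > 0)  # share of draws with A above B
  )
}

# Stand-in draws; with the answer's code you could instead pass
# as.numeric(posterior_delta).
set.seed(123)  # arbitrary seed
demo_delta <- rbeta(20000, 57, 945) - rbeta(20000, 43, 959)
res <- summarise_delta(demo_delta)
res$interval
res$prob_positive
```

If the interval includes zero, the data do not clearly separate the two rates, even when most of the draws favour A.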