Question

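I have N Bernoulli variables, X1, ..., XN, with Xi ~ B(1, pi), where pi is known for each Xi, and Y = X1 + ... + XN. Now I need to get the distribution of Y.

If Xi and Xj are independent when i != j, then I can use simulation:

1. Generate `X1`, ..., `XN` from their distributions, and then compute the value of `Y`;
2. Repeat step 1 10000 times to get `Y1`, ..., `Y10000`, which gives me the distribution of `Y`.

But now Xi and Xj are dependent, so I also need to account for the correlation. Assume corr(Xi, Xj) = 0.2 when i != j. How can I build that correlation into the simulation? Or is there another way to get the distribution of Y?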
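
For reference, the independent-case simulation in steps 1 and 2 could be sketched in Ruby roughly as follows. The helper name `simulate_independent_sum` and the example probabilities are made up for illustration; this sketch assumes independence and does not handle the correlated case.

```ruby
# Hypothetical helper: estimate the distribution of Y = X1 + ... + XN
# by simulation, ASSUMING the Xi are independent.
def simulate_independent_sum(ps, reps: 10_000, rng: Random.new)
  counts = Array.new(ps.size + 1, 0)
  reps.times do
    # Step 1: draw each Xi ~ B(1, pi) and sum the draws to get Y.
    y = ps.count { |p| rng.rand < p }
    counts[y] += 1
  end
  # Step 2: relative frequencies of Y = 0, 1, ..., N over all repetitions.
  counts.map { |c| c.to_f / reps }
end

ps = [0.2, 0.5, 0.8] # example values only, not taken from the question
dist = simulate_independent_sum(ps, rng: Random.new(42))
dist.each_with_index { |f, y| printf "P(Y = %d) ~ %.4f\n", y, f }
```

The returned array has N + 1 entries, one per possible value of `Y`.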

Thanks for your help and suggestions.

Answer 1

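You can generate specific pairwise correlations (within limits) by deriving the conditional distribution of one variable given another. The limitation is that you cannot have completely arbitrary p-values and correlations. However, the simultaneous constraints implied by the N-choose-2 set of pairwise correlations can be infeasible for arbitrary choices of N, p-values, and correlations.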
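
As a sketch of where the conditional probabilities for a single pair come from, write p_1 = P(X_1 = 1), p_2 = P(X_2 = 1) and ρ = corr(X_1, X_2). The notation here is chosen for illustration; the two resulting expressions are the ones computed in the Ruby code.

```latex
\begin{align*}
\operatorname{Cov}(X_1, X_2) &= \rho \sqrt{p_1 (1 - p_1)\, p_2 (1 - p_2)} \\
E[X_1 X_2] = P(X_1 = 1, X_2 = 1) &= p_1 p_2 + \operatorname{Cov}(X_1, X_2) \\
P(X_2 = 1 \mid X_1 = 1) = \frac{E[X_1 X_2]}{p_1}
  &= p_2 + \rho \sqrt{\frac{(1 - p_1)\, p_2 (1 - p_2)}{p_1}} \\
P(X_2 = 1 \mid X_1 = 0) = \frac{p_2 - E[X_1 X_2]}{1 - p_1}
  &= p_2 - \rho \sqrt{\frac{p_1\, p_2 (1 - p_2)}{1 - p_1}}
\end{align*}
```

A combination of p_1, p_2 and ρ is feasible for the pair only if both conditional probabilities lie in [0, 1].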

The following Ruby implementation shows the calculations for obtaining the specified p-values and correlation for a pair of X's:

# Control the run with command-line args.
# If no args provided, default to test case of
# p1 = 0.2, p2 = 0.8, rho = -0.5, sample size = 10
p1 = (ARGV.shift || 0.2).to_f
p2 = (ARGV.shift || 0.8).to_f
rho = (ARGV.shift || -0.5).to_f
n = (ARGV.shift || 10).to_i

# Calculate conditional probabilities P(X2 = 1 | X1 = x1), indexed by x1 = 0, 1
p2_given = [p2 - rho * Math::sqrt(p1 * p2 * (1.0 - p2) / (1.0 - p1)),
            p2 + rho * Math::sqrt((1.0 - p1) * p2 * (1.0 - p2) / p1)]

printf "p2_given_1 = %8.5f, p2_given_0 = %8.5f\n", p2_given[1], p2_given[0]

# Only proceed to actually generate values if the conditional
# probabilities are between zero and one
if p2_given.all? { |e| e >= 0 && e <= 1 }
  n.times do
    x1 = (rand <= p1) ? 1 : 0
    x2 = (rand <= p2_given[x1]) ? 1 : 0
    printf "%d,%d\n", x1, x2
  end
end
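
To connect the pair generator above back to the sum, here is a minimal sketch for the N = 2 case only. It tabulates Y = X1 + X2 both exactly and by simulation, using the same conditional-probability formulas. The method names `conditional_p2`, `exact_sum_dist` and `simulated_sum_dist` are hypothetical, and this does not address the general N-variable problem.

```ruby
# P(X2 = 1 | X1 = x1), indexed by x1 = 0, 1 (same formulas as above).
def conditional_p2(p1, p2, rho)
  [p2 - rho * Math.sqrt(p1 * p2 * (1.0 - p2) / (1.0 - p1)),
   p2 + rho * Math.sqrt((1.0 - p1) * p2 * (1.0 - p2) / p1)]
end

# Exact distribution of Y = X1 + X2 as [P(Y=0), P(Y=1), P(Y=2)].
def exact_sum_dist(p1, p2, rho)
  g = conditional_p2(p1, p2, rho)
  unless g.all? { |e| e.between?(0.0, 1.0) }
    raise ArgumentError, "infeasible combination of p1, p2, rho"
  end
  p_none = (1.0 - p1) * (1.0 - g[0]) # X1 = 0 and X2 = 0
  p_both = p1 * g[1]                 # X1 = 1 and X2 = 1
  [p_none, 1.0 - p_none - p_both, p_both]
end

# Simulated distribution of Y = X1 + X2 from correlated draws.
def simulated_sum_dist(p1, p2, rho, reps: 10_000, rng: Random.new)
  g = conditional_p2(p1, p2, rho)
  counts = [0, 0, 0]
  reps.times do
    x1 = (rng.rand < p1) ? 1 : 0
    x2 = (rng.rand < g[x1]) ? 1 : 0
    counts[x1 + x2] += 1
  end
  counts.map { |c| c.to_f / reps }
end

exact = exact_sum_dist(0.2, 0.8, -0.5)
approx = simulated_sum_dist(0.2, 0.8, -0.5, rng: Random.new(1))
3.times do |y|
  printf "P(Y = %d): exact %.4f, simulated %.4f\n", y, exact[y], approx[y]
end
```

For the default test case (p1 = 0.2, p2 = 0.8, rho = -0.5), the exact probabilities work out to 0.08, 0.84 and 0.08, and the simulated frequencies should be close to those.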

How to get the distribution of the sum of dependent Bernoulli variables