Question

I am confused about the expected output of the following simulation in Matlab:

n_m=1000; 
n_w=1000; 

%Construct a vector eta by drawing n_m*n_w+n_m realisations from N(0,1) and then add n_w zeros
eta=randn(n_m*n_w+n_m,1); %(n_m*n_w+n_m)x1
eta=[eta; zeros(n_w,1)]; %(n_m*n_w+n_m+n_w)x1 

%Construct a vector zeta by drawing n_m*n_w+n_w realisations from N(0,1) and then add n_m zeros
zeta=randn(n_m*n_w+n_w,1); %(n_w*n_m+n_w)x1
zeta=[zeta(1:n_m*n_w); zeros(n_m,1); zeta(n_m*n_w+1:end)]; %(n_m*n_w+n_m+n_w)x1 

%Compare the empirical cdf of eta+zeta, of a N(0,1), of a N(0,2)
etazeta=eta+zeta;
benchmark1=randn(10000,1); %N(0,1)
benchmark2=normrnd(0,sqrt(2),10000,1); %N(0,2)
ecdf(etazeta)
hold on
ecdf(benchmark1)
hold on
ecdf(benchmark2)
legend('my', 'N(0,1)', 'N(0,2)')

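We can see that the empirical cdf of etazeta is close to the empirical cdf of N(0,2).

I am confused about why this is the case: since I added a bunch of zeros to the vectors eta and zeta, how can eta+zeta look like N(0,2), as if the zeros were not there?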

Answer 1

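From a mathematical point of view, the sum Y of two independent normally distributed variables Z and X is defined as follows (see the linked Wikipedia article for details): if X ~ N(mu_X, sigma_X^2) and Z ~ N(mu_Z, sigma_Z^2), then Y = X + Z ~ N(mu_X + mu_Z, sigma_X^2 + sigma_Z^2).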

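I changed the n_m and n_w settings of the script to 10 to simplify the demonstration, but the outcome is basically the same. Let's first look at how the eta and zeta vectors generated by the code are composed: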

eta  = 110 x N(0,1) + 10 x ZEROS
zeta = 100 x N(0,1) + 10 x ZEROS + 10 x N(0,1)

To make this composition clearer, let's split it into parts of matching size:

eta  = 100 x N(0,1) + 10 x N(0,1) + 10 x ZEROS
zeta = 100 x N(0,1) + 10 x ZEROS  + 10 x N(0,1)

Since eta and zeta are added together element by element, the result of the sum is:

eta     = 100 x N(0,1)            + 10 x N(0,1) + 10 x ZEROS
zeta    = 100 x N(0,1)            + 10 x ZEROS  + 10 x N(0,1)

etazeta = 100 x [N(0,1) + N(0,1)] + 10 x N(0,1) + 10 x N(0,1)
etazeta = 100 x N(0,2)            + 20 x N(0,1)

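In etazeta, we basically have:

100 values with mean 0 and variance 2 (because the first 100 values of both vectors are normally distributed with mean 0 and variance 1, so N(0,1) + N(0,1) = N(0+0,1+1) = N(0,2))
20 values with mean 0 and variance 1 (in the last parts of eta and zeta, the zero padding of one vector lines up with N(0,1) draws of the other)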
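The decomposition above can be checked directly. The following is a minimal sketch, assuming n_m = n_w = 10 as in this answer; the mask and counter names (eta_is_zero, n_sum_of_two, and so on) are illustrative and not part of the original script. It counts how many entries of etazeta are a sum of two draws, a single draw, or an exact zero.

```matlab
% Small sizes so the block structure is easy to inspect
% (assumption: 10 and 10, as in the walkthrough above).
n_m = 10;
n_w = 10;
n_both = n_m * n_w;   % entries where both eta and zeta hold N(0,1) draws

% Same construction as in the question.
eta  = [randn(n_both + n_m, 1); zeros(n_w, 1)];
zeta = randn(n_both + n_w, 1);
zeta = [zeta(1:n_both); zeros(n_m, 1); zeta(n_both + 1:end)];

% Logical masks marking which entries of each vector are padding zeros.
eta_is_zero  = (eta == 0);
zeta_is_zero = (zeta == 0);

n_sum_of_two = sum(~eta_is_zero & ~zeta_is_zero);    % N(0,1) + N(0,1)
n_single     = sum(xor(eta_is_zero, zeta_is_zero));  % N(0,1) + 0
n_both_zero  = sum(eta_is_zero & zeta_is_zero);      % 0 + 0

fprintf('sum of two draws: %d, single draw: %d, exact zeros: %d\n', ...
        n_sum_of_two, n_single, n_both_zero);
```

The counts should come out as 100, 20 and 0 (a randn draw that is exactly zero is vanishingly unlikely), which shows that the zero blocks never overlap: every zero in one vector is paired with a N(0,1) draw in the other, so no zero survives in etazeta.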

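If we compute the average mean and variance of etazeta, we have:

AVG(mean) = (100 * 0 + 20 * 0) / 120 = 0
AVG(variance) = (100 * 2 + 20 * 1) / 120 = 1.83

So etazeta is closer to a N(0,2) variable than to a N(0,1) variable (but still different from both). The more you increase n_m and n_w, the more etazeta resembles a N(0,2) variable, because the n_m*n_w entries with variance 2 increasingly outnumber the n_m+n_w entries with variance 1.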
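To see how quickly this happens, the same weighted average can be written for general sizes. This is a rough sketch, not part of the original script: pooled_var is a hypothetical helper name, and the last part simply reruns the construction from the question with n_m = n_w = 1000 to compare against the sample variance.

```matlab
% Weighted average of the variances in etazeta: n_m*n_w entries with
% variance 2 and n_m + n_w entries with variance 1 (all with mean 0).
% pooled_var is a hypothetical helper, introduced here for illustration.
pooled_var = @(n_m, n_w) (2 * n_m * n_w + (n_m + n_w)) / (n_m * n_w + n_m + n_w);

fprintf('n_m = n_w = 10:   %.4f\n', pooled_var(10, 10));
fprintf('n_m = n_w = 100:  %.4f\n', pooled_var(100, 100));
fprintf('n_m = n_w = 1000: %.4f\n', pooled_var(1000, 1000));

% Empirical check with the sizes used in the question.
n_m = 1000;
n_w = 1000;
eta  = [randn(n_m * n_w + n_m, 1); zeros(n_w, 1)];
zeta = randn(n_m * n_w + n_w, 1);
zeta = [zeta(1:n_m * n_w); zeros(n_m, 1); zeta(n_m * n_w + 1:end)];
etazeta = eta + zeta;

fprintf('sample variance of etazeta: %.4f\n', var(etazeta));
```

The formula gives roughly 1.83, 1.98 and 1.998 for sizes 10, 100 and 1000. With the question's sizes, only 2000 of the 1002000 entries have variance 1, which is why the empirical cdf is visually indistinguishable from that of N(0,2).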

Therefore, nothing is wrong with either Matlab's computations or your code.
