Question

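I am running a Monte Carlo simulation in which each repetition requires the sum or product of a random number of random variables. My question is how to do this efficiently, since the whole simulation should be vectorized as much as possible.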

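For example, suppose we want the sums of 5, 10 and 3 random numbers, represented by the vector len = [5;10;3]. What I currently do is draw a full matrix of random numbers: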

A = randn(length(len),max(len));

create a mask for the numbers that are not needed:

lenlen = repmat(len,1,max(len));
idx = repmat(1:max(len),length(len),1);
mask = idx>lenlen;

Then I can "pad" the matrix. Since I am interested in the sum, the padding must be zero (for a product it would have to be 1):

A(mask)=0;

which gives:

A =

1.7708   -1.4609   -1.5637   -0.0340    0.9796         0         0         0         0         0
1.8034   -1.5467    0.3938    0.8777    0.6813    1.0594   -0.3469    1.7472   -0.4697   -0.3635
1.5937   -0.1170    1.5629         0         0         0         0         0         0         0

Afterwards I can sum along each row:

B = sum(A,2);

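However, I find it wasteful to draw far too many random numbers and then throw them away. In the real case I need on the order of hundreds of thousands of repetitions, and the entries of len can vary a lot, so I can easily end up drawing two or three times as many random numbers as I need.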
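To put a number on the overhead described above, here is a minimal sketch comparing how many values the padded-matrix approach draws with how many are actually used. The values in len are made up for illustration and are not from the original question.

```matlab
% Hypothetical group sizes, chosen only to illustrate a widely varying len
len = [5; 10; 3; 200; 1];

drawn  = numel(len) * max(len);   % entries in the full padded matrix
needed = sum(len);                % entries that survive the mask

fprintf('drawn: %d, needed: %d, overdraw factor: %.2f\n', ...
        drawn, needed, drawn / needed);
```

A single large entry in len is enough to push the overdraw factor well above the two or three mentioned above.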

Answer 1

You can generate exactly as many random numbers as you need, create a grouping variable with repelem, and compute the sum of each group with accumarray:

len = [5; 10; 3];
B = accumarray(repelem(1:numel(len), len).', randn(sum(len),1));
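For the product case mentioned in the question, a sketch along the same lines: accumarray accepts a function handle as its fourth argument, so @prod can take the place of the default summation. The variable names g, x, S and P are illustrative and not part of the original answer.

```matlab
len = [5; 10; 3];

% Group index for every draw: 1 repeated len(1) times, 2 repeated len(2) times, ...
g = repelem(1:numel(len), len).';

% Exactly sum(len) draws, no padding needed
x = randn(sum(len), 1);

S = accumarray(g, x);              % per-group sums (default behaviour)
P = accumarray(g, x, [], @prod);   % per-group products

% Sanity check on the grouping: counting the members of each group recovers len
assert(isequal(accumarray(g, 1), len));
```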

Answer 2

You could do this using arrayfun or a loop. You say "efficient" and "vectorised" in the same breath, but they are not necessarily the same thing: thanks to the new(ish) JIT compiler, loops are quite fast in MATLAB. arrayfun is basically a loop in disguise, but it means you can create B like so:

len = [5;10;3];
B = arrayfun( @(x) sum( randn(x,1) ), len );

For each element in len, this creates a vector of length len(i) and takes the sum. The output is an array with one value for each element of len.

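This should certainly be more memory-friendly when len contains large values or values that differ widely. It may therefore be faster; your mileage may vary, but it cuts out many of the operations you are currently doing.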

You mention that you sometimes need the product; in that case use prod instead of sum.
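A sketch of the product variant, assuming the same arrayfun pattern as above. It also shows a convenient edge case: for a group of size zero, sum of an empty vector is 0 and prod of an empty vector is 1, which matches the padding values used in the question. The zero entry in len is added here purely for illustration.

```matlab
len = [5; 10; 3; 0];   % the trailing 0 is a made-up edge case

S = arrayfun(@(n) sum(randn(n, 1)),  len);   % sums of len(i) draws
P = arrayfun(@(n) prod(randn(n, 1)), len);   % products of len(i) draws

% Empty groups fall back to the identity element of each operation
assert(S(end) == 0);
assert(P(end) == 1);
```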

Edit: a rough-and-ready benchmark to compare arrayfun and a loop...

len = randi([1e3, 1e7], 100, 1);
tic;
B = arrayfun( @(x) sum( randn(x,1) ), len );
toc % ~8.77 seconds
tic;
out = zeros(size(len));
for ii = 1:numel(len)
    out(ii) = sum(randn(len(ii),1));
end
toc % ~8.80 seconds

The "advantage" of the loop is that you can pre-generate all the random numbers in one go and then index into them. This is not actually quicker, because you are handling a larger chunk of memory, and the call to randn is the main bottleneck anyway!

tic;
out = zeros(size(len));
rnd = randn(sum(len),1);
idx = [0; cumsum(len)]; % note: cumsum is very quick (~0.001sec here) so negligible
for ii = 1:numel(len)
    out(ii) = sum(rnd(idx(ii)+1:idx(ii+1)),1);
end
toc % ~10.2 sec! Slower because of massive call to randn and the indexing into large array.

As stated at the top, arrayfun and a loop are basically the same under the hood, so there is no reason to expect much of a time difference.
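A single tic/toc run is noisy. As a hedged alternative, the comparison between one randn call per group and one big pre-generated vector can be repeated with timeit, which calls the function several times and returns the median of the measurements. The sizes below are deliberately much smaller than in the benchmark above so that the sketch finishes quickly; the names perGroup, sliceSums and preGen are introduced here, and timings will differ by machine and MATLAB release.

```matlab
% Smaller sizes than the benchmark above, so this runs quickly
len = randi([1e3, 1e5], 100, 1);

% Variant 1: one randn call per group
perGroup = @() arrayfun(@(n) sum(randn(n, 1)), len);

% Variant 2: one big randn call, then sum the slices rnd(a:b)
sliceSums = @(rnd, idx) arrayfun(@(a, b) sum(rnd(a:b)), ...
                                 idx(1:end-1) + 1, idx(2:end));
preGen = @() sliceSums(randn(sum(len), 1), [0; cumsum(len)]);

tPerGroup = timeit(perGroup);
tPreGen   = timeit(preGen);
fprintf('per-group randn: %.4f s, pre-generated: %.4f s\n', tPerGroup, tPreGen);
```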

Answer 3

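The sum of several random numbers drawn from a given distribution is itself a random number with a (different) known distribution. So you can cut out the middleman and draw directly from that latter distribution.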

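In your case, you are drawing 5, 10 and 3 numbers from an N(0,1) distribution. As explained here, the resulting distributions are N(0,5), N(0,10) and N(0,3), where the second parameter is the variance. This page explains how to draw from a non-standard normal distribution in MATLAB. So in this case we can generate these numbers with randn(3,1).*sqrt([5; 10; 3]).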

If you need 1000 triplets, you can use

randn(3,1000).*sqrt([5; 10; 3])

or, before MATLAB R2016b,

bsxfun(@times, randn(3,1000), sqrt([5; 10; 3]))

which is of course very fast.
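As a quick empirical check of this shortcut (a sketch, not part of the original answer), the sample variance of explicitly summed draws can be compared with that of the directly scaled draws. Both should come out close to [5; 10; 3]. The repetition count nRep and the variable names explicit and direct are assumptions made for this example.

```matlab
len  = [5; 10; 3];
nRep = 1e5;                       % illustrative number of repetitions

% Explicit sums: for each group, add up len(i) standard normal draws per repetition
explicit = zeros(numel(len), nRep);
for i = 1:numel(len)
    explicit(i, :) = sum(randn(len(i), nRep), 1);
end

% Shortcut: one N(0,1) draw per sum, scaled by sqrt(len(i))
direct = bsxfun(@times, randn(numel(len), nRep), sqrt(len));

% Row-wise sample variances; both columns should be close to len
disp([var(explicit, 0, 2), var(direct, 0, 2)]);
```

Note that the shortcut uses one draw per sum, while the explicit version uses sum(len) draws per repetition.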

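Different distributions have different summation rules, but as long as you are not summing numbers drawn from different distributions, the rules are usually quite simple and can be found quickly via Google.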

Answer 4

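You can do this with a combination of cumsum and diff. The plan is:

Generate all the random numbers up front in a single call to randn
Then use cumsum to build a vector of cumulative sums
Use cumsum on the list of sample counts per result to work out where to read the results from
We also need diff to subtract the prior sums.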

Note that this approach may lose accuracy if you are not using randn for your random samples, because cumsum introduces arithmetic rounding errors.

% We want 100 sums of random numbers
numSamples = 100;
% Here's where we define how many random samples contribute to each sum
numRandsPerSample = randi(5, 1, numSamples);

% Let's make all the random numbers in one call
allRands = randn(1, sum(numRandsPerSample));

% Use CUMSUM to build up a cumulative sum of the whole of allRands. We also
% need a leading 0 for the first sum.
allRandsCS = [0, cumsum(allRands)];
% Use CUMSUM again to pick out the places we need to pick from 
% allRandsCS
endIdxs = 1 + [0, cumsum(numRandsPerSample)];
% Use DIFF to subtract the prior sums from the result.
result = diff(allRandsCS(endIdxs))
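To get a feel for the rounding issue mentioned above, the cumsum/diff result can be compared with per-group sums computed directly on the same numbers. This is a sketch that repeats the setup from the code above; groupIdx, reference and maxErr are names introduced here, and using accumarray as the reference is a choice made for this example.

```matlab
numSamples = 100;
numRandsPerSample = randi(5, 1, numSamples);
allRands = randn(1, sum(numRandsPerSample));

% cumsum/diff approach, as in the answer
allRandsCS = [0, cumsum(allRands)];
endIdxs = 1 + [0, cumsum(numRandsPerSample)];
result = diff(allRandsCS(endIdxs));

% Reference: sum each group directly, without a running total
groupIdx = repelem(1:numSamples, numRandsPerSample);
reference = accumarray(groupIdx.', allRands.').';

% Largest discrepancy between the two approaches
maxErr = max(abs(result - reference));
fprintf('max abs difference: %g\n', maxErr);
```

For zero-mean randn input of this size the difference should stay at the level of floating-point rounding; it would be worth rerunning the check with your own distribution and sample counts.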

Drawing a non-full matrix of random numbers
