Question

Matlab / Octave algorithm example:

 input vector: [ 1 0 2 0 7 7 7 0 5 0 0 0 9 ]
output vector: [ 1 1 2 2 7 7 7 7 5 5 5 5 9 ]

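The algorithm is very simple: it walks through the vector and replaces every zero with the last non-zero value. This seems trivial, and it is when you use a slow for (i = 1:length) loop and can refer to the previous element (i-1), but it looks impossible to express in fast vectorized form. I tried merge() and shift(), but that only works for the first occurrence of a zero, not for an arbitrary number of consecutive zeros.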

Can this be done in vectorized form in Octave / Matlab, or must C be used to get sufficient performance on large amounts of data?

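I have another similar slow for-loop algorithm to speed up, and in general it seems nearly impossible to refer to previous values in vectorized form, the way SQL lag(), group by, or a loop's (i-1) can easily do. But Octave / Matlab loops are terribly slow.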

Has anyone found a solution to this general problem, or is it futile for fundamental Octave / Matlab design reasons?

Performance benchmark:

Solution 1 (slow loop)

in = repmat([ 1 0 2 0 7 7 7 0 5 0 0 0 9 ] ,1 ,100000);
out = in;
tic
for i=2:length(out) 
   if (out(i)==0) 
      out(i)=out(i-1);
   end
end
toc
[in(1:20); out(1:20)] % test to show side by side if ok

Elapsed time is 15.047 seconds.

Solution 2 by Dan (~80x faster)

V = repmat([ 1 0 2 0 7 7 7 0 5 0 0 0 9 ] ,1 ,100000);
in = V; % separate assignments; chained "in = V = ..." only parses in Octave
tic;
d = double(diff([0,V])>0);
d(find(d(2:end))+1) = find(diff([0,~V])==-1) - find(diff([0,~V])==1);
out = V(cumsum(~~V+d)-1);
toc;
[in(1:20); out(1:20)] % shows it works ok

Elapsed time is 0.188167 seconds.

15.047 / 0.188167 = 79.97x improvement

Solution 3 by GameOfThrows (~115x faster)

in = repmat([ 1 0 2 0 7 7 7 0 5 0 0 0 9 ] ,1 ,100000);
a = in;
tic;
pada = [a,888];
b = pada(pada >0);
bb = b(:,1:end-1);
c = find (pada==0);
d = find(pada>0);
len = d(2:end) - (d(1:end-1));
t = accumarray(cumsum([1,len])',1);
out = bb(cumsum(t(1:end-1)));
toc;

Elapsed time is 0.130558 seconds.

15.047 / 0.130558 = 115.25x improvement

Magic solution 4 by Luis Mendo (~250x faster)

in = repmat([ 1 0 2 0 7 7 7 0 5 0 0 0 9 ] , 1, 100000);
tic;
u = nonzeros(in);
out = u(cumsum(in~=0)).';
toc;

Elapsed time is 0.0597501 seconds.

15.047 / 0.0597501 = 251.83x improvement

(Update 2019/03/13) Timings with MATLAB R2017a:

Slow loop:    0.010862 seconds.
Dan:          0.072561 seconds.
GameOfThrows: 0.066282 seconds.
Luis Mendo:   0.032257 seconds.
fillmissing:  0.053366 seconds.

So once again we reach the same conclusion: loops in MATLAB are no longer slow!

See also: Trivial/impossible algorithm challenge in Octave/Matlab Part II: iterations memory

Answer 1

The following simple approach does what you want, and is probably very fast:

in = [1 0 2 0 7 7 7 0 5 0 0 0 9];
t = cumsum(in~=0);
u = nonzeros(in);
out = u(t).';
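
To see why the indexing works, it can help to look at the intermediate values. The sketch below repeats the approach above with the intermediates written out as comments, then shows one possible way to tolerate leading zeros, a case the code above does not cover (t would start at 0, which is not a valid index). The placeholder value 0 for positions before the first non-zero, and the names in2, t2, u2 and out2, are assumptions for illustration and not part of the original answer.

```matlab
in = [1 0 2 0 7 7 7 0 5 0 0 0 9];
t = cumsum(in ~= 0);   % [1 1 2 2 3 4 5 5 6 6 6 6 7]: index of the last non-zero seen
u = nonzeros(in);      % [1; 2; 7; 7; 7; 5; 9]: the non-zero values, as a column
out = u(t).';          % [1 1 2 2 7 7 7 7 5 5 5 5 9]

% Assumed extension: keep leading zeros as 0 instead of raising an index error.
in2 = [0 0 1 0 2 0 7];
t2 = cumsum(in2 ~= 0);      % [0 0 1 1 2 2 3]: 0 before the first non-zero
u2 = [0; nonzeros(in2)];    % prepend a placeholder for positions where t2 == 0
out2 = u2(t2 + 1).';        % shift indices by one to account for the placeholder
```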

Answer 2

I think it is possible. Let's start from the basics: you want to capture where the numbers are greater than 0:

 a = [ 1 0 2 0 7 7 7 0 5 0 0 0 9 ] %// Load in vector
 pada = [a,888];  %// Pad a with an arbitrary non-zero number at the end, in case the vector ends with a 0
 b = pada(pada > 0); %// Values that are bigger than 0 (logical indexing, no find needed)
 bb = b(:,1:end-1);     %// Same values without the padding element
 c = find(pada==0);   %// Indices where numbers are 0
 d = find(pada>0);     %// Indices where numbers are greater than 0
 len = d(2:end) - (d(1:end-1));  %// Number of repeats for each value (itself plus its trailing zeros); named len so the built-in length is not shadowed
 %//R = [cell2mat(arrayfun(@(x,nx) repmat(x,1,nx), bb, len,'uniformoutput',0))]; %// Repeat the value

 %// ----------EDIT---------
 %// Accumarray and cumsum method, although not as nice as Dan's 1 liner
 t = accumarray(cumsum([1,len])',1);
 R = bb(cumsum(t(1:end-1)));

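Note: I used arrayfun, but you can use accumarray as well. I think this shows that it is possible to do this in parallel?

R =

Columns 1 through 10

 1     1     2     2     7     7     7     7     5     5

Columns 11 through 13

 5     5     9

Test: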

a = [ 1 0 2 0 7 7 7 0 5 0 0 0 9 0 0 0 ]

R =

Columns 1 through 10

 1     1     2     2     7     7     7     7     5     5

Columns 11 through 16

 5     5     9     9     9     9

Performance:

a = repmat([ 1 0 2 0 7 7 7 0 5 0 0 0 9 ] ,1,10000); %// 130,000 doubles
Arrayfun Method : Elapsed time is 6.840973 seconds.
AccumArray Method : Elapsed time is 2.097432 seconds.

Answer 3

I think this is a vectorized solution. It works for your example:

V = [1 0 2 0 7 7 7 0 5 0 0 0 9]
%// This is where the numbers you will repeat lie. You have to cast to a double otherwise later when you try assign numbers to it it caps them at logical 1s
d = double(diff([0,V])>0)
%// find(diff([0,~V])==-1) - find(diff([0,~V])==1) is the length of each zero cluster
d(find(d(2:end))+1) = find(diff([0,~V])==-1) - find(diff([0,~V])==1)
%// ~~V is the same as V ~= 0
V(cumsum(~~V+d)-1)

Answer 4

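Here is another solution, using interpolation with previous-neighbor lookup (interp1 with the 'previous' method).

I think it is fairly fast as well, since there are only lookups and indexing and no calculations: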

in = [1 0 2 0 7 7 7 0 5 0 0 0 9]
mask = logical(in);
idx = 1:numel(in);
in(~mask) = interp1(idx(mask),in(mask),idx(~mask),'previous');
%// out = in

Explanation

You need to create an index vector:

idx = 1:numel(in)  %// = 1 2 3 4 5 ...

And a logical mask that marks all non-zero values:

mask = logical(in);

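This gives you the grid points idx(mask) and the grid data in(mask) for the interpolation. The query points idx(~mask) are the indices of the zero data. The query data in(~mask) is then computed by previous-neighbor interpolation, so it basically looks up the value of the previous grid point in the grid. Exactly what you want. Unfortunately, the functions involved carry a large overhead to handle every conceivable case, which is why this is still slower than Luis Mendo's answer, even though no arithmetic is involved.

Furthermore, the overhead of interp1 can be reduced a little: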

F = griddedInterpolant(idx(mask),in(mask),'previous');
in(~mask) = F(idx(~mask));

But the effect is not too big.

in =   %// = out

     1     1     2     2     7     7     7     7     5     5     5     5     9

Benchmark

0.699347403200000 %// thewaywewalk
1.329058123200000 %// GameOfThrows
0.408333643200000 %// LuisMendo
1.585014923200000 %// Dan

Code

function [t] = bench()
    in = repmat([ 1 0 2 0 7 7 7 0 5 0 0 0 9 ] ,1 ,100000);

    % functions to compare
    fcns = {
        @() thewaywewalk(in);
        @() GameOfThrows(in);
        @() LuisMendo(in);
        @() Dan(in);
    }; 

    % timeit
    t = zeros(4,1);
    for ii = 1:10;
        t = t + cellfun(@timeit, fcns);
    end
    format long
end

function in = thewaywewalk(in) 
    mask = logical(in);
    idx = 1:numel(in);
    in(~mask) = interp1(idx(mask),in(mask),idx(~mask),'previous');
end
function out = GameOfThrows(a) 
    pada = [a,888];
    b = pada(find(pada >0));
    bb = b(:,1:end-1);
    c = find (pada==0);
    d = find(pada>0);
    len = d(2:end) - (d(1:end-1)); % named len so the built-in length is not shadowed
    t = accumarray(cumsum([1,len])',1);
    out = bb(cumsum(t(1:end-1)));
end
function out = LuisMendo(in) 
    t = cumsum(in~=0);
    u = nonzeros(in);
    out = u(t).';
end
function out = Dan(V) 
    d = double(diff([0,V])>0);
    d(find(d(2:end))+1) = find(diff([0,~V])==-1) - find(diff([0,~V])==1);
    out = V(cumsum(~~V+d)-1);
end
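
Before comparing timings, it is worth confirming that the candidates actually agree. The following is a minimal sketch, meant to be run as a separate script, that checks the cumsum/nonzeros and interp1 variants against the plain loop on the benchmark input (shortened here). The variable names ref, out_cumsum, out_interp and ok are illustrative, and the check only covers inputs that start and end with a non-zero value, as the benchmark data does.

```matlab
in = repmat([1 0 2 0 7 7 7 0 5 0 0 0 9], 1, 1000);

% Reference: plain loop carrying the last non-zero value forward.
ref = in;
for i = 2:numel(ref)
    if ref(i) == 0
        ref(i) = ref(i-1);
    end
end

% cumsum/nonzeros variant (Luis Mendo's approach).
u = nonzeros(in);
out_cumsum = u(cumsum(in ~= 0)).';

% interp1 variant with previous-neighbor lookup (thewaywewalk's approach).
mask = logical(in);
idx = 1:numel(in);
out_interp = in;
out_interp(~mask) = interp1(idx(mask), in(mask), idx(~mask), 'previous');

% True only if both vectorized results match the loop element by element.
ok = isequal(ref, out_cumsum) && isequal(ref, out_interp);
```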

Answer 5

New in MATLAB R2016b: fillmissing, which does exactly what the question describes:

in = [ 1 0 2 0 7 7 7 0 5 0 0 0 9 ];
in(in==0) = NaN;
out = fillmissing(in,'previous');

[I discovered this new function in this duplicate question.]

Answer 6

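Vector operations generally assume that the individual items are independent. If you depend on previous items, a loop is the best way to do it.

Some extra background on MATLAB: vectorized operations are generally faster not because of anything specific to vector operations, but because a vector operation simply runs its loop in native C++ code rather than through the interpreter.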
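
The dependence on previous items can be illustrated with a small sketch. A single vectorized shift (a one-step lag, named prev here for illustration) only sees the original neighbor, so it fills a zero that directly follows a non-zero value but leaves the rest of a longer run of zeros untouched. This is an illustration of the point above under that assumption, not a recommended method.

```matlab
in = [1 0 2 0 7 7 7 0 5 0 0 0 9];

prev = [in(1), in(1:end-1)];   % one-step lag: each element's original left neighbor
out = in;
zero_pos = (in == 0);
out(zero_pos) = prev(zero_pos);  % each zero takes its neighbor's ORIGINAL value

% Result: [1 1 2 2 7 7 7 7 5 5 0 0 9]
% The run of three zeros after the 5 is only partly filled, because the second
% and third zeros would need the already-updated value of their neighbor.
```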
