Question

In MATLAB we have the following scenario:

v =[1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 .... N N N N];

The elements of v are always in increasing order from 1 to N, and we know the value of N. We want to count the number of 1s, 2s, ... in v.

Of course, we could use a loop like this:

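for i = 1 : N
    % method A: linear indices of the elements equal to i
    tic
    ind = find(v == i);
    % ---> do something with ind
    t1 = toc;

    % method B: logical indexing (returns the matching values, not their positions)
    tic
    ind = v(v == i);
    % ---> do something with ind
    t2 = toc;

    % method C: logical mask of the elements equal to i
    tic
    ind = ismember(v, i);
    % ---> do something with ind
    t3 = toc;
end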

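The times for these methods are roughly t1 = 0.02 s, t2 = 0.02 s, and t3 = 0.03 s. In my real work N is very large, and the whole loop takes 2-3 hours!

Do you have any ideas for reducing the time this process takes? Any suggestion is appreciated.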

Answer 1

Specific case: sorted input, counts only

If you only need the counts, here are a few approaches.

Approach #1:

accumarray(v(:),1)

Approach #2:

diff([0 find([diff(v) 1])])

Approach #3:

histc(v,1:max(v))

For performance, I would bet on diff first, then accumarray, with histc last.
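As a quick sanity check, the three approaches can be run side by side on a small sorted vector. This is only a sketch: the data and the variable names c1, c2, c3, and run_ends are made up for illustration and are not from the question.

```matlab
% Small sorted example (hypothetical data, not from the question)
v = [1 1 1 2 2 3 3 3 3];

% Approach #1: accumulate a 1 into bin v(k) for every element
c1 = accumarray(v(:), 1).';          % transpose to a row for comparison

% Approach #2: [diff(v) 1] is nonzero at the last element of each run,
% so find(...) gives the run end positions and diff gives the run lengths
run_ends = find([diff(v) 1]);        % [3 5 9]
c2 = diff([0 run_ends]);

% Approach #3: histogram with one bin per integer value
c3 = histc(v, 1:max(v));

assert(isequal(c1, c2, c3));         % all three agree: [3 2 4]
```

Note that the diff approach relies on v being sorted and on every value from 1 to N being present; accumarray and histc need neither. To compare speed on your own data, each expression can be wrapped in timeit, for example timeit(@() diff([0 find([diff(v) 1])])).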

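Generic case: unsorted input, counts & indices

For the general case, where the input vector v is not sorted and you may also need the indices belonging to each group of identical numbers, here is an approach that stores the indices in a cell array -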

[~,sort_idx] = sort(v);
sorted_v = v(sort_idx);
counts = diff([0 find([diff(sorted_v) 1])])
indices_per_grp = mat2cell(sort_idx,1,counts);

Sample run -

v =
     2     1     3     3     2     4     1     2     1     1     4     3     4     3
counts =
     4     3     4     3
indices_per_grp{1} =
     2     7     9    10
indices_per_grp{2} =
     1     5     8
indices_per_grp{3} =
     3     4    12    14
indices_per_grp{4} =
     6    11    13
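Once the indices are in a cell array, the original loop no longer needs find or ismember on each pass. The following is a minimal sketch using the sample data above; the assert merely stands in for the "do something with ind" step, and it assumes every value from 1 to N occurs in v, so that group i corresponds to value i.

```matlab
% Sample data from the run above
v = [2 1 3 3 2 4 1 2 1 1 4 3 4 3];

[~, sort_idx] = sort(v);
sorted_v = v(sort_idx);
counts = diff([0 find([diff(sorted_v) 1])]);
indices_per_grp = mat2cell(sort_idx, 1, counts);

% Replacement for the original loop: no find/ismember inside the loop.
% Assumes every value 1..N occurs in v, so that group i belongs to value i.
for i = 1:numel(indices_per_grp)
    ind = indices_per_grp{i};        % positions in v where v == i
    assert(all(v(ind) == i));        % placeholder for "do something with ind"
end
```

The sort is done once, so the per-iteration cost is just a cell lookup rather than a scan over all of v.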

Answer 2

I'm partial to bsxfun here:

counts = sum(bsxfun(@eq,v(:),1:max(v)));
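To show what that one-liner does, here is a small sketch on made-up data. The explicit dimension argument to sum and the implicit-expansion variant (available in R2016b and newer) are additions of mine, not part of the answer.

```matlab
v = [1 1 2 3 3 3];                   % hypothetical example data

% bsxfun compares the column v(:) against the row 1:max(v), giving a
% numel(v)-by-max(v) logical matrix; summing down the columns yields counts.
counts = sum(bsxfun(@eq, v(:), 1:max(v)), 1);

% Same comparison via implicit expansion (R2016b or newer)
counts2 = sum(v(:) == 1:max(v), 1);

assert(isequal(counts, counts2, [2 1 3]));
```

Passing 1 as the dimension keeps the result correct even when v has a single element. Keep in mind that the intermediate matrix has numel(v)*max(v) entries, so this may use a lot of memory when N is large.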

Answer 3

Since they are already sorted, you can make this more efficient!

How about

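lastfound = 1;
for i = 1 : N
   % find first location at or after the current position where value is i
   % (find returns an offset into the slice, so convert it to an index into v;
   % this assumes every i from 1 to N occurs in v)
   indStart = lastfound - 1 + find(v(lastfound:end) == i, 1);
   % find first location, after this, where value is not i
   indEnd   = indStart - 1 + find(v(indStart:end) ~= i, 1);
   if isempty(indEnd), indEnd = numel(v) + 1; end % i runs to the end of v
   % now you have the range indStart:indEnd-1
   % ...

   lastfound = indEnd; % start next one after the end of the current value
end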

I.e., just search forward from the last item found.

find(..., 1) returns only the first match, I believe.
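A variation on the same idea, sketched here under the assumption that v is sorted and contains every value from 1 to N: the run boundaries can be computed once with the diff trick from Answer 1, so the loop body needs no find at all. The names run_starts and run_ends are hypothetical.

```matlab
v = [1 1 1 2 2 3 3 3 3];             % hypothetical sorted data
N = max(v);

run_ends   = find([diff(v) 1]);      % last index of each run
run_starts = [1, run_ends(1:end-1) + 1];  % first index of each run

for i = 1:N
    ind = run_starts(i):run_ends(i); % all positions where v == i
    assert(all(v(ind) == i));        % placeholder for the real work
end
```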

Efficiently looping over a vector in MATLAB
