Question

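I have a 1x2 cell A in Matlab. A{i} is a 30494866x1 cell for i=1,2. A{i}(j) is a 1x21 char for i=1,2 and j=1,...,30494866.

For example, here is A{2}(1:3):

'116374117927631468606' '112188647432305746617' '116374117927631468606'

I want to count the number of times each 1x21 char is repeated. For example, considering only A{2}(1:3), I would like to get

'116374117927631468606'  2
'112188647432305746617'  1

What I am doing at the moment is very slow, however (it has been running since yesterday). Do you have any suggestions on how to speed up the code?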

Answer 1

Since you want to know how many times each 21-character string occurs:

1) sort the cell
2) count how many times each string is used in a for loop.

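Your code is O(n^2), which is why it is slow. Sorting is O(n log n) and the counting pass is linear, so this should take less than a minute.

Based on your code: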

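B = sort(A{2});            % sorting puts identical strings next to each other
U = unique(B);             % unique already returns sorted output, no extra sort needed
C = zeros(numel(U), 1);    % C(k) will hold the count of U{k}
cnt = 1;
for j = 1:numel(B)
    if ~strcmp(U{cnt}, B{j})
        cnt = cnt + 1;     % B{j} starts the next run, which matches the next entry of U
    end
    C(cnt) = C(cnt) + 1;
end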
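
The same sort-then-count idea can also be written without an explicit loop. The following is only a sketch, not part of the original answer: it uses the three sample strings from the question as stand-in data, and the variable names isNew and starts are made up for illustration. After sorting, a new run begins wherever a string differs from its predecessor, and the distance between consecutive run starts is the count.

```matlab
% Sample data from the question (stand-in for A{2})
data = {'116374117927631468606'
        '112188647432305746617'
        '116374117927631468606'};

B = sort(data);                                  % identical strings become adjacent

% isNew(k) is true when B{k} differs from B{k-1}; the first element always starts a run
isNew  = [true; ~strcmp(B(1:end-1), B(2:end))];
starts = find(isNew);                            % index of the first element of each run

U = B(starts);                                   % one representative per run
C = diff([starts; numel(B) + 1]);                % run lengths = occurrence counts
```

Here U and C play the same roles as in the loop version: C(k) is the number of times U{k} occurs.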

Answer 2

You can use the standard unique/accumarray combination, like this:

data = {'116374117927631468606'
         '112188647432305746617'
         '116374117927631468606'};
[uu, ~,  ww] = unique(data, 'stable');
count = accumarray(ww, 1);
result = [uu, num2cell(count)];

Or, to use less memory:

data = {'116374117927631468606'
         '112188647432305746617'
         '116374117927631468606'};
[~, vv,  ww] = unique(data, 'stable');
count = accumarray(ww, 1);
result = [data(vv) num2cell(count)];
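
To apply this to the 1x2 cell A from the question, the same two calls can be run once per cell. This is a minimal sketch under assumptions: the tiny A below is a made-up stand-in for the real data, and results is a hypothetical name for the output container.

```matlab
% Small stand-in for the real 1x2 cell A (made-up values, for illustration only)
A = { {'aaa'; 'bbb'; 'bbb'; 'bbb'}, ...
      {'116374117927631468606'
       '112188647432305746617'
       '116374117927631468606'} };

results = cell(size(A));                  % results{i} will hold the table for A{i}
for i = 1:numel(A)
    % ww maps every string in A{i} to the index of its unique value
    [uu, ~, ww] = unique(A{i}, 'stable');
    count = accumarray(ww, 1);            % number of occurrences of each unique string
    results{i} = [uu, num2cell(count)];   % column 1: string, column 2: count
end
```

With 'stable', the rows of each results{i} follow the order of first appearance in A{i}; dropping that option returns them in sorted order instead.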

Counting frequency in a cell of char in Matlab: fast code?
