Comparing two cell arrays for identical rows - MATLAB

Date: 2014-11-04 11:22:32

Tags: performance matlab compare cells cell-array

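I have a cell array of strings with 40,000 rows and another one with 400 rows. I need to find the rows of the first array that match a row of the second. Note that there may be many duplicates.

The big array has 40,000 rows that look like this: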

Anna Frank  
Anna George  
Jane Peter  
Anna George  
Jane Peter    
etc.

and in it I need to find the rows that match these:

Anna George  
Jane Peter  

The only way I have found is two for loops and an if, but it is very slow:

for i=2:size(bigTable,1)
    for j = 1: size(smallTable,1)
        if sum(ismember(bigTable(i,1:2),smallTable(j,1:2))) == 2
            Total_R(size(Total_R,1)+1,1)= i;
        end
    end
end

1 answer:

Answer 0 (score: 3)

I am assuming your inputs are set up like this -

bigTable = 
    'Anna'    'Frank' 
    'Anna'    'George'
    'Jane'    'Peter' 
    'Anna'    'George'
    'Jane'    'Peter' 
smallTable = 
    'Anna'    'George'
    'Jane'    'Peter' 
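
For anyone who wants to reproduce this, the assumed inputs can be typed in as two-column cell arrays. This is only a sketch of the sample data shown above; the real tables in the question have 40,000 and 400 rows.

```matlab
% Sample inputs matching the display above: one row per person, two columns per row.
bigTable = {'Anna', 'Frank'; ...
            'Anna', 'George'; ...
            'Jane', 'Peter'; ...
            'Anna', 'George'; ...
            'Jane', 'Peter'};

smallTable = {'Anna', 'George'; ...
              'Jane', 'Peter'};
```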

Two approaches can be suggested to solve your case.

Approach #1

ismember based approach -

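%// The 'rows' flag is not supported for cell array inputs, so it is left out here.
%// ismember then tests each cell on its own, and a row is kept when both of its
%// cells occur somewhere in smallTable
Total_R = find(sum(ismember(bigTable,smallTable),2)==2)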
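
One thing to keep in mind with a cell-wise membership test like the one above is that it checks each name independently, so a row whose two names come from different rows of smallTable would also be flagged. The sketch below uses made-up data to show that case, along with a possible row-wise alternative that joins the two columns into a single key. The `'|'` delimiter and the variable names `keyBig`, `keySmall`, `cellwise` and `rowwise` are assumptions for illustration only.

```matlab
% Hypothetical data: row 3 of bigTable mixes names from two different rows of smallTable.
bigTable   = {'Anna','Frank'; 'Anna','George'; 'Anna','Peter'; 'Jane','Peter'};
smallTable = {'Anna','George'; 'Jane','Peter'};

% Cell-wise test: also flags row 3, because 'Anna' and 'Peter' both occur in smallTable.
cellwise = find(sum(ismember(bigTable, smallTable), 2) == 2);

% Row-wise test: join the two columns into one key per row, then compare the keys.
% The '|' delimiter is an assumption; pick a character that never occurs in the names.
keyBig   = strcat(bigTable(:,1),   '|', bigTable(:,2));
keySmall = strcat(smallTable(:,1), '|', smallTable(:,2));
rowwise  = find(ismember(keyBig, keySmall));
```

With this data `cellwise` is `[2; 3; 4]` while `rowwise` is `[2; 4]`. If such mixed rows cannot occur in your tables, the two tests give the same result.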

Approach #2

%// Assign unique labels to each cell for both small and big cell arrays, so that
%// later on you would be dealing with numeric arrays only and 
%// do not have to mess with cell arrays that were slowing you down
[unqbig,matches1,idx] = unique([bigTable(:) ; smallTable(:)])
big_labels = reshape(idx(1:numel(bigTable)),size(bigTable))
small_labels = reshape(idx(numel(bigTable)+1:end),size(smallTable))

%// Detect which rows from small_labels exactly match with those from big_labels
Total_R  = find(ismember(big_labels,small_labels,'rows'))

Or replace the ismember in the last line with a bsxfun based implementation -

Total_R = find(any(all(bsxfun(@eq,big_labels,permute(small_labels,[3 2 1])),2),3))

Output of these approaches for the assumed input case -

Total_R =
     2
     3
     4
     5
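
To see what the numeric labels in approach #2 look like on the sample input, here is a small self-contained sketch. It rebuilds the assumed inputs, prints nothing, and checks that the ismember and bsxfun variants agree. The names `R_ismember` and `R_bsxfun` are made up for this check.

```matlab
bigTable   = {'Anna','Frank'; 'Anna','George'; 'Jane','Peter'; 'Anna','George'; 'Jane','Peter'};
smallTable = {'Anna','George'; 'Jane','Peter'};

% Label every cell. unique sorts the distinct names, so for this input
% Anna=1, Frank=2, George=3, Jane=4, Peter=5.
[~, ~, idx]  = unique([bigTable(:); smallTable(:)]);
big_labels   = reshape(idx(1:numel(bigTable)), size(bigTable));        % [1 2; 1 3; 4 5; 1 3; 4 5]
small_labels = reshape(idx(numel(bigTable)+1:end), size(smallTable));  % [1 3; 4 5]

% Both row-matching variants operate on the numeric labels and should agree.
R_ismember = find(ismember(big_labels, small_labels, 'rows'));
R_bsxfun   = find(any(all(bsxfun(@eq, big_labels, permute(small_labels, [3 2 1])), 2), 3));
assert(isequal(R_ismember, R_bsxfun, [2; 3; 4; 5]));
```

Since the labels are plain numbers, the 'rows' option of ismember applies here, which is what makes the row-wise comparison possible without touching the cell arrays again.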