Getting overlap counts for combinations of columns

Time: 2013-12-31 18:49:53

Tags: matlab intersection overlap

I am trying to get the counts of all intersections between all columns and all combinations of columns.

% I have a matrix of overlaps, something like this:
colHeader = {'var1','var2','var3','var4','var5'};
rowHeader = {'type1','type2','type3','type4','type5','type6','type7'};
overlap = [1,1,1,1,0;0,0,0,1,1;1,1,0,1,0;0,0,1,1,0;0,1,0,1,1;0,1,1,1,0;1,0,0,1,0];
% Now I would like to get the overlap count for all column combinations (i.e. var1&var2 ...
% var5&var1 at the first level, (var1&var2)&var3 etc. at the second level).
% The output in this case for levels 1 and 2 is simple enough:
f = @(a,b) a&b;
mat = zeros(5,5);
for i=1:5
    for j=1:5
       mat(i,j) = sum(f(overlap(:,i),overlap(:,j)));
    end
end
%      3     2     1     3     0
%      2     4     2     4     1
%      1     2     3     3     0
%      3     4     3     7     2
%      0     1     0     2     2
% where the diagonal is the first level of overlap and the rest are the relationships between
% the different variables.
% I can continue in this fashion, but not only is it ugly, it becomes impractical when dealing
% with bigger matrices.
% So the problem is: how can this be implemented for a big binary matrix in a way that returns
% all levels of intersection?

1 answer:

Answer 0 (score: 1)

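temp = 1;                                 % intermediate result, gains one dimension per level
result = cell(1, size(overlap,2));        % result{level} holds the counts for that level
for level = 1:size(overlap,2)
    % move the columns of overlap to dimension level+1 and AND with the previous level
    temp = bsxfun(@and, temp, permute(overlap,[1 3:1+level 2]));
    result{level} = squeeze(sum(temp,1)); % sum over rows (dimension 1), drop singleton dimensions
end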

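How to interpret the result

The variable result is a cell array containing the results for all levels. Let n denote the number of columns in overlap.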

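  • Level 1: result{1} is a 1 x n vector that gives the intersection of each column of overlap with itself (i.e. the sum of each column). For example, result{1}(4) is the number of ones in column 4 of overlap.

  • Level 2: result{2} is an n x n matrix. For example, result{2}(4,2) is the intersection count of columns 4 and 2 of overlap. (result{2} is the same as mat in the original post.)

  • Level 3: result{3} is an n x n x n array. For example, result{3}(4,2,5) is the intersection count of columns 4, 2 and 5 of overlap.

  • [...] and so on, up to level n.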
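
As a quick sanity check of this indexing, the following sketch recomputes result for the overlap matrix from the question (stopping at level 3 to keep the arrays small) and compares a few entries with counts taken directly from the columns. The name direct3 is introduced here for illustration only.

```matlab
% Example data from the question (7 rows, 5 columns)
overlap = [1,1,1,1,0;0,0,0,1,1;1,1,0,1,0;0,0,1,1,0;0,1,0,1,1;0,1,1,1,0;1,0,0,1,0];

% Same loop as in the answer, limited to the first three levels
temp = 1;
result = cell(1,3);
for level = 1:3
    temp = bsxfun(@and, temp, permute(overlap,[1 3:1+level 2]));
    result{level} = squeeze(sum(temp,1));
end

% Level 1: number of ones in column 4
disp(result{1}(4))        % 7

% Level 2: rows in which columns 4 and 2 are both 1
disp(result{2}(4,2))      % 4

% Level 3: compare with a direct count (direct3 is an illustrative name)
direct3 = sum(overlap(:,4) & overlap(:,2) & overlap(:,5));
disp(isequal(result{3}(4,2,5), direct3))   % 1 (true)
```

Because "and" is commutative, the order of the indices does not change the count, so result{2} is symmetric and result{3}(4,2,5) equals result{3}(5,2,4).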

How the code works

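When computing the result for a given level, the code uses the intermediate result from the previous level. This is possible because the "and" operation is associative. For example, at level 3, overlap(:,k) & overlap(:,m) & overlap(:,p) can be computed as (overlap(:,k) & overlap(:,m)) & overlap(:,p), where overlap(:,k) & overlap(:,m) has already been computed (and stored) at level 2.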
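
The associativity argument can be checked directly on the example data. In the sketch below, the column indices k, m, p and the variable names are arbitrary choices for illustration: the level-2 intersection is formed first and then reused for level 3.

```matlab
% Example data from the question (7 rows, 5 columns)
overlap = [1,1,1,1,0;0,0,0,1,1;1,1,0,1,0;0,0,1,1,0;0,1,0,1,1;0,1,1,1,0;1,0,0,1,0];

k = 1; m = 2; p = 4;   % arbitrary column indices, chosen for illustration

pairKM       = overlap(:,k) & overlap(:,m);                   % level-2 intermediate result
tripleReuse  = pairKM & overlap(:,p);                         % level 3, reusing level 2
tripleDirect = overlap(:,k) & (overlap(:,m) & overlap(:,p));  % same columns, other grouping

disp(isequal(tripleReuse, tripleDirect))   % 1 (true): the grouping does not matter
disp(sum(tripleReuse))                     % 2: rows 1 and 3 have ones in columns 1, 2 and 4
```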

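The final result for each level (result{level}) is obtained as a column-wise sum. However, the intermediate result before that sum (the variable temp) is kept so that it can be reused at the next level.

Each new level takes the intermediate result from the previous level (temp), places the columns of overlap along a new dimension (permute), and computes (bsxfun) the new intermediate result (the new value of temp, which has one more dimension). Summing this intermediate result column-wise (sum) and removing singleton dimensions (squeeze) gives the final result for that level (result{level}).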
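
To make the growth of the intermediate array concrete, this sketch runs the first three levels on the example data and prints the size of temp and of the stored counts at each level. It follows the same steps as the answer's loop; the names shifted and counts and the printing are added here for illustration.

```matlab
% Example data from the question (7 rows, 5 columns)
overlap = [1,1,1,1,0;0,0,0,1,1;1,1,0,1,0;0,0,1,1,0;0,1,0,1,1;0,1,1,1,0;1,0,0,1,0];

temp = 1;
for level = 1:3
    shifted = permute(overlap,[1 3:1+level 2]);   % columns moved to dimension level+1
    temp    = bsxfun(@and, temp, shifted);        % one more dimension than before
    counts  = squeeze(sum(temp,1));               % sum over rows, drop singleton dimensions
    fprintf('level %d: size(temp) = [%s], size(counts) = [%s]\n', ...
        level, num2str(size(temp)), num2str(size(counts)));
end
```

For this data, temp is 7 x 5 at level 1, 7 x 5 x 5 at level 2 and 7 x 5 x 5 x 5 at level 3. In general temp holds size(overlap,1) * n^level elements at a given level, so running through all n levels is likely to be practical only when the number of columns is small.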