Matlab: grouping and minimizing the unique column element indexes of a matrix

Time: 2012-12-02 12:16:18

Tags: matlab size unique minimize

We have the following matrix, result:

result =
  Columns 1 through 13
     3     1     1     1     1     1     6     2     3     6     2     1     6
     4     3     3     5     7     5    10    10     4    10     6     9     8
     6     4     4     7     9     7     0     0     0     0     0     0     0
    10     5     5     8     0     0     0     0     0     0     0     0     0
  Columns 14 through 25
     2    10     3    10     3     8     8     0     0     0     0     0
     8     0     0     0     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0     0     0     0     0

The number of unique elements in each of its columns (ignoring zeros) is:

Indexes of result:
  Columns 1 through 13
     4     4     4     4     3     3     2     2     2     2     2     2     2
  Columns 14 through 25 
     2     1     1     1     1     1     1 

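I want to carry out the following procedure: starting from the first column, every value that occurs more than once should be limited to a single occurrence in the matrix. So, taking col1 as the starting point, the rest of the matrix should be rearranged to: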

result =
  Columns 1 through 13
     3     1     1     1     1     1     0     2     0     0     2     1     0
     4     0     0     5     7     5     0     0     0     0     0     9     8
     6     0     0     7     9     7     0     0     0     0     0     0     0
    10     5     5     8     0     0     0     0     0     0     0     0     0
  Columns 14 through 25
     2     0     0     0     0     8     8     0     0     0     0     0
     8     0     0     0     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0     0     0     0     0

Indexes of result (without zeros):
  Columns 1 through 13
     4     2     2     4     3     3     0     1     0     0     1     2     1
  Columns 14 through 25 
     2     0     0     0     0     1     1      

Now we see that col4 has the most unique elements, so we take its values as the reference for the second rearrangement, which gives:

result =
  Columns 1 through 13
     3     0     0     1     0     0     0     2     0     0     2     0     0
     4     0     0     5     0     0     0     0     0     0     0     9     0
     6     0     0     7     9     0     0     0     0     0     0     0     0
    10     0     0     8     0     0     0     0     0     0     0     0     0
  Columns 14 through 25
     2     0     0     0     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0     0     0     0     0

Indexes of result (without zeros):
  Columns 1 through 13
     4     0     0     4     1     0     0     1     0     0     1     1     0  
  Columns 14 through 25 
     1     0     0     0     0     0     0

Repeating this as many times as needed (in this example twice more, for col5 and col8), we arrive at the desired result:

result =
  Columns 1 through 13
     3     0     0     1     0     0     0     2     0     0     0     0     0
     4     0     0     5     0     0     0     0     0     0     0     0     0
     6     0     0     7     9     0     0     0     0     0     0     0     0
    10     0     0     8     0     0     0     0     0     0     0     0     0
  Columns 14 through 25
     0     0     0     0     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0     0     0     0     0

Indexes of result (without zeros):
  Columns 1 through 13
     4     0     0     4     1     0     0     1     0     0     0     0     0  
  Columns 14 through 25 
     0     0     0     0     0     0     0 

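What is the most efficient way to do this? Could I see your suggestions?

Thanks in advance.

1 Answer:

Answer 0 (score: 1)

Your question is poorly worded, so here is a step-by-step breakdown of what I managed to understand from it.

Suppose you have the following matrix: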

result=[3  1  1  1  1  1  6  2  3  6  2  1  6  2 10  3 10 3  8  8  0  0  0  0  0;
        4  3  3  5  7  5 10 10  4 10  6  9  8  8  0  0  0  0 0  0  0  0  0  0  0;
        6  4  4  7  9  7  0  0  0  0  0  0  0  0  0  0  0  0 0  0  0  0  0  0  0;
       10  5  5  8  0  0  0  0  0  0  0  0  0  0  0  0  0  0 0  0  0  0  0  0  0]

1) To count the number of unique elements in each column, simply call unique on each column and count the nonzero elements:

count = arrayfun(@(n)sum(unique(result(:, n)) ~= 0), 1:size(result, 2))
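
As a quick sanity check of that one-liner, here is a minimal sketch on a small made-up matrix. The matrix A is hypothetical and not taken from the question; zeros play the same role of padding as in result.

```matlab
% Hypothetical 3x3 example; zeros are padding and must not be counted
A = [1 2 0;
     1 3 0;
     4 0 0];

% For each column: take its unique values, then count the nonzero ones
count = arrayfun(@(n) sum(unique(A(:, n)) ~= 0), 1:size(A, 2));

disp(count)   % count is [2 2 0]
```

Column 1 holds the values 1 and 4, column 2 holds 2 and 3 (plus a padding zero), and column 3 holds only padding.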

2) To nullify every element in the other columns that duplicates an element of column 1, we can do this:

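N = size(result, 2);
idx = arrayfun(@(n)ismember(result(:, n), result(:, 1)), 2:N, 'Uniform', 0);
% Pad the mask with an all-false first column so it lines up with columns 2:N
result([false(size(result, 1), 1), idx{:}]) = 0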
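
The same step can be sketched on a small made-up matrix. This variant is an assumption on my part rather than the code above: it calls ismember once on the block of columns 2:end and writes the block back, so the mask cannot be misaligned with the columns it was computed from. The names A and rest are hypothetical.

```matlab
% Hypothetical 3x3 example; column 1 is the reference column
A = [1 2 1;
     3 1 4;
     0 3 5];

rest = A(:, 2:end);                  % every column except the first
rest(ismember(rest, A(:, 1))) = 0;   % zero entries that also occur in column 1
A(:, 2:end) = rest;                  % write the cleaned block back

disp(A)
```

Here the values 1 and 3 from column 1 are zeroed wherever they appear in columns 2 and 3, leaving A equal to [1 2 0; 3 0 4; 0 0 5].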

Now we need to iterate over the columns and nullify all non-unique elements, so we do it in a loop. The final solution is therefore:

N = size(result, 2);
ii = 0;
while ii < N

    % Count the number of unique nonzero elements in each column
    count = arrayfun(@(n)sum(unique(result(:, n)) ~= 0), 1:N);

    % Advance to the first remaining column with the maximum number of
    % unique elements; stop once all remaining columns are empty
    [best, offset] = max(count(ii + 1:N));
    if best == 0
        break
    end
    ii = ii + offset;

    % Nullify the elements of column ii wherever they occur in other columns
    idx = arrayfun(@(n)(ismember(result(:, n), result(:, ii)) & n ~= ii), 1:N, 'UniformOutput', false);
    result([idx{:}]) = 0;
end

This produces the result you want:

result=
    3  0  0  1  0  0  0  2  0  0  0  0  0  0  0  0  0  0 0  0  0  0  0  0  0
    4  0  0  5  0  0  0  0  0  0  0  0  0  0  0  0  0  0 0  0  0  0  0  0  0
    6  0  0  7  9  0  0  0  0  0  0  0  0  0  0  0  0  0 0  0  0  0  0  0  0
   10  0  0  8  0  0  0  0  0  0  0  0  0  0  0  0  0  0 0  0  0  0  0  0  0
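
If you want to check the procedure end to end, a self-contained sketch along these lines may help. It is a variation rather than the exact code above: it applies ismember to the whole matrix at once, and the names expected, mask, best and offset are introduced here for illustration only.

```matlab
% Rebuild the 4x25 matrix from the question (unused entries stay zero)
result = zeros(4, 25);
result(1, 1:20) = [3 1 1 1 1 1 6 2 3 6 2 1 6 2 10 3 10 3 8 8];
result(2, 1:14) = [4 3 3 5 7 5 10 10 4 10 6 9 8 8];
result(3, 1:6)  = [6 4 4 7 9 7];
result(4, 1:4)  = [10 5 5 8];

N = size(result, 2);
ii = 0;
while ii < N
    % Unique nonzero elements per column
    count = arrayfun(@(n) sum(unique(result(:, n)) ~= 0), 1:N);

    % First remaining column with the most unique elements
    [best, offset] = max(count(ii + 1:N));
    if best == 0
        break   % nothing left to process
    end
    ii = ii + offset;

    % Zero the values of column ii everywhere except in column ii itself
    mask = ismember(result, result(:, ii));
    mask(:, ii) = false;
    result(mask) = 0;
end

% Expected final matrix, as listed in the question
expected = zeros(4, N);
expected(:, 1) = [3; 4; 6; 10];
expected(:, 4) = [1; 5; 7; 8];
expected(3, 5) = 9;
expected(1, 8) = 2;

assert(isequal(result, expected))
```

The loop visits columns 1, 4, 5 and 8 in that order, which matches the sequence described in the question.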

Hope that helps!