Question

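I have a 3xN matrix. Each column of the matrix is a 3D point. I want to remove duplicates, and I only care about duplicates in the first two dimensions. If there are duplicate points (i.e. the same x and y), I want to keep the one with the highest value in the third dimension (the z coordinate). For example (the first 2 dimensions are the first 2 rows):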

M = [ 1 1 1 2 3 4 5 5 ;
      4 4 4 6 6 3 2 2 ;
      3 4 5 3 4 5 7 8 ];
          ^ ^ ^ ^   ^

I would like to get:

Res = [ 1 2 3 4 5 ;
       4 6 6 3 2 ;
       5 3 4 5 8]

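I need this to run as fast as possible because the matrix is very large, so without sorting if possible. I'm looking for a MATLAB "shortcut" that does this without loops or sorting. Thanks, matlabit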

Answer 1

This can be done easily and efficiently using accumarray:

% - choose pairs of row/column indices - first two rows of M
% - accumulate using MAX of the values in the third row - this step removes the duplicates
res = accumarray(M(1:2,:)', M(3,:)', [], @max);

% i/j indices of non-zero entries in the result matrix are
% the unique index pairs, and the value is the maximum third row
% value for all duplicates
[i, j, v] = find(res);

% construct the result matrix
[i j v]'


ans =

 5     4     1     2     3
 2     3     4     6     6
 8     5     5     3     4

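If your indices are very large and the matrix res cannot be created for memory reasons, you can use the sparse version of accumarray, which creates a sparse matrix that stores only the non-zero entries. The rest stays the same: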

res = accumarray(M(1:2,:)', M(3,:)', [], @max, 0, true);
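
The accumarray calls above use the x and y values directly as subscripts, so they need positive integer coordinates, and find(res) drops any point whose maximum z is exactly 0. A minimal sketch of one way around both limits, assuming it is acceptable for unique to sort internally (which goes against the "no sorting" wish in the question), is to map each (x, y) pair to a group index first. The names xy, grp and zmax are illustrative only:

```matlab
M = [ 1 1 1 2 3 4 5 5 ;
      4 4 4 6 6 3 2 2 ;
      3 4 5 3 4 5 7 8 ];

% Map each distinct (x, y) pair to a group index 1..K.
% xy holds the K distinct pairs (one per row), grp is N-by-1.
[xy, ~, grp] = unique(M(1:2,:)', 'rows');

% Take the maximum z within each group (K-by-1).
zmax = accumarray(grp, M(3,:)', [], @max);

% Reassemble as a 3-by-K matrix, columns ordered by (x, y).
Res = [xy'; zmax'];
```

Because the subscripts are now group indices rather than coordinates, the intermediate result has only K entries no matter how large the coordinates are.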

Answer 2

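Scan the first two rows and insert the elements into a max-heap. While inserting, you can check on the fly whether the element already exists (in which case do not insert it into the heap). If it exists, compare it with the current maximum and set it as the maximum if needed. The final maximum is the result you are looking for.

Building the heap has complexity O(n), and checking for the maximum does not break this bound. So the total time complexity is O(n), compared with O(n log n) if sorting is used. An additional O(n) of space is also required.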
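
The single-pass idea can be sketched roughly as follows. This is not the heap described above: it swaps in a hash map (containers.Map) keyed on the (x, y) pair, since that gives the "does this element already exist" check directly. The variable names and the string key format are assumptions made for illustration, and the explicit loop is likely to be much slower in MATLAB than the vectorized accumarray approach:

```matlab
M = [ 1 1 1 2 3 4 5 5 ;
      4 4 4 6 6 3 2 2 ;
      3 4 5 3 4 5 7 8 ];

% Hash map from an "x,y" string key to the largest z seen so far.
best = containers.Map('KeyType', 'char', 'ValueType', 'double');

for k = 1:size(M, 2)
    key = sprintf('%.17g,%.17g', M(1,k), M(2,k));
    % Store z if the pair is new, or if it beats the stored maximum.
    if ~isKey(best, key) || M(3,k) > best(key)
        best(key) = M(3,k);
    end
end

% Rebuild a 3-by-K matrix from the stored keys and maxima.
ks = keys(best);
Res = zeros(3, numel(ks));
for k = 1:numel(ks)
    xy = sscanf(ks{k}, '%g,%g');      % 2-by-1 vector [x; y]
    Res(:,k) = [xy; best(ks{k})];
end
```

Each point is visited once and every map operation takes expected constant time, which is where the O(n) time and O(n) extra space figures come from.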
