Question

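I have the following matrix that tracks the start and end points of data ranges (the first column holds the "starts", the second column holds the "ends"):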

myMatrix = [
    162   199; %// this represents the range 162:199
    166   199; %// this represents the range 166:199
    180   187; %// and so on...
    314   326;
    323   326;
    397   399;
    419   420;
    433   436;
    576   757;
    579   630;
    634   757;
    663   757;
    668   757;
    676   714;
    722   757;
    746   757;
    799   806;
    951   953;
    1271  1272
];

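I need to eliminate every range (i.e. row) that is contained within a larger range present in the matrix. For example, the ranges [166:199] and [180:187] are contained within the range [162:199], so rows 2 and 3 need to be removed.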

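The solution I came up with was to compute a sort of "running" max on the second column and compare each subsequent value of that column against it to decide whether the row needs to be removed. I implemented this with a for loop, as follows: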

currentMax = myMatrix(1,2); %//set first value as the maximum
[sizeOfMatrix,~] = size(myMatrix); %//determine the number of rows
rowsToRemove = false(sizeOfMatrix,1); %//pre-allocate final vector of logicals
for m=2:sizeOfMatrix
    if myMatrix(m,2) > currentMax %//if new max is reached, update currentMax...
        currentMax = myMatrix(m,2);
    else
        rowsToRemove(m) = true; %//... else mark that row for removal
    end
end
myMatrix(rowsToRemove,:) = [];

This correctly removes the "redundant" ranges from myMatrix and produces the following matrix:

myMatrix =
         162         199
         314         326
         397         399
         419         420
         433         436
         576         757
         799         806
         951         953
        1271        1272

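Questions:

1) It seems there has to be a better way to compute a "running" max than a for loop. I looked into accumarray and filter, but could not figure out a way to do it with those functions. Is there a potential alternative that skips the for loop (some kind of more efficient vectorized code)?

2) Is there a completely different (that is, more efficient) way to accomplish the final goal of removing all the ranges in myMatrix that are contained within larger ranges? I don't know if I'm overthinking this whole thing...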

Answer 1

Approach #1

A brute-force approach based on bsxfun:

myMatrix(sum(bsxfun(@ge,myMatrix(:,1),myMatrix(:,1)') & ...
    bsxfun(@le,myMatrix(:,2),myMatrix(:,2)'),2)<=1,:)

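A brief explanation of the proposed solution:

Compare all the starts against each other for "contained-ness", and do the same for the ends. Note that the "contained-ness" criterion has to be one of these two:
- greater than or equal to for the starts and less than or equal to for the ends
- less than or equal to for the starts and greater than or equal to for the ends.
I just happened to go with the first option.
Then count, for each row, how many rows satisfy the "contained-ness" test against it. Every row trivially contains itself, so a row whose count is above one lies inside some other row; remove those rows to get the desired result.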
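To make the comparison step concrete, here is a small sketch that spells out the intermediate matrices on the first four rows of myMatrix. The variable names (R, startsInside, endsInside, containedIn, counts, kept) are made up for illustration and are not part of the one-liner above.

```matlab
% First four rows of myMatrix from the question, used as a toy input
R = [162 199; 166 199; 180 187; 314 326];

% startsInside(i,j) is true when row i starts at or after row j
startsInside = bsxfun(@ge, R(:,1), R(:,1)');
% endsInside(i,j) is true when row i ends at or before row j
endsInside = bsxfun(@le, R(:,2), R(:,2)');

% containedIn(i,j) is true when range i lies within range j;
% the diagonal is always true because every range contains itself
containedIn = startsInside & endsInside;

% For each row, the number of ranges (itself included) that contain it
counts = sum(containedIn, 2);

% Keep only the rows that are contained in nothing but themselves
kept = R(counts <= 1, :);
```

Here counts comes out as [1; 2; 3; 1], so rows 2 and 3 are dropped and kept holds [162 199; 314 326]. The comparison matrices are N-by-N, so memory use grows quadratically with the number of rows.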

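Approach #2

If you are fine with an output whose rows are sorted by the first column, and if the number of local maxima is small, you can try this alternative approach: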

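myMatrix_sorted = sortrows(myMatrix,1); %// sort the rows by range start
col2 = myMatrix_sorted(:,2); %// range ends, in sorted order
max_idx = 1:numel(col2); %// rows that are still candidates
while true
    col2_selected = col2(max_idx);
    N = numel(col2_selected);
    %// start a new group every time the end value increases
    labels = cumsum([true ; diff(col2_selected)>0]);
    %// keep one index per group: the position of that group's largest end
    %// (findmax is a user-defined helper; it is not defined in this snippet)
    idx1 = accumarray(labels, 1:N ,[], @(x) findmax(x,col2_selected));
    if numel(idx1)==N %// ends are strictly increasing, nothing left to drop
        break;
    end
    max_idx = max_idx(idx1);
end
out = myMatrix_sorted(max_idx,:); %// desired output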
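The loop above calls findmax, which is not defined anywhere in this post. Below is a minimal sketch of what such a helper might look like, assuming it is meant to return whichever of the group's indices points at the largest value. The body and the argument names (indx, vals) are assumptions, not the answerer's confirmed code; save it as findmax.m on the MATLAB path.

```matlab
function ix = findmax(indx, vals)
% FINDMAX  Hypothetical helper for the accumarray call in Approach #2.
%   indx : indices into vals that belong to one group
%   vals : the full vector of values (col2_selected in the loop)
%   ix   : the element of indx whose value in vals is largest

% Position, within the group, of the largest value
% (max returns the first position when there are ties)
[~, pos] = max(vals(indx));

% Translate that position back into an index into vals
ix = indx(pos);
end
```

For example, findmax([2;3;4], [5;1;9;7]) looks at the values 1, 9 and 7 and returns 3, the index of the 9.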
