Question

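I have data (numeric, M x N, N > 2) that arrives sorted by the first column and then by the second column. Does anyone know of an efficient algorithm to convert the data so that it is sorted by the second column and then by the first? Obviously, sortrows(data,[2,1]) does the trick, but I am looking for something that exploits the existing structure of the input data to run faster, since M is very large.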

In addition, the data in the first two columns come from a known set of integers (each of them less than M).

Answer 1

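According to the help documentation for MATLAB R2010b, the function SORTROWS uses a stable version of quicksort. Since stable sorting algorithms "maintain the relative order of records with equal keys", you can get what you want by simply re-sorting the already-sorted data on the second column: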

data = sortrows(data,2);

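The result keeps the relative order of the elements in the first column, so the data ends up sorted first by the second column and then by the first column.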
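To make the stability argument concrete, here is a minimal sketch on a toy matrix. The values and the variable names bySecond and reference are made up for illustration, and the check relies on sortrows being stable, as the documentation cited above states.

```matlab
% Hypothetical toy input, already sorted by column 1 and then by column 2.
% Column 3 is just a payload so the rows are easy to tell apart.
data = [1 2 10; 1 3 20; 2 1 30; 2 2 40; 3 1 50];

% Re-sort on column 2 only. With a stable sort, rows that tie on column 2
% keep their incoming order, which is already ascending in column 1.
bySecond = sortrows(data, 2);

% Reference result from the explicit two-key sort.
reference = sortrows(data, [2 1]);

% Compare the two results.
sameResult = isequal(bySecond, reference);
disp(sameResult)
```

If sameResult ever came back false, that would mean the sort is not stable on your MATLAB version and the single-key shortcut should not be used.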

Answer 2

Since the data in the first column is already sorted, you do not need to sort on it again. It is slightly faster if you do this:

>> d = rand(10000,2);  d = round(d*100);  d = sortrows(d,1);
>> tic; a1 = sortrows(d, 2); toc;
Elapsed time is 0.006805 seconds.

Versus:

>> tic; a2 = sortrows(d, [2 1]); toc;
Elapsed time is 0.010207 seconds.
>> isequal(a1, a2)

ans =

     1
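A related variant, not part of the answer above, is to skip sortrows and take the permutation from sort on the second column alone; MATLAB documents that the index output of sort preserves the original order of equal elements. This is only a sketch on made-up data (the matrix d and the names order and a3 are illustrative), and whether it is actually faster would need timing on your own input.

```matlab
% Hypothetical toy input, already sorted by column 1 and then by column 2.
d = [1 3 11; 1 5 12; 2 1 13; 2 3 14; 4 1 15; 4 5 16];

% Sort only the second-column keys and keep the permutation.
% Ties are returned in their original order, so column 1 stays ascending
% within each group of equal column-2 values.
[~, order] = sort(d(:,2));

% Apply the permutation to whole rows.
a3 = d(order, :);

% Check against the explicit two-key sort.
disp(isequal(a3, sortrows(d, [2 1])))
```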

Answer 3

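I kept hammering away at this, but could not get it any faster than the sortrows approach. It exploits the fact that each pair of keys is unique, which I did not mention above.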

% This gives us unique rows of integers between one and 10000, sorted first
% by column 1 then 2.
x = unique(uint32(ceil(10000*rand(1e6,2))),'rows');

tic;
idx = zeros(size(x,1),1);
% Work out where each group of the second keys will start in the sorted output.
StartingPoints = cumsum([1;accumarray(x(:,2),1)]);
% Work out where each group of the first keys is in the input.
Ends = find([~all(diff(x(:,1),1,1)==0,2);true(1,1)]);
Starts = [1;Ends(1:(end-1))+1];
% Build the index.
for i = 1:numel(Starts)
    temp = x(Starts(i):Ends(i),2);
    idx(StartingPoints(temp)) = Starts(i):Ends(i);
    StartingPoints(temp) = StartingPoints(temp) + 1;
end
% Apply the index.
y = x(idx,:);
toc

tic;
z = sortrows(x,2);
toc

isequal(y,z)

My algorithm takes 0.21 seconds and the second one (sortrows) takes 0.18 seconds (consistent across different random seeds).
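For readers following the index-building step, here is a stripped-down sketch of the same bucket idea as a plain stable counting sort on column 2, processing one row at a time instead of one first-key group at a time. The toy matrix and the names counts, bucketStart and nextSlot are illustrative only; this is meant to show the mechanism, not to be faster than the code above.

```matlab
% Hypothetical toy input: rows already sorted by column 1, then column 2.
x = [1 2; 1 3; 2 1; 2 2; 3 1];

% Count how many rows carry each column-2 key, then turn the counts into
% the first output position of each key's bucket.
counts = accumarray(x(:,2), 1);        % [2; 2; 1]
bucketStart = cumsum([1; counts]);     % [1; 3; 5; 6]

% Walk the rows in input order and drop each one into the next free slot
% of its bucket. Visiting rows in input order is what keeps column 1
% ascending inside every bucket.
idx = zeros(size(x,1), 1);
nextSlot = bucketStart;
for r = 1:size(x,1)
    k = x(r,2);
    idx(nextSlot(k)) = r;
    nextSlot(k) = nextSlot(k) + 1;
end

% Apply the index.
y = x(idx,:);
disp(isequal(y, sortrows(x, [2 1])))
```

The per-group loop in the answer does the same thing but fills one slot in many buckets per iteration, which is why it needs the second keys within each first-key group to be distinct.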

If anyone sees any further speedup (other than mex), please feel free to add it.

Fast MATLAB method to change column order with sortrows
