Question

How can I remove every number that appears more than once from an array?

For example:

b = [1 1 2 3 3 5 6]

becomes

b = [2 5 6]

Answer 1

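Use the `unique` function to extract the unique values, then compute a histogram of the data with those unique values as bin centers, and keep the values whose count is 1.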

a =[ 1 1 2 3 3 5 6];
u = unique(a)
idx = hist(a, u) ==1;
b = u(idx)

Result:

  2 5 6

For multi-column input, where each row is treated as one item, you can do this:

a = [1 2; 1 2;1 3;2 1; 1 3; 3 5 ; 3 6; 5 9; 6 10] ;
[u ,~, uid] = unique(a,'rows');
idx = hist(uid,1:size(u,1))==1;
b= u(idx,:)
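
As a possible variation on the counting step above, the occurrence counts can also be obtained with `accumarray` from the third output of `unique`, without calling `hist`. This is only a sketch, not part of the original answer. It may be useful because, as far as I know, `hist` interprets a scalar second argument as a number of bins, so `hist(a, u)` behaves differently when `u` happens to contain a single value. The variable name `counts` is my own.

```matlab
a = [1 1 2 3 3 5 6];

% uid(k) is the position in u of the value a(k)
[u, ~, uid] = unique(a);

% counts(j) is the number of times u(j) occurs in a
counts = accumarray(uid(:), 1);

% keep only the values that occur exactly once
b = u(counts.' == 1)
```

The same counting step should carry over to the multi-column case by calling `unique(a, 'rows')` and indexing with `u(counts == 1, :)`.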

Answer 2

You can first sort the elements and then remove every element that has the same value as one of its neighbors, like this (for a row vector A):

A_sorted = sort(A); % sort the elements
A_diff = diff(A_sorted)~=0; % true where an element differs from the next one
A_unique = [A_diff true] & [true A_diff]; % true where an element differs from both the previous and the next one
A = A_sorted(A_unique); % keep only the elements that occur exactly once
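
To make the two masks easier to follow, here is the same idea traced on the example from the question. The intermediate values in the comments were worked out by hand; the variable names `differsFromNext`, `differsFromPrev` and `result` are only illustrative.

```matlab
A = [1 1 2 3 3 5 6];

A_sorted = sort(A);                  % [1 1 2 3 3 5 6]
A_diff   = diff(A_sorted) ~= 0;      % [0 1 1 0 1 1]

% Pad with true at the end: the last element has no next neighbor
differsFromNext = [A_diff true];     % [0 1 1 0 1 1 1]

% Pad with true at the start: the first element has no previous neighbor
differsFromPrev = [true A_diff];     % [1 0 1 1 0 1 1]

% An element occurs once only if it differs from both neighbors
A_unique = differsFromNext & differsFromPrev;   % [0 0 1 0 0 1 1]

result = A_sorted(A_unique)          % [2 5 6]
```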

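Benchmark

I will benchmark my solution against the other solutions provided, namely:

using `diff` (my solution)
using `hist` (rahnema1)
using `sum` (Jean Logeart)
using `unique` (my alternative solution)

I will use two cases:

A small problem (yours): A = [1 1 2 3 3 5 6];

A larger problem: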

rng('default');
A= round(rand(1, 1000) * 300);

Results (run times in seconds, as returned by timeit):

                 |   Small    |   Large    | Comments
-----------------|------------|------------|----------------
 using `diff`    | 6.4080e-06 | 6.2228e-05 | Fastest method for large problems
 using `unique`  | 6.1228e-05 | 2.1923e-04 | Good performance
 using `sum`     | 5.4352e-06 | 0.0020     | Only fast for small problems, preserves the original order
 using `hist`    | 8.4408e-05 | 1.5691e-04 | Good performance

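My solution (using `diff`) is the fastest method for the larger problem. Jean Logeart's solution using `sum` is faster for the small problem, but it is the slowest method for the larger problem, and for the small problem my method is almost as fast (about 1 microsecond slower).

Conclusion: overall, the solution I proposed using `diff` is the fastest method. The code used for the timings: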
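
One thing the table points out is that only the `sum` approach preserves the original order, and it is slow on large inputs because `A==A'` builds an n-by-n comparison matrix. If the original order matters, a possible variation of the `diff` approach is to keep the permutation returned by `sort` and map the mask back to the original positions. This is a sketch of my own and was not part of the benchmark above; the names `order`, `keep`, `idx` and `B` are illustrative.

```matlab
A = [3 1 1 6 2 3 5];

% order(k) is the original position of A_sorted(k)
[A_sorted, order] = sort(A);

% Same neighbor test as in the diff-based solution
A_diff = diff(A_sorted) ~= 0;
keep   = [A_diff true] & [true A_diff];

% Original positions of the values that occur once, in ascending order
idx = sort(order(keep));

B = A(idx)    % values occurring exactly once, in their original order
```

For this input the values that occur once are 6, 2 and 5, and they come out in that order rather than sorted.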

timeit(@() usingDiff(A))
timeit(@() usingUnique(A))
timeit(@() usingSum(A))
timeit(@() usingHist(A))

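function A = usingDiff (A)
    % Sort, then keep elements that differ from both neighbors
    A_sorted = sort(A);
    A_unique = [diff(A_sorted)~=0 true] & [true diff(A_sorted)~=0];
    A = A_sorted(A_unique);
end

function A = usingUnique (A)
    % A value occurs once if its first and last occurrence coincide
    [~, ia1] = unique(A, 'first');
    [~, ia2] = unique(A, 'last');
    A = A(ia1(ia1 == ia2));
end

function A = usingSum (A)
    % Compare every element with every other one (n-by-n matrix)
    % and keep those that only match themselves
    A = A(sum(A==A') == 1);
end

function A = usingHist (A)
    % Count occurrences of each unique value and keep counts of 1
    u = unique(A);
    A = u(hist(A, u) ==1);
end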
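
Before comparing timings, it can be worth checking that the four approaches return the same set of values on the larger problem. The following is a minimal sketch of such a check, written inline so it runs on its own. The variable names (`rDiff`, `rHist`, `rSum`, `rUnique`, `allAgree`) are my own, and the `sum` result is sorted first because that method keeps the original order while the others return sorted output.

```matlab
rng('default');
A = round(rand(1, 1000) * 300);

% diff-based
S = sort(A);
d = diff(S) ~= 0;
rDiff = S([d true] & [true d]);

% hist-based
u = unique(A);
rHist = u(hist(A, u) == 1);

% sum-based (uses implicit expansion, available since R2016b)
rSum = sort(A(sum(A == A') == 1));

% unique-based
[~, i1] = unique(A, 'first');
[~, i2] = unique(A, 'last');
rUnique = A(i1(i1 == i2));

% true if all four results are identical
allAgree = isequal(rDiff, rHist, rSum, rUnique)
```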
