Retrieve and remove duplicate values from an associative array

Time: 2016-06-15 13:53:59

Tags: php associative-array

I have an associative array like this:

$arr = [1=>0, 2=>1, 3=>1, 4=>2, 5=>2, 6=>3];

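I want to remove the duplicated values from the original array and return them as a new array of duplicate groups, so that I end up with something like: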

$arr = [1=>0, 6=>3];
$new_arr = [[2=>1, 3=>1], [4=>2, 5=>2]];

Does PHP provide a function for this, and if not, how could it be done?

I tried:

$array = [];
$array[1] = 5;
$array[2] = 5;
$array[3] = 4;
$array[5] = 6;
$array[7] = 7;
$array[8] = 7;

$counts = array_count_values($array);
print_r($counts);
$duplicates = array_filter($array, function ($value) use ($counts) {
 return $counts[$value] > 1;
});
print_r($duplicates);

$result = array_diff($array, $duplicates);
print_r($result);

Output:

[1] => 5
[2] => 5
[7] => 7
[8] => 7

and

[3] => 4
[5] => 6

This is almost what I want.
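One way to get from this output to the grouped form asked for above is to bucket the duplicated entries by value. The following is only a minimal sketch, not a built-in PHP feature: the function name splitDuplicates is made up for illustration, and it assumes every value is an integer or a string, since array_count_values only counts those.

```php
<?php
// Hypothetical helper (name chosen for illustration): splits an array into
// entries whose value occurs once and groups of entries that share a value.
// Assumes every value is an int or a string, as array_count_values requires.
function splitDuplicates(array $array): array
{
    $counts  = array_count_values($array);
    $uniques = [];
    $groups  = [];

    foreach ($array as $key => $value) {
        if ($counts[$value] > 1) {
            // Bucket duplicated entries by value, keeping the original keys.
            $groups[$value][$key] = $value;
        } else {
            $uniques[$key] = $value;
        }
    }

    // array_values() drops the value-based bucket keys, leaving a list of groups.
    return [$uniques, array_values($groups)];
}

[$arr, $new_arr] = splitDuplicates([1 => 0, 2 => 1, 3 => 1, 4 => 2, 5 => 2, 6 => 3]);
print_r($arr);
print_r($new_arr);
```

The array is walked only once after counting, and the original keys are preserved inside each group.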

1 answer:

Answer 0: (score: 0)

Code

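The following works for me... I make no promises about complexity or performance, but it shows the general idea... Also, I haven't written PHP in years, so bear that in mind.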

<?php
function nubDups( $arr ) {
    $seen = [];
    $dups = [];
    foreach ( $arr as $k => $v) {
        if ( array_key_exists( $v, $seen ) ) {
            // duplicate found!
            if ( !array_key_exists( $v, $dups ) )
                $dups[$v] = [$seen[$v]];

            $dups[$v][] = $k;
        } else 
            // First time seen, record!
            $seen[$v] = $k;
    }

    $uniques = [];
    foreach ( $seen as $v => $k ) {
        if ( !array_key_exists( $v, $dups ) ) $uniques[$k] = $v;
    }

    return [$uniques, $dups];
}

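function nubDups2( $arr ) {
    $seen = $dups = [];
    // Walk the array once: $seen maps value => first key,
    // $dups maps value => every key holding that value.
    foreach ( $arr as $k => $v )
        if     ( array_key_exists( $v, $dups ) ) $dups[$v][] = $k;
        elseif ( array_key_exists( $v, $seen ) ) $dups[$v]   = [$seen[$v], $k];
        else                                     $seen[$v]   = $k;
    // Drop the duplicated values from $seen, then flip back to key => value.
    return [array_flip( array_diff_key( $seen, $dups ) ), $dups];
}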

$arr = [0, 1, 4, 1, 2, 2, 3];
print_r( nubDups( $arr ) );
print_r( nubDups2( $arr ) );

Output (both)

$ php Test.php      
Array
(
    [0] => 0
    [2] => 4
    [6] => 3
)
Array
(
    [1] => Array
        (
            [0] => 1
            [1] => 3
        )

    [2] => Array
        (
            [0] => 4
            [1] => 5
        )
)

Condensed

  • Unique entries (duplicates removed), given as [(k, v)]: [(0, 0), (2, 4), (6, 3)]
  • Duplicates, given as [(v, [k])]: [(1, [1, 3]), (2, [4, 5])]
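The duplicates above are keyed by value and list the original keys, which differs from the key => value groups the question asks for. A minimal sketch of one way to convert between the two shapes, assuming the original array is still available; the helper name groupsFromKeys is hypothetical:

```php
<?php
// Hypothetical helper: turns [value => [keys...]] into a list of
// [key => value, ...] groups by looking each key up in the original array.
function groupsFromKeys(array $arr, array $dups): array
{
    $groups = [];
    foreach ($dups as $keys) {
        // array_flip() turns the list of keys into array keys so that
        // array_intersect_key() can select those entries from $arr.
        $groups[] = array_intersect_key($arr, array_flip($keys));
    }
    return $groups;
}

$arr    = [0, 1, 4, 1, 2, 2, 3];
$dups   = [1 => [1, 3], 2 => [4, 5]];  // the duplicates listed above
$groups = groupsFromKeys($arr, $dups);
print_r($groups);
```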

In Haskell

The PHP version above abuses hash tables for fast lookups. A simpler version that is almost the same, but ignores the indices, written in Haskell:

-- | 'nubDupsBy': for a given list yields a pair where the fst contains the
-- list without any duplicates, and snd contains the duplicate elements.
-- This is determined by a user specified binary predicate function.
nubDupsBy :: (a -> a -> Bool) -> [a] -> ([a], [a])
nubDupsBy p = foldl f ([], [])
    where f (seen, dups) x | any (p x) seen = (seen, dups ++ [x])
                           | otherwise      = (seen ++ [x], dups)
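For readers following the PHP code, roughly the same fold can be sketched in PHP. This is an illustrative port rather than part of the original answer: the function and parameter names are assumptions, and, like the Haskell version, it keeps the first occurrence of each value in the first list and scans that list linearly instead of using a hash lookup.

```php
<?php
// Hypothetical PHP counterpart of nubDupsBy: $same is a user-supplied
// binary predicate that decides whether two elements count as equal.
function nubDupsBy(callable $same, array $list): array
{
    $seen = [];
    $dups = [];
    foreach ($list as $x) {
        // Linear scan of what has been seen so far, like `any (p x) seen`.
        $isDup = false;
        foreach ($seen as $s) {
            if ($same($x, $s)) {
                $isDup = true;
                break;
            }
        }
        if ($isDup) {
            $dups[] = $x;
        } else {
            $seen[] = $x;
        }
    }
    return [$seen, $dups];
}

$equal = function ($a, $b) { return $a === $b; };
[$seen, $dups] = nubDupsBy($equal, [0, 1, 4, 1, 2, 2, 3]);
print_r($seen);
print_r($dups);
```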