我有一个列表列表,如下所示:
[
[ 1, 2, 3 ],
[ 20, 30, 40, 50 ],
[ 11, 15, 17 ],
[ 20, 22, 25, 27 ],
[ 1, 5, 10 ],
[ 1, 100 ]
]
我想合并内部列表,其中一个列表中的任何元素与另一个列表中的任何元素匹配。这也需要能够处理多个重叠(因此在上面的示例中,列表中的3个将合并为一个)。所以在这种情况下,结果如下:
[
[ 1, 2, 3, 5, 10, 100 ], # 3 lists have been merged into one
[ 11, 15, 17 ], # Untouched due to no overlap
[ 20, 22, 25, 30, 40, 50 ], # 2 lists merged
]
是否有明显的算法或Perl模块可供使用?
答案 0 :(得分:4)
以下是单通道解决方案,它使用数组引用有点神奇。
对于this algorithm jaredor和其他解决方案,请查看perlmonks:how to find combine common elements of an array?
use strict;
use warnings;
use List::MoreUtils qw(uniq);
my @data = map {[split]} <DATA>;
my %group = ();
for my $array (@data) {
my @values = map {@$_} uniq map {$group{$_} || [$_]} @$array;
@group{@values} = (\@values) x @values;
}
@data = uniq values %group;
# Resort to make things pretty
@$_ = sort {$a <=> $b} @$_ for @data;
use Data::Dump;
dd @data;
__DATA__
1 2 3
20 30 40 50
11 15 17
20 22 25 27
1 5 10
1 100
输出:
(
[20, 22, 25, 27, 30, 40, 50],
[11, 15, 17],
[1, 2, 3, 5, 10, 100],
)
答案 1 :(得分:1)
只有子阵列的顺序不同,
use strict;
use warnings;
sub merge {
my ($arr) = @_;
my $i = 0;
while ($i < $#$arr) {
my $current = $arr->[$i];
my %h;
@h{@$current} = ();
my @ovlap = grep {
grep exists $h{$_}, @{$arr->[$_]}
} ($i+1) .. $#$arr;
my %seen;
@$current =
sort {$a <=> $b}
grep !$seen{$_}++,
(@$current, map @$_, @$arr[@ovlap]);
@$arr[@ovlap] = ();
@$arr = grep defined, @$arr;
$i++;
}
return $arr;
}
my $arr = [
[ 1, 2, 3 ],
[ 20, 30, 40, 50 ],
[ 11, 15, 17 ],
[ 20, 22, 25, 27 ],
[ 1, 5, 10 ],
[ 1, 100 ],
];
merge(merge($arr));
use Data::Dumper; print Dumper $arr;
输出
$VAR1 = [
[
1,
2,
3,
5,
10,
100
],
[
20,
22,
25,
27,
30,
40,
50
],
[
11,
15,
17
]
];
答案 2 :(得分:1)
我用哈希来找到解决方案。如果您的列表可能包含重复的成员,则此解决方案会将其数量减少到1。
%shared表示列表共享的号码。然后,虽然有共享号码,但您合并列表(即您更改%shared中的信息)。一旦没有共享号码,您就可以从哈希中创建列表。
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
my @lists = (
[ 1, 5, 10 ],
[ 10, 15, 17 ],
[ 20, 22, 25, 27 ],
[ 20, 30, 40, 50 ],
[ 1, 2, 3 ],
[ 1, 100 ],
);
my %shared;
for my $i (0 .. $#lists) {
undef $shared{$_}{$i} for @{ $lists[$i] };
}
while (my ($num) = grep 1 < keys %{ $shared{$_} }, keys %shared) {
my @to_merge = keys %{ $shared{$num} };
my $list = shift @to_merge;
for my $merge (@to_merge) {
print "Merging list $merge to $list\n";
for my $h (values %shared) {
if (exists $h->{$merge}) {
delete $h->{$merge};
undef $h->{$list};
}
}
}
}
my %left;
undef $left{ (keys %{ $shared{$_} })[0] }{$_} for keys %shared;
my @result = map [ keys %$_ ], values %left;
print Dumper \@result;
答案 3 :(得分:0)
我的两便士值得。使用简单的循环和缓存来存储结果列表中每个元素的位置,以便每次都保存搜索。
#! /usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
my $lol = [
[ 1, 2, 3 ],
[ 20, 30, 40, 50 ],
[ 11, 15, 17 ],
[ 20, 22, 25, 27 ],
[ 1, 5, 10 ],
[ 1, 100 ]
];
my @results = ();
my %resultCache;
sub elementInResults {
my ($element) = $_[0];
# return value for cache, or search if not in cache
if (!defined $resultCache{$element} ) {
# search for target in destination arrays
for (my $destIndex = 0; $destIndex < @results; $destIndex++) {
if (grep (/$element/, @{$results[$destIndex]}) > 0 ) {
$resultCache{$element} = $destIndex;
last;
}
}
}
return $resultCache{$element};
}
my $srcCount=0;
# loop through source arrays
for my $srcList (@$lol) {
my $destIndex ;
# loop through elements of array
for (my $srcElementIndex=0; $srcElementIndex < @$srcList; $srcElementIndex++) {
$destIndex = elementInResults($srcList->[$srcElementIndex]);
if (defined $destIndex ) {
# element exists in an existing result array so merge
print "Merging source array $srcCount into result array $destIndex, match on:" . $srcList->[$srcElementIndex] . "\n";
# remove the duplicate element from src list first
splice(@$srcList,$srcElementIndex,1);
# then merge into dest list
push (@{$results[$destIndex]}, @$srcList);
last;
}
$srcElementIndex++;
}
# if no elements in list found in existing results add list as new one to results
push (@results, $srcList) if (!defined $destIndex ) ;
$srcCount++;
}
map {@$_ = sort ({$a <=> $b} @$_)} @results;
print Dumper \@results;