如何在Perl中创建小于n的所有子集?

时间:2010-11-07 16:37:54

标签: perl recursion set

我有一套套装。我想创建从每个原始集合中最多获取一个元素的所有集合。 例如,如果我的原始集合为((x,y),(A),(1,2)),则解决方案为:

(x)
(y)
(A)
(1)
(2)
(x,A)
(x,1)
(x,2)
(y,A)
(y,A)
(y,1)
(y,2)
(A,1)
(A,2)
(x,A,1)
(x,A,2)
(y,A,1)
(y,A,2)

我使用下面编写的代码来递归计算:

# gets an array of arrays (aoa)
# returns an array of arrays with all subsets where zero or one element is
# taken from each array, e.g. in = [[a,b],[5],[X,Y,Z]], out =
# [[],[a],[b],[5],[X],[Y],[Z],[a,5],[b,5],[a,X],[a,Y],...,[b,5,Y],[b,5,Z]]
# note the order of elelemnts in each arry is immaterial (an array is
# considered an unordered set)
sub sets_aoa_to_subsets_aoa {
    my $aoa = shift // confess;

    if ( scalar( @{$aoa} ) == 0 ) {
        return [ [] ];
    }

    my $a           = shift @{$aoa};
    my $subsets_aoa = sets_aoa_to_subsets_aoa($aoa);
    my @new_subsets = ();
    foreach my $subset_a ( @{$subsets_aoa} ) {

        # leave subset as-is
        push @new_subsets, $subset_a;

        # add one element from $a
        foreach my $e ( @{$a} ) {
            push @new_subsets, [ $e, @{$subset_a} ];
        }
    }
    return \@new_subsets;

}

但是,我想对子集的大小添加限制。例如,如果我设置max_size=2,那么将忽略最后四个解决方案。我不能简单地生成所有解决方案,然后过滤那些太大的,因为有时我有超过100套,每个有2-3个元素,而2 ^ 100不是一个很好的数字来处理,特别是当我只想要子集5号或更小。

4 个答案:

答案 0 :(得分:2)

正如我所怀疑的,正则表达式适用于此。

特定解决方案

这是问题的具体解决方案。有80个答案。

my %seen;

"xy=a=12" =~ m{
        [^=]* (x|y)* [^=]*
    =
        [^=]* (a)*   [^=]*
    =
        [^=]* (1|2)* [^=]*

    (?{ 
         my $size = grep { length } $1, $2, $3;
         print "<$1> <$2> <$3>\n"
            if $size >= 1 && 
               $size <= 2 &&
             ! $seen{$1,$2,$3}++;
    })
    (*FAIL)
}x;

将管道传输到cat -n,您将看到80个答案。

当然,你会想要一些可以推广和扩展的东西,这样你就可以将它应用到你的一百套情境中。制作一般解决方案总是比特定解决方案花费更长时间,因此我将继续进行这种概括,并在看起来很漂亮时尽快回复。

一般解决方案

这是一般解决方案;这不是我最漂亮的工作,但它 工作:

#!/usr/bin/perl

use 5.010;
use strict;
use warnings;

our($MIN_PICK, $MAX_PICK) = (1, 2);

our @List_of_Sets = (
    [ qw[ x y ] ],
    [ qw[ a   ] ],
    [ qw[ 1 2 ] ],
);

sub dequeue($$) {
    my($leader, $body) = @_;
    $body =~ s/^\s*\Q$leader\E ?//gm;
    return $body;
}

################################

my $gunk     = " (?&gunk) ";
my $alter_rx = join("\n\t(?&post)\n" => map {
                  " $gunk ( "
                . join(" | " => map { quotemeta } @$_)
                . " ) * $gunk "
              } @List_of_Sets);
##print "ALTER_RX <\n$alter_rx\n>\n";

my $string = join(" = ", map { join(" ", @$_) } @List_of_Sets);
##print "STRING: $string\n";

my $numbers_list    = join(", " => map {  '$' . $_        } 1 .. @List_of_Sets);
my $numbers_bracket = join(" "  => map { '<$' . $_  . '>' } 1 .. @List_of_Sets);

my $print_statement = dequeue "|QQ|" => <<"PRINT_STATEMENT";

    |QQ|
    |QQ|    (?{
    |QQ|        no warnings qw(uninitialized);
    |QQ|        my \$size = grep { length } $numbers_list;
    |QQ|        print "$numbers_bracket\\n"
    |QQ|            if \$size >= $MIN_PICK &&
    |QQ|               \$size <= $MAX_PICK &&
    |QQ|             ! \$seen{$numbers_list}++;
    |QQ|    })
    |QQ|

PRINT_STATEMENT   
## print "PRINT $print_statement\n";

my $search_rx = do {
    use re "eval";
    my %seen;
    qr{
        ^
    $alter_rx
        $

    $print_statement

        (*FAIL)

        (?(DEFINE)
            (?<post>   =    )
            (?<gunk> [^=] * )
        )
    }x;
};
## print qq(SEARCH:\n"$string" =~ $search_rx\n);

# run, run, run!!
$string =~ $search_rx;

在某种程度上关注你希望从中获得的可能性。可能你应该将我上面概述的这个过程放在管道的另一端,以便你可以从中读取你想要的多少,然后挂断电话,可以这么说,当你有你的填充

我意识到这是一个相当不寻常的解决方案;我的代码经常是。 :)

我只是想你也可以让正则表达式回溯的详尽排列性为你做的工作。

也许其他人会拉出Some::Abstruse::Module为你做这份工作。你只需要权衡你喜欢的东西。

编辑:提高易读性,处理重复项和额外的最低/最高标准。

答案 1 :(得分:2)

也是递归解决方案,但是传递子集内置的sofar,因此您可以在达到最大大小时立即停止。

#!/opt/perl/bin/perl

use strict;
use warnings;
use 5.010;

sub subsets
{
    my ($sets, $maxSize, $subset) = @_;
    $subset //= [ ];

    # If we already have $maxSize elements, we're done
    return ($subset) if @$subset == $maxSize;

    # If we have no sets left to pick from, we're done
    return ($subset) if !@$sets;

    # Consider the next set
    my @remainingSets = @$sets;
    my $nextSet = shift(@remainingSets);

    # We can choose either 0 or 1 element from this set, continue with the rest
    return (subsets(\@remainingSets, $maxSize, $subset),
            map { subsets(\@remainingSets, $maxSize, [@$subset, $_]) }
                @$nextSet);
}

my $sets = [ [qw(x y)], [qw(A)], [qw(1 2)] ];
my @subsets = subsets($sets, 2);

foreach my $subset (@subsets) {
    say '(', join(', ', @$subset), ')';
}

答案 2 :(得分:0)

你可以创建一个“状态变量”来跟踪对sets_aoa_to_subsets_aoa的调用次数,然后在你的treminal条件中检查它:

{
    my $count=0;
    sub sets_aoa_to_subsets_aoa {
        $count++;
        my ($aoa,$number_of_calls) = @_ // confess;
    if ( (scalar( @{$aoa} ) == 0) or ($count == $number_or_calls)) {
            return [ [] ];
        }
    ......
    }
    }

答案 3 :(得分:0)

    foreach my $e ( @{$a} ) {
        push @new_subsets, [ $e, @{$subset_a} ];
    }

简单地传递$items_wanted参数,并跳过突出显示的代码位@{$subset_a} > $items_wanted。由于上面的行已生成所有不添加其他项的组合,因此无需进一步更改即可生效。