Question

我正在尝试研究一种算法，其中，给定一个数字列表，我必须计算一个系数，该系数由数据列表中建立的三角形数量与数字所具有的最小邻居数量之间的比率给出;例如，给定文件的前两行：

1 2 3 4 5 6 9
2 1 3
...

如果行的第一个元素出现在其他行中，并且后续行的第一个元素出现在考试中的行中，那么我找到了一个链接;
如果“ link ”存在，那么我想要计算考试中考虑的行中其他元素出现在链接所在行中的次数并打印“ I找到了z三角形“。

例如，在这种情况下，当程序比较第一行和第二行并发现“链接1 2 ”存在时，发现顶点有1个三角形1,2 ，3。在算法中，我必须将三角形的数量+ 1除以每行中的最小元素数量 - 2（在这种情况下，最小数量来自第二行，值为3-2 = 1）;我正在寻找的系数是（1 + 1）/ 1 = 2;

输出文件将写为：

1 2 1

其中在前两列中我找到了构成链接的元素，在第3列中找到了系数的值;

这是我到目前为止编写的代码：

use strict;
use warnings;
use 5.010;
use List::Util;

my $filename = "data";
open my $fh, '<', $filename or die "Cannot open $filename: $!";

my $output_file = "output_example";
open my $fi, ">", $output_file or die "Error during $output_file opening: $!";

my %vector;
while (<$fh>) {
    my @fields = split;
    my $root = shift @fields;
    $vector{$root} = { map { $_ => 1} @fields };
}

my @roots = sort { $a <=> $b } keys %vector;
for my $i (0 .. $#roots) {
    my $aa = $roots[$i];
    my $n_element_a = scalar (keys %{$vector{$aa}})-1;

    for my $j ($i+1 .. $#roots) {
        my $minimum;
        my $bb = $roots[$j];
        my $n_element_b = scalar (keys %{$vector{$bb}})-1;
        next unless $vector{$aa}{$bb} and $vector{$bb}{$aa};
        if ($n_element_a < $n_element_b){
            $minimum = $n_element_a;
        }else {
            $minimum = $n_element_b;
        }

        my $triangles = 0;
        for my $cc ( keys %{$vector{$aa}} ) {
            next if $cc == $aa or $cc == $bb;
            if ($vector{$bb}{$cc}) {
                $triangles++;
            }
        }

        my $coeff;
        my @minimum_list;           
        if ($minimum == 0){
            $coeff = ($triangles +1)/($minimum+0.00000000001);
        } else {
            $coeff = ($triangles +1)/($minimum);
        }
        say $fi "$aa $bb $coeff";
    }
}
__DATA__
1 2 3 4 5 6 9
2 1 3
3 1 2
4 1 5
5 1 4
6 1 7 8
8 6 7
9 1 10 11
10 9 11 12 14
11 9 10 12 13
12 10 13 14
13 11 12
14 10 12 15
15 14

我把整个数据集放在最后。输出文件给出：

__OUTPUT__
1 2 2
1 3 2
1 4 2
1 5 2
1 6 0.5
1 9 0.5
2 3 2
4 5 2
6 8 2
9 10 1
9 11 1
10 11 1
10 12 1
10 14 1
11 13 2
12 13 1
12 14 1
14 15 100000000000

现在我想找到系数的最小值，识别出现这个较低值的链接，删除原始数据集中的这些元素，并在“新”数据集上重复相同的程序。

例如，在这种情况下，显示最小值的链接是1 6和1 9，系数为0.5。所以现在程序应该在文件数据中删除以“1”开头的行中的元素“6”，反之亦然，并且与9相同。所以现在“新”数据集将是：

1 2 3 4 5
2 1 3
3 1 2
4 1 5
5 1 4
6 7 8
8 6 7
9 10 11
10 9 11 12 14
11 9 10 12 13
12 10 13 14
13 11 12
14 10 12 15
15 14

我要找的是：

如何从data文件中包含的数据集中删除显示最小系数值的元素？
如何迭代进程N次？

要从输出文件中找到最小值，我想在程序末尾添加这些行：

my $file1 = "output_example";
open my $fg,  "<", $file1 or die "Error during $file1 opening: $!";

my @minimum_vector;
while (<$fg>) {
    push @minimum_vector, [ split ];
}

my $minima=$minimum_vector[0][2];
for my $i (0 .. $#minimum_vector){
    if($minima >= $minimum_vector[$i][2] ){
        $minima=$minimum_vector[$i][2];
    }
}
say $minima;
close $file1;

但它给了$minima一个错误，因为我认为我无法读取我刚刚创建的同一个文件（在本例中是output_example文件）。如果我在不同的程序中编译，它就会运行。

Answer 1

迭代的最佳方法可能是将代码分解为子例程。这也有助于澄清代码并准确追踪可能出现的问题。

use strict;
use warnings;
use 5.010;
use List::Utils qw/min/;

sub load_initial_data {
    # open and read file, load it into an arrayref and return it.
}

sub find_coefficients {
    my $data = shift;
    my @results;
    foreach my $row (@$data) {
        # do stuff to calculate $aa, $bb, $coeff
        push @results, [$aa, $bb, $coeff];
    }
    return \@results;
}

sub filter_data {
    my $data = shift;
    my $coefficients = shift;
    my $min = min map { $_->[2] } @$coefficients;
    my @deletions = grep { $min == $_->[2] } @$coefficients;
    foreach my $del (@deletions) {
        delete( $data->{$del->[0]}{$del->[1]} );
    }
}

# doing the actual work:
my $data = load_initial_data(); 
my $coeffs;
foreach my $pass (0 .. $N) {
    $coeffs = find_coefficients( $data );
    $data = filter_data( $data, $coeffs );
    # You could write $data and/or $coeffs out to a file here
    # if you need to keep the intermediate stages
}

迭代删除原始集中数据的程序

1 个答案: