Processing data by iterating over arrays in Perl

Time: 2015-09-17 12:42:11

Tags: arrays perl

I have this script:

use strict;
use warnings;
use diagnostics;
use Math::Vector::Real;
use constant DEG_PER_RAD => 45 / atan2(1, 1);

my ( $source, $out ) = qw/ OUT4 OUTABA12 /;

open my $in_fh,  '<', $source or die qq{Unable to open "$source" for input: $!\n};
open my $out_fh, '>', $out    or die qq{Unable to open "$out" for output: $!\n};

my @data;
push @data, V(split) while <$in_fh>;

my @aoa;
for my $i ( 0 .. $#data ) {
    for my $j ( 0 .. $#data ) {
        my $val1 = $data[$i];
        my $val2 = $data[$j];

        if ($val1 != $val2) {
            my $math = sqrt(($val1->[0] - $val2->[0])**2 +
                ($val1->[1] - $val2->[1])**2 +
                ($val1->[2] - $val2->[2])**2);
            if ($math < 2.2) {
                    push @aoa, [@$val1, @$val2, $math];
            }
        }
    }
}

for my $k ( 0 .. $#aoa-1 ) {
    my $aoadata1 = $aoa[$k];
    my $aoadata2 = $aoa[$k+1];
    my $vect1 = [ @{ $aoa[$k] }[0..2] ];
    my $vect2 = [ @{ $aoa[$k+1] }[0..2] ];
    my $vect3 = [ @{ $aoa[$k] }[3..5] ];
    my $vect4 = [ @{ $aoa[$k+1] }[3..5] ];
    my $math1 = [ @{ $aoa[$k] }[6] ];
    my $math2 = [ @{ $aoa[$k+1] }[6] ];
    my @matha = @$math1;
    my @mathb = @$math2;
    my @vecta = @$vect1;
    my @vectb = @$vect2;
    my @vectc = @$vect3;
    my @vectd = @$vect4;
            if ( @vecta != @vectb ) {
                print "180\n";
            }
}

It starts from this input file, a list of coordinates:

18.474525 20.161419 20.33903
21.999333 20.220667 19.786734
18.333228 21.649157 21.125111
20.371077 19.675844 19.77649
17.04323 19.3106 20.148842
22.941106 19.105412 19.069893

and then generates this intermediate array:

18.474525 20.161419 20.33903 18.333228 21.649157 21.125111 1.68856523042908
18.474525 20.161419 20.33903 20.371077 19.675844 19.77649 2.03694472701863
18.474525 20.161419 20.33903 17.04323 19.3106 20.148842 1.67590865596249
21.999333 20.220667 19.786734 20.371077 19.675844 19.77649 1.71701911532778
21.999333 20.220667 19.786734 22.941106 19.105412 19.069893 1.62621988606553
18.333228 21.649157 21.125111 18.474525 20.161419 20.33903 1.68856523042908
20.371077 19.675844 19.77649 18.474525 20.161419 20.33903 2.03694472701863
20.371077 19.675844 19.77649 21.999333 20.220667 19.786734 1.71701911532778
17.04323 19.3106 20.148842 18.474525 20.161419 20.33903 1.67590865596249
22.941106 19.105412 19.069893 21.999333 20.220667 19.786734 1.62621988606553

Each row is one set of coordinates, followed by another set of coordinates from the list, followed by the distance between them.

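What I have been trying to do is get the second half of the program working, but I can't figure out how to work through the array the way I need to.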

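If the intermediate array contains a row whose first set of coordinates does not appear as the first set of any other row, the program should print only the first coordinate set from that row, followed by 180. For example, for the last row in the array, running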

print "@vecta 180\n";

should return, for that row:

22.941106 19.105412 19.069893 180

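Otherwise, for each row of the intermediate array, I want to check whether the first three coordinates of one row match the first three coordinates of a second row. If they do, I need to take the second set of coordinates from each of the two rows, subtract the shared first set of coordinates from each of them, and then find the angle between the two resulting vectors. Something like this: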

my $varvec1 = V( @$vect3 );
my $varvec2 = V( @$vect4 );
my $varnorm = V( @$vect1 );
my $nvect1 = $varvec1 - $varnorm ;
my $nvect2 = $varvec2 - $varnorm ;
my $degrees = atan2($nvect1, $nvect2) * DEG_PER_RAD;
print "$varnorm $degrees\n";

Running this on the first 3 rows of the intermediate array should return:

18.474525 20.161419 20.33903 *Some Value* 
18.474525 20.161419 20.33903 *Some Value* 
18.474525 20.161419 20.33903 *Some Value* 

where Some Value is the angle computed as above between the first and second rows, the first and third rows, and then the second and third rows.

Overall, the program should ideally give me:

 18.474525 20.161419 20.33903 *Some Value*
 18.474525 20.161419 20.33903 *Some Value*
 18.474525 20.161419 20.33903 *Some Value*
 21.999333 20.220667 19.786734 *Some Value*
 21.999333 20.220667 19.786734 *Some Value*
 18.333228 21.649157 21.125111 180
 20.371077 19.675844 19.77649 *Some Value*
 20.371077 19.675844 19.77649 *Some Value*
 17.04323 19.3106 20.148842 180
 22.941106 19.105412 19.069893 180

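My main problem is walking through the data correctly; I can set up the calculations and variable assignments myself. Can anyone help me with this? Thanks in advance.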

Edit - in response to Sobrique

When I implement the second part (which I assume also covers the first part), along with the other suggestion:

my %seen;

for my $index ( 0 .. $#aoa ) {
    my $coord_key = join( ":", @{ $aoa[$index] }[ 0 .. 2 ] );
    if ( $seen{$coord_key} <= 1 ) {
        print V( @{$aoa[$index]}[0..2] ) . " 180\n";
    }
    else {
        last unless $aoa[ $index + 1 ]; #in case out of bounds
        my $varvec1 = V( @{ $aoa[$index] }[ 3 .. 5 ] );
        my $varvec2 = V( @{ $aoa[ $index + 1 ] }[ 3 .. 5 ] );
        my $varnorm = V( @{ $aoa[$index] }[ 0 .. 2 ] );
        my $nvect1  = $varvec1 - $varnorm;
        my $nvect2  = $varvec2 - $varnorm;
        my $degrees = atan2( $nvect1, $nvect2 ) * DEG_PER_RAD;
        print "$varnorm $degrees\n";

    }
}

I get:

Use of uninitialized value within %seen in numeric le (<=) at
    /Users/a7c/exe/distscript.pl line 64, <$in_fh> line 6 (#1)
    (W uninitialized) An undefined value was used as if it were already
    defined.  It was interpreted as a "" or a 0, but maybe it was a mistake.
    To suppress this warning assign a defined value to your variables.

    To help you figure out what was undefined, perl will try to tell you
    the name of the variable (if any) that was undefined.  In some cases
    it cannot do this, so it also tells you what operation you used the
    undefined value in.  Note, however, that perl optimizes your program
    and the operation displayed in the warning may not necessarily appear
    literally in your program.  For example, "that $foo" is usually
    optimized into "that " . $foo, and the warning will refer to the
    concatenation (.) operator, even though there is no . in
    your program.

{18.474525, 20.161419, 20.33903} 180
{18.474525, 20.161419, 20.33903} 180
{18.474525, 20.161419, 20.33903} 180
{21.999333, 20.220667, 19.786734} 180
{21.999333, 20.220667, 19.786734} 180
{18.333228, 21.649157, 21.125111} 180
{20.371077, 19.675844, 19.77649} 180
{20.371077, 19.675844, 19.77649} 180
{17.04323, 19.3106, 20.148842} 180
{22.941106, 19.105412, 19.069893} 180

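You definitely interpreted it correctly. I'm not sure how to get rid of the "numeric le (<=)" warning, but the final result seems to be messed up because of it.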

2 answers:

Answer 0: (score: 2)

So, the first part:

  

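If the intermediate array contains a row whose first set of coordinates does not appear as the first set of any other row, the program should print only the first coordinate set from that row, followed by 180. For example, the last row in the array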

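I've been a bit lazy and assumed that a join with ":" is good enough. It should be for purely numeric data. Otherwise, you could implement a more specific array equivalence test.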

use Data::Dumper;
print Dumper \@aoa;

my %seen;

foreach my $row (@aoa) {
    my $coord_key = join (":", @$row[0..2] );
    print $coord_key,"\n";
    $seen{$coord_key}++; 
}

foreach my $row ( @aoa ) { 
   my $coord_key = join (":", @$row[0..2] );
   print "Unique:  @$row[0..2] 180\n" unless $seen{$coord_key} > 1;
}

This spits out:

Unique:  18.333228 21.649157 21.125111 180
Unique:  17.04323 19.3106 20.148842 180
Unique:  22.941106 19.105412 19.069893 180

Which I think is the desired result?

For the second part, I'm a bit lost again, but I think you should be able to do something similar.

Perhaps something like this?

for my $index ( 0 .. $#aoa ) {
    my $coord_key = join( ":", @{ $aoa[$index] }[ 0 .. 2 ] );
    if ( $seen{$coord_key} <= 1 ) {
        print "Unique:  @{$aoa[$index]}[0..2] 180\n";
    }
    else {
        last unless $aoa[ $index + 1 ]; #in case out of bounds
        my $varvec1 = V( @{ $aoa[$index] }[ 3 .. 5 ] );
        my $varvec2 = V( @{ $aoa[ $index + 1 ] }[ 3 .. 5 ] );
        my $varnorm = V( @{ $aoa[$index] }[ 0 .. 2 ] );
        my $nvect1  = $varvec1 - $varnorm;
        my $nvect2  = $varvec2 - $varnorm;
        my $degrees = atan2( $nvect1, $nvect2 ) * DEG_PER_RAD;
        print "$varnorm $degrees\n";

    }
}

Which gives:

{18.474525, 20.161419, 20.33903} 114.61436195896
{18.474525, 20.161419, 20.33903} 130.002084392181
{18.474525, 20.161419, 20.33903} 130.002084392181
{21.999333, 20.220667, 19.786734} 109.204553032216
{21.999333, 20.220667, 19.786734} 128.968855749228
Unique:  18.333228 21.649157 21.125111 180
{20.371077, 19.675844, 19.77649} 143.673572115941
{20.371077, 19.675844, 19.77649} 143.673572115941
Unique:  17.04323 19.3106 20.148842 180
Unique:  22.941106 19.105412 19.069893 180
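The "Use of uninitialized value within %seen" warning shown in the question is what you get if this loop runs without the counting loop from the first part having filled %seen beforehand: every lookup then returns undef, undef <= 1 is true, and every row prints 180. A minimal sketch of the two passes together, using made-up rows and a hypothetical helper name count_keys (neither is from the original script):

```perl
use strict;
use warnings;

# Made-up rows for illustration: the first three fields form the key.
my @rows = (
    [ 1, 1, 1, 2, 2, 2 ],
    [ 1, 1, 1, 3, 3, 3 ],
    [ 4, 4, 4, 1, 1, 1 ],
);

# Pass 1 (hypothetical helper): count how often each leading triple occurs.
sub count_keys {
    my @list = @_;
    my %seen;
    $seen{ join ':', @{$_}[ 0 .. 2 ] }++ for @list;
    return \%seen;
}

my $seen = count_keys(@rows);

# Pass 2: classify each row. The "// 0" guards against a missing key,
# so a lookup can never trigger the uninitialized-value warning.
for my $row (@rows) {
    my $key   = join ':', @{$row}[ 0 .. 2 ];
    my $count = $seen->{$key} // 0;
    print "$key ", ( $count <= 1 ? "unique" : "shared" ), "\n";
}
```

The important point is the order: the counts have to be complete before any row is classified, which is why this cannot be folded into a single loop.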

You seem to be printing $varnorm, which includes the {}. You could either change:

print "Unique:  @{$aoa[$index]}[0..2] 180\n";

to:

print V( @{$aoa[$index]}[0..2] ) . " 180\n";

Or change:

print "$varnorm $degrees\n";

to:

print "@{ $aoa[$index] }[ 0 .. 2 ] $degrees\n";

Or perhaps combine them with join ( "\t", to tabulate them, or something similar.

Edit: regarding the array comparison, the approach I suggested is pretty basic and could be tripped up by a few things.

You could instead do something like:

foreach my $row (@aoa) {
    $seen{$row->[0]}{$row->[1]}{$row->[2]}++;
}

if ( $seen{$aoa[$index][0]}{$aoa[$index][1]}{$aoa[$index][2]} <= 1 ) { 
       #....
}

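But this still tests floating-point values for exact equality, which can cause problems because of floating-point precision. If it does, using sprintf to stringify the floats to a known precision might help.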
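As a rough sketch of that sprintf idea: the helper below (coord_key is a hypothetical name, and six decimal places is only an assumption based on the precision of the sample input) formats each coordinate to a fixed number of decimals before joining, so two values that differ only by floating-point noise end up with the same key.

```perl
use strict;
use warnings;

# Hypothetical helper: build a hash key from coordinates rounded to a
# fixed number of decimal places ($places), joined with ":".
sub coord_key {
    my ( $places, @coords ) = @_;
    return join ':', map { sprintf "%.${places}f", $_ } @coords;
}

# Two triples that differ only far beyond the sixth decimal place.
my $key1 = coord_key( 6, 18.474525,        20.161419, 20.33903 );
my $key2 = coord_key( 6, 18.4745250000001, 20.161419, 20.33903 );

print $key1 eq $key2 ? "same key\n" : "different keys\n";
```

The key would then replace the plain join in both loops, for example coord_key( 6, @$row[0..2] ). The trade-off is that points closer together than the chosen precision become indistinguishable.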

Answer 1: (score: 1)

This is a duplicate of a post that I mistakenly added to your original question instead of this new one. I believe it is correct.

I have used the features of the Math::Vector::Real module, as I described in my comment

  • I keep the vectors as objects and avoid accessing their contents directly

  • I calculate the distance between $vec1 and $vec2 as abs($vec2 - $vec1)

  • I use the class's stringification to display a vector, rather than extracting the individual values in my own code

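I have also changed the intermediate data format. The distance is no longer kept because it isn't needed, and the array @groups now contains one array for each group of vector pairs that share a common first vector. Each group has the form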

[ $vec1, $vec2, $vec2, $vec2, ... ]

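I use the first function from List::Util to find the group that each new vector pair belongs to. If an existing group with a matching first value is found, the second vector is simply pushed onto the end of that group; otherwise a new group is created that looks like [ $vec1, $vec2 ]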
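The grouping step on its own can be sketched with plain strings standing in for the vectors. This is only an illustration: group_pairs is a hypothetical helper name, and it compares with eq where the program below uses the == comparison on the vector objects.

```perl
use strict;
use warnings;
use List::Util qw/ first /;

# Hypothetical helper: each argument is a pair [ $head, $tail ].
# Pairs sharing the same $head are collected into one group of the
# form [ $head, $tail, $tail, ... ], in order of first appearance.
sub group_pairs {
    my @pairs = @_;
    my @groups;
    for my $pair (@pairs) {
        my ( $head, $tail ) = @$pair;

        # Find an existing group whose first element matches $head.
        my $group = first { $_->[0] eq $head } @groups;

        if ($group) {
            push @$group, $tail;
        }
        else {
            push @groups, [ $head, $tail ];
        }
    }
    return @groups;
}

my @groups = group_pairs( [ 'A', 'C' ], [ 'A', 'D' ], [ 'B', 'D' ], [ 'A', 'E' ] );
print join( ' ', @$_ ), "\n" for @groups;    # prints "A C D E" then "B D"
```

Scanning @groups with first is linear in the number of groups for every pair. That is fine for a handful of points; a hash keyed on the stringified first vector would be the usual alternative for larger inputs.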

Once the @groups array has been built, it is processed again to generate the output

  • If there are only two values in a group, they are the $vec1 and $vec2 of a unique point, and $vec1 is printed with 180

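  • If there are more than two elements, one line of output is generated for each pair of $vec2 values, containing $vec1 and the angle between the two deltas formed by $vec1 and each $vec2 of the pair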
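To make the angle step concrete without the module, here is a sketch in core Perl. It assumes that atan2 on two Math::Vector::Real objects yields the angle between them, which is what the output further down suggests; angle_deg is a hypothetical helper, and the points are the first group from the sample data.

```perl
use strict;
use warnings;

use constant DEG_PER_RAD => 45 / atan2( 1, 1 );

# Hypothetical helper: angle in degrees, measured at $origin, between
# the directions to $p and $q. Each argument is a ref to three numbers.
sub angle_deg {
    my ( $origin, $p, $q ) = @_;

    # The two deltas relative to the shared first point.
    my @u = map { $p->[$_] - $origin->[$_] } 0 .. 2;
    my @v = map { $q->[$_] - $origin->[$_] } 0 .. 2;

    my $dot   = $u[0] * $v[0] + $u[1] * $v[1] + $u[2] * $v[2];
    my @cross = (
        $u[1] * $v[2] - $u[2] * $v[1],
        $u[2] * $v[0] - $u[0] * $v[2],
        $u[0] * $v[1] - $u[1] * $v[0],
    );
    my $cross_len = sqrt( $cross[0]**2 + $cross[1]**2 + $cross[2]**2 );

    # atan2(|u x v|, u . v) is the angle between u and v, in 0 .. pi.
    return atan2( $cross_len, $dot ) * DEG_PER_RAD;
}

my $vec1 = [ 18.474525, 20.161419, 20.33903 ];
my @vec2 = (
    [ 18.333228, 21.649157, 21.125111 ],
    [ 20.371077, 19.675844, 19.77649 ],
    [ 17.04323,  19.3106,   20.148842 ],
);

# Every unordered pair of second vectors: (0,1), (0,2), (1,2).
for my $i ( 0 .. $#vec2 - 1 ) {
    for my $j ( $i + 1 .. $#vec2 ) {
        printf "%s %.3f\n", "@$vec1", angle_deg( $vec1, $vec2[$i], $vec2[$j] );
    }
}
```

A group with n second vectors produces n(n-1)/2 lines this way, so the three-neighbour group gives three angles and a two-neighbour group gives one.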


use strict;
use warnings;

use Math::Vector::Real qw/ V /;
use List::Util qw / first /;

use constant DEG_PER_RAD => 45 / atan2(1, 1);

my ( $source, $out ) = qw/ OUT4  OUTABA12 /;

open my $in_fh,  '<', $source or die qq{Unable to open "$source" for input: $!\n};

my @data = map V(split), <$in_fh>;

my @groups;

for my $vec1 ( @data ) {
    for my $vec2 ( @data ) {

        next if abs($vec2 - $vec1) > 2.2 or $vec2 == $vec1;

        my $group = first { $_->[0] == $vec1 } @groups;

        if ( $group ) {
            push @$group, $vec2;
        }
        else {
            push @groups, [ $vec1, $vec2 ];
        }
    }
}

open my $out_fh, '>', $out or die qq{Unable to open "$out" for output: $!\n};
select $out_fh;

for my $group ( @groups ) {

    my ($vec1, @vec2) = @$group;

    if ( @vec2 == 1 ) {
        print "$vec1 180\n";
        next;
    }

    for my $i ( 0 .. $#vec2-1 ) {
        for my $j ( $i+1 .. $#vec2 ) {
            my ($vec2a, $vec2b) = @vec2[$i, $j];
            my $angle = atan2( $vec2a - $vec1, $vec2b - $vec1 ) * DEG_PER_RAD;
            print "$vec1 $angle\n";
        }
    }

}

Output

{18.474525, 20.161419, 20.33903} 114.61436195896
{18.474525, 20.161419, 20.33903} 115.382649331314
{18.474525, 20.161419, 20.33903} 130.002084392181
{21.999333, 20.220667, 19.786734} 109.204553032216
{18.333228, 21.649157, 21.125111} 180
{20.371077, 19.675844, 19.77649} 143.673572115941
{17.04323, 19.3106, 20.148842} 180
{22.941106, 19.105412, 19.069893} 180