Question

I have a silly question. I have 2 files: File A contains 9 or more lines.

life_1,23032018,a_0300,true,21
life_1,23032018,a_0200,true,21
life_1,23032018,a_0100,true,20
life_1,23032018,c_0300,true,21
life_1,23032018,c_0200,true,21
life_1,23032018,c_0100,true,25
life_1,23032018,d_0300,true,23
life_1,23032018,d_0200,true,21
life_1,23032018,d_0100,true,24

File B contains 800 or more lines.

201810021569661,23032018
201810021569678,23032018
201810021569685,23032018
201810021569708,23032018
201810021569715,23032018
201810021569722,23032018
201810021569739,23032018
201810021569746,23032018
201810021569753,23032018
201810021569760,23032018

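I am using Perl to read File A and take its fifth column as the number of File B lines to emit, writing the result to a third file. The key is the second column (23032018) in both files. The third file should look like this: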

life_1,201810021569661,a_0300,true
life_1,201810021569678,a_0300,true
life_1,201810021569685,a_0300,true
life_1,201810021569708,a_0300,true
life_1,201810021569715,a_0300,true
life_1,201810021569722,a_0300,true
life_1,201810021569739,a_0300,true
life_1,201810021569746,a_0300,true
life_1,201810021569753,a_0300,true
life_1,201810021569760,a_0300,true
.
.
.
life_1,201810021569661,a_0200,true and so on.

I wrote the following Perl script, but it always prints the last entry of File B instead of each entry of File B.

$inputfile1 = "FileA";
$inputfile2 = "FileB";

open ( IN1, '<', $inputfile1) || die ( "File $inputfile1 Not Found!" );
open ( IN2, '<', $inputfile2) || die ( "File $inputfile2 Not Found!" );
my %hash;


while ( <IN2> ) {
    chomp;
    my @col1 = split ",";
    $hash{$col1[1]} = $col1[0];
}

while ( <IN1> ) {
    chomp;
    my @col2 = split ",";
    #print $col2[4] . "\n";

        if ( exists( $hash{$col2[1]} ) ) {
                for (my $i=1; $i <= $col2[4]; $i++){
                        print $col2[0] . "," . $hash{$col2[1]} . "," . $col2[2] . "," . $col2[3] . "," . $i . "\n";
                }
        }
}

Can anyone help me solve this?

Answer 1

You made an attempt yourself and got close enough, so I decided to help.

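First: with $hash{$col1[1]} = $col1[0] you overwrite the hash value on every line, so only the last File B entry for each key survives. You want to append each value to an array attached to the key instead: a hash of arrays.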
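To make the difference concrete, here is a minimal sketch using two sample lines from File B. The names %last_only and %all are made up for illustration and do not appear in the original script.

```perl
use strict;
use warnings;

# Two sample FileB-style lines that share the same key (second field).
my @lines = ( "201810021569661,23032018", "201810021569678,23032018" );

my ( %last_only, %all );    # hypothetical names, for illustration only
for my $line (@lines) {
    my ( $id, $key ) = split ",", $line;
    $last_only{$key} = $id;       # plain assignment: each line replaces the previous value
    push @{ $all{$key} }, $id;    # hash of arrays: each line is appended to the key's list
}

print "overwritten: $last_only{23032018}\n";
print "collected:   @{ $all{23032018} }\n";
```

With plain assignment only the second id is left under the key, while the hash of arrays keeps both in file order.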

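Second: when you name the files A and B, then name the file handles IN1 and IN2, and finally name the arrays @col2 and @col1, it confuses me (and others, and probably yourself) that the numbering runs in the opposite order. Good naming makes code more readable.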

open( A, '<', "FileA") or die "FileA not found!";
open( B, '<', "FileB") or die "FileB not found!";
my %hash;
while ( <B> ) {
    chomp;
    my @B = split ",";
    push @{ $hash{$B[1]} }, $B[0];    # hash of arrays: key => list of File B ids
}
while ( <A> ) {
    chomp;
    my @A = split ",";
    print join( ",", $A[0], $_, @A[2,3] )."\n" for @{ $hash{$A[1]} };
}

Third: using field names instead of the "magic numbers" of array indices makes it even more readable. This version also stops after the number of lines given in the fifth column of File A.

my %ts;
while ( <B> ) {
    chomp;
    my($timestamp, $key) = split ",";
    push @{ $ts{$key} }, $timestamp;
}
while ( <A> ) {
    chomp;
    my($life, $key, $id, $bool, $num) = split ",";
    for my $timestamp ( @{ $ts{$key} } ) {
        last if $num-- < 1; #escape for loop after $num prints
        print join( ",", $life, $timestamp, $id, $bool )."\n";
    }
}

Fourth: joins like this are what databases and SQL are good at. Look into them if you often process large amounts of data in different ways. If you are new to databases, perhaps try SQLite first. You can also use SQL from Perl with the DBI module.
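As a rough sketch of the SQL route, the same join could look something like this with DBI and an in-memory SQLite database. Several things here are assumptions rather than anything from the question: the DBI and DBD::SQLite modules are installed, the table and column names (a, b, seq, datekey, flag, num, ts) are invented, the sample rows are shortened with counts of 2 and 1, and "the first N lines of File B" is taken to mean the N smallest ids for a key.

```perl
use strict;
use warnings;
use DBI;    # assumes DBI and DBD::SQLite are installed

# In-memory database; all table and column names are hypothetical.
my $dbh = DBI->connect( "dbi:SQLite:dbname=:memory:", "", "", { RaiseError => 1 } );
$dbh->do("CREATE TABLE a (seq INTEGER PRIMARY KEY, life TEXT, datekey TEXT, id TEXT, flag TEXT, num INTEGER)");
$dbh->do("CREATE TABLE b (ts TEXT, datekey TEXT)");

# A few shortened sample rows; a real script would read FileA and FileB instead.
my $ins_a = $dbh->prepare("INSERT INTO a (life, datekey, id, flag, num) VALUES (?, ?, ?, ?, ?)");
$ins_a->execute( split ",", $_ )
    for "life_1,23032018,a_0300,true,2", "life_1,23032018,a_0200,true,1";

my $ins_b = $dbh->prepare("INSERT INTO b (ts, datekey) VALUES (?, ?)");
$ins_b->execute( split ",", $_ )
    for "201810021569661,23032018", "201810021569678,23032018", "201810021569685,23032018";

# Join on the shared key. The correlated subquery ranks each b row within
# its key, so only the `num` smallest ts values are kept per row of a.
my $rows = $dbh->selectall_arrayref(<<'SQL');
SELECT a.life, b.ts, a.id, a.flag
FROM a JOIN b ON a.datekey = b.datekey
WHERE (SELECT COUNT(*) FROM b AS b2
       WHERE b2.datekey = b.datekey AND b2.ts <= b.ts) <= a.num
ORDER BY a.seq, b.ts
SQL

my @lines = map { join ",", @$_ } @$rows;
print "$_\n" for @lines;

$dbh->disconnect;
```

The correlated subquery is quadratic in the number of rows per key, which is fine for a few hundred lines. For larger inputs a window function such as ROW_NUMBER() would likely be a better fit, provided the bundled SQLite version supports it.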
