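Writing a file and then reading it in Perl

Time: 2018-08-23 15:30:59

Tags: perl

I am trying to build primary keys into a new file from an original file (tbl_20180615.txt) that has the following structure: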

573103150033,0664,54,MSS02VEN*',INT,zxzc,,,,,
573103150033,0665,54,MSS02VEN,INT,zxzc,,,,,
573103150080,0659,29,MSS05ARA',INT,zxzc,,,,,
573103150080,0660,29,MSS05ARA ,INT,zxzc,,,,,
573103154377,1240,72,MSSTRI01,INT,zxzc,,,,,
573103154377,1240,72,MSSTRI01,INT,zxzc,,,,,

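I run perl Verify.pl with two arguments: the first is the number of columns used to build the primary key in the new file, and the second is the name of the original file.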

Verify.pl

#!/usr/bin/perl

use strict;
use warnings;

my $n1   = $ARGV[0];
my $name = $ARGV[1];

$n1 =~ s/"//g;
my $n2 = $n1 + 1;

my %seen;

my ( $file3 ) = qw(log.txt);
open my $fh3, '>', $file3 or die "Can't open $file3: $!";

print "Loading file ...\n";
open( my $file, "<", "$name" ) || die "Can't read file somefile.txt: $!";

while ( <$file> ) {

    chomp;
    my @rec = split( /,/, $_, $n2 );    # $n2 limits the split so that only the fields needed for the primary key are separated

    for ( my $i = 0; $i < $n1; $i++ ) {
        print $fh3 "@rec[$i],";
    }

    print $fh3 "\n";
}

close( $file );

print "Done!\n";
######### Check for duplicates
my ($file4) = qw(log.txt);

print "Checking duplicates records...\n\n";

open (my $file4, "<", "log.txt") || die "Can't read file log.txt: $!";

while ( <$file4> ) { 
    print if $seen{$_}++;
}

close($file4);

If I run the following command

perl Verify.pl 2 tbl_20180615.txt

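the code builds a new file named "log.txt" that keeps only the first two columns of the original file, as requested by the first argument. The resulting log.txt looks like this: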

573103150033,0664,
573103150033,0665,
573103150080,0659,
573103150080,0660,
573103154377,1240,
573103154377,1240,

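That part works. However, when the same run then reads the new file log.txt to check for duplicates, nothing is reported. If I comment out the lines that generate log.txt (everything above the "Check for duplicates" marker) and run the script again, the second part works fine and gives me the duplicated lines, which look like this on the command line: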

573103154377,1240
573103154377,1240

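How can I solve this problem?

1 Answer:

Answer 0: (score: 0)

I think this does what you want. It builds a list of unique keys before printing anything, using a hash to check whether each derived key has already been generated.

Note that I have assigned values to @ARGV to simulate the input. You must remove that statement before you can run the program with input from the command line.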

#!/usr/bin/perl

use strict;
use warnings;
use autodie;  # Handle bad IO statuses automatically

local @ARGV = qw/ 2 tbl_20180615.txt /; # For testing only

tr/"//d for @ARGV;  # "

my ($key_fields, $input_file) = @ARGV;
my $output_file = 'log.txt';

my (@keys, %seen);

print "Loading input ... ";
open my $in_fh, '<', $input_file;

while ( <$in_fh> ) {

    chomp;
    my @rec = split /,/;

    my $key = join ',', @rec[0..$key_fields-1];

    push @keys, $key unless $seen{$key}++;
}

print "Done\n";

open my $out_fh, '>', $output_file;
print $out_fh "$_\n" for @keys;
close $out_fh;

Output log.txt

573103150033,0664
573103150033,0665
573103150080,0659
573103150080,0660
573103154377,1240
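
As for why the original script finds nothing when it re-reads log.txt in the same run: the output handle $fh3 is never closed before log.txt is opened for reading, so the buffered output has most likely not been written to disk yet. Closing the handle first should be enough. Below is a minimal sketch of that fix; the helper names write_keys and find_duplicates are hypothetical (they are not in the original code), and the sample records are a subset of the data shown in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: write the first $n fields of each record to $path.
sub write_keys {
    my ($path, $n, @records) = @_;
    open my $out, '>', $path or die "Can't write $path: $!";
    for my $line (@records) {
        # A limit of $n + 1 stops splitting once the key fields are separated
        my @fields = split /,/, $line, $n + 1;
        print $out join(',', @fields[0 .. $n - 1]), "\n";
    }
    # close flushes the buffer; without it the file may still be empty on disk
    close $out or die "Can't close $path: $!";
}

# Hypothetical helper: return each line of $path that repeats an earlier line.
sub find_duplicates {
    my ($path) = @_;
    my %seen;
    open my $in, '<', $path or die "Can't read $path: $!";
    my @dups = grep { $seen{$_}++ } <$in>;
    close $in;
    chomp @dups;
    return @dups;
}

my @records = (
    '573103150033,0665,54,MSS02VEN,INT,zxzc,,,,,',
    '573103154377,1240,72,MSSTRI01,INT,zxzc,,,,,',
    '573103154377,1240,72,MSSTRI01,INT,zxzc,,,,,',
);

write_keys('log.txt', 2, @records);
print "$_\n" for find_duplicates('log.txt');   # prints 573103154377,1240
```

Only the second and later occurrences of a key are reported, which matches the `print if $seen{$_}++` idiom in the question. This version also writes the keys without the trailing comma that the question's loop produces.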