I am trying to build a primary key into a new file from an original file with the following structure (tbl_20180615.txt):
573103150033,0664,54,MSS02VEN*',INT,zxzc,,,,,
573103150033,0665,54,MSS02VEN,INT,zxzc,,,,,
573103150080,0659,29,MSS05ARA',INT,zxzc,,,,,
573103150080,0660,29,MSS05ARA ,INT,zxzc,,,,,
573103154377,1240,72,MSSTRI01,INT,zxzc,,,,,
573103154377,1240,72,MSSTRI01,INT,zxzc,,,,,
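I run perl Verify.pl and pass it two arguments: the first is the number of columns used to build the primary key in the new file, and the second is the name of the original file.
(Verify.pl)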
#!/usr/bin/perl
use strict;
use warnings;
my $n1 = $ARGV[0];
my $name = $ARGV[1];
$n1 =~ s/"//g;
my $n2 = $n1 + 1;
my %seen;
my ( $file3 ) = qw(log.txt);
open my $fh3, '>', $file3 or die "Can't open $file3: $!";
print "Loading file ...\n";
open( my $file, "<", "$name" ) || die "Can't read file somefile.txt: $!";
while ( <$file> ) {
chomp;
my @rec = split( /,/, $_, $n2 ); # $n2 is the split limit used to build the primary key: split only into the desired fields
for ( my $i = 0; $i < $n1; $i++ ) {
print $fh3 "@rec[$i],";
}
print $fh3 "\n";
}
close( $file );
print "Done!\n";
######### Check duplicates
my ($file4) = qw(log.txt);
print "Checking duplicates records...\n\n";
open (my $file4, "<", "log.txt") || die "Can't read file log.txt: $!";
while ( <$file4> ) {
print if $seen{$_}++;
}
close($file4);
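If I run the following command
perl Verify.pl 2 tbl_20180615.txt
the code builds a new file named "log.txt" with the following structure, keeping only the first two columns of the original file, as specified by the first argument: (log.txt)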
573103150033,0664,
573103150033,0665,
573103150080,0659,
573103150080,0660,
573103154377,1240,
573103154377,1240,
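That works, but when the script then reads the new file log.txt to check for duplicates, it finds nothing. However, if I comment out the lines that generate log.txt (the ones listed above, before the comment line that marks the duplicate check), the next part of the code works fine and gives me the two duplicated lines, which look like this:
(command-line output)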
573103154377,1240
573103154377,1240
How can I solve this problem?
Answer 0 (score: 0)
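I think this does what you want. It builds a list of unique derived keys before printing any of them, using a hash to check whether a key has already been generated.
Note that I have assigned values to @ARGV to simulate the input values. You must remove that statement before you can run the program with input from the command line.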
#!/usr/bin/perl
use strict;
use warnings;
use autodie; # Handle bad IO statuses automatically
local @ARGV = qw/ 2 tbl_20180615.txt /; # For testing only
tr/"//d for @ARGV; # "
my ($key_fields, $input_file) = @ARGV;
my $output_file = 'log.txt';
my (@keys, %seen);
print "Loading input ... ";
open my $in_fh, '<', $input_file;
while ( <$in_fh> ) {
chomp;
my @rec = split /,/;
my $key = join ',', @rec[0..$key_fields-1];
push @keys, $key unless $seen{$key}++;
}
print "Done\n";
open my $out_fh, '>', $output_file;
print $out_fh "$_\n" for @keys;
close $out_fh;
(log.txt)
573103150033,0664
573103150033,0665
573103150080,0659
573103150080,0660
573103154377,1240
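A likely reason the original script found no duplicates is that the output handle $fh3 is never closed before log.txt is reopened for reading. Perl buffers file output, so with a small file the lines may not have reached disk yet when the second loop runs. If you want to keep the two-step approach and report the duplicates rather than drop them, the following is a minimal sketch. The helper names write_keys and find_duplicates are hypothetical, and it assumes every input line has at least as many comma-separated fields as the requested key width.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: write the first $n fields of each line of $in_file
# to $out_file, one key per line.
sub write_keys {
    my ($n, $in_file, $out_file) = @_;
    open my $in,  '<', $in_file  or die "Can't read $in_file: $!";
    open my $out, '>', $out_file or die "Can't write $out_file: $!";
    while ( my $line = <$in> ) {
        chomp $line;
        # The limit of $n + 1 leaves the rest of the line in one final field
        my @rec = split /,/, $line, $n + 1;
        print $out join( ',', @rec[ 0 .. $n - 1 ] ), "\n";
    }
    close $in;
    # Closing flushes the buffered output, so a later read sees every line
    close $out or die "Can't close $out_file: $!";
    return;
}

# Hypothetical helper: return every line of $file that repeats an earlier line.
sub find_duplicates {
    my ($file) = @_;
    my ( %seen, @dups );
    open my $fh, '<', $file or die "Can't read $file: $!";
    while ( my $line = <$fh> ) {
        chomp $line;
        push @dups, $line if $seen{$line}++;
    }
    close $fh;
    return @dups;
}

# Run as: perl Verify.pl 2 tbl_20180615.txt
if ( @ARGV == 2 ) {
    my ( $n, $name ) = @ARGV;
    write_keys( $n, $name, 'log.txt' );
    print "Checking duplicate records...\n";
    print "$_\n" for find_duplicates('log.txt');
}
```

Each key is reported once per extra occurrence, so a key that appears twice in the input is printed a single time. In the original script, adding close($fh3) just before the duplicate check should have the same effect as the close inside write_keys here.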