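I'm new to Perl and found this solution: Perl: Compare Two CSV Files and Print out differences
I have gone through many other solutions, and this is the closest one. However, rather than finding the differences between two CSV files, I want to find where the second CSV file matches the first, by column and row. How can I modify the following script to find matches in the columns/rows instead of differences? I hope to dissect this code and learn about arrays from it, but I would also like to work out a solution for this application. Thanks very much.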
use strict;
my @arr1;
my @arr2;
my $a;
open(FIL,"a.txt") or die("$!");
while (<FIL>)
{
    chomp; $a=$_; $a =~ s/[\t;, ]*//g; push @arr1, $a if ($a ne '');
}
close(FIL);
open(FIL,"b.txt") or die("$!");
while (<FIL>)
{
    chomp; $a=$_; $a =~ s/[\t;, ]*//g; push @arr2, $a if ($a ne '');
}
close(FIL);
my %arr1hash;
my %arr2hash;
my @diffarr;
foreach(@arr1) {$arr1hash{$_} = 1; }
foreach(@arr2) {$arr2hash{$_} = 1; }
foreach $a(@arr1)
{
    if (not defined($arr2hash{$a}))
    {
        push @diffarr, $a;
    }
}
foreach $a(@arr2)
{
    if (not defined($arr1hash{$a}))
    {
        push @diffarr, $a;
    }
}
print "Diff:\n";
foreach $a(@diffarr)
{
    print "$a\n";
}
# You can print to a file instead, by: print FIL "$a\n";
OK, I realized that this is closer to what I'm looking for:
use strict;
use warnings;
use feature qw(say);
use autodie;
use constant {
    FILE_1 => "file1.txt",
    FILE_2 => "file2.txt",
};
#
# Load Hash #1 with value from File #1
#
my %hash1;
open my $file1_fh, "<", FILE_1;
while ( my $value = <$file1_fh> ) {
    chomp $value;
    $hash1{$value} = 1;
}
close $file1_fh;
#
# Load Hash #2 with value from File #2
#
my %hash2;
open my $file2_fh, "<", FILE_2;
while ( my $value = <$file2_fh> ) {
    chomp $value;
    $hash2{$value} = 1;
}
close $file2_fh;
Now I want to search the hash for file2 to check whether any of its keys also appear in the hash for file1. This is where I'm stuck.
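One possible way past this point: since the lines of both files are already hash keys, the matches are the keys of %hash2 that also exist in %hash1. The sketch below shows that idea with small in-memory hashes standing in for the two files; the sample values and the helper name common_keys are made up for illustration.

```perl
use strict;
use warnings;
use feature qw(say);

# Hypothetical helper: return the keys present in both hashes, sorted.
sub common_keys {
    my ( $first, $second ) = @_;
    my @common = grep { exists $first->{$_} } keys %$second;
    @common = sort @common;
    return @common;
}

# Made-up sample data standing in for the lines of file1 and file2.
my %hash1 = map { $_ => 1 } qw(alpha beta gamma);
my %hash2 = map { $_ => 1 } qw(beta gamma delta);

# Prints one line each for "beta" and "gamma".
say "Match found: $_" for common_keys( \%hash1, \%hash2 );
```

Using exists rather than a truth test means a key still counts as present even if its stored value happens to be 0 or an empty string.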
With the suggested changes, the code now looks like this:
#!/usr/bin/env perl
use strict;
use warnings;
use feature qw(say);
use autodie;
use constant {
    FILE_1 => "masterlist.csv",
    FILE_2 => "pastebin.csv",
};
#
# Load Hash #1 with value from File #1
#
my %hash1;
open my $file1_fh, "<", FILE_1;
while ( my $value = <$file1_fh> ) {
    chomp $value;
    $hash1{$value} = 1;
}
close $file1_fh;
my %hash2;
open my $file2_fh, "<", FILE_2;
while ( my $value = <$file2_fh> ) {
    chomp $value;
    if ( $hash1{$value} ) {
        print "Match found $value\n";
        $hash2{$value}++;
    }
}
close $file2_fh;
print "Matches found:\n";
foreach my $key ( keys %hash2 ) {
    print "$key found $hash2{$key} times\n";
}
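I updated one section to use split(), and it seems to work, but I need to test it more to confirm whether it fits the solution I'm looking for or whether I still have more work to do on it.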
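#
# Load Hash #1 with value from File #1
#
my %hash1;
open my $file1_fh, "<", FILE_1;
while ( my $value = <$file1_fh> ) {
    chomp $value;
    # Split the line held in $value ($_ is never set in this loop) and
    # store columns 2 and 3 as hash keys (assumed intent of the slice).
    my @fields = ( split /,/, $value )[ 1, 2 ];
    $hash1{$_} = 1 for grep { defined } @fields;
}
close $file1_fh;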
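To make the effect of the split() list slice concrete, here is a minimal sketch on a single in-memory line. The sample row and the helper name pick_fields are invented for illustration, and it assumes plain comma-separated data with no quoted fields; for CSV with embedded commas or quotes, a dedicated CSV parser would be safer than split().

```perl
use strict;
use warnings;
use feature qw(say);

# Hypothetical helper: return columns 2 and 3 (indexes 1 and 2)
# of a comma-separated line.
sub pick_fields {
    my ($line) = @_;
    chomp $line;
    # List slice on split's return value; columns that do not exist
    # are filtered out by the grep below.
    my @picked = ( split /,/, $line )[ 1, 2 ];
    return grep { defined } @picked;
}

# Made-up sample row: id, name, city.
my %seen;
$seen{$_} = 1 for pick_fields("1001,alice,london\n");

# Prints "alice" and "london", one per line.
say for sort keys %seen;
```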
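Answer 0 (score: 1)
So, using your code, you have already read 'file1' into a hash.
Instead of also reading file 2 into a hash, why not check each of its lines against the first hash as you read it: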
my %hash2;
open my $file2_fh, "<", FILE_2;
while ( my $value = <$file2_fh> ) {
    chomp $value;
    if ( $hash1{$value} ) {
        print "Match found $value\n";
        $hash2{$value}++;
    }
}
close $file2_fh;
print "Matches found:\n";
foreach my $key ( keys %hash2 ) {
    print "$key found $hash2{$key} times\n";
}
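Answer 1 (score: 0)
I think this code identifies every position where a data field in file A matches a data field in file B (at least it does on my limited test data):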
use strict;
use warnings;
my @arr1;
my @arr2;
# a.txt -> @arr1
my $file_a_name = "poster_a.txt";
open(FIL,$file_a_name) or die("$!");
my $a_line_counter = 0;
while (my $a_line = <FIL>)
{
    $a_line_counter = $a_line_counter + 1;
    chomp($a_line);
    my @fields = (split /,/,$a_line);
    my $num_fields = scalar(@fields);
    # Trim leading and trailing whitespace from every field
    s{^\s+|\s+$}{}g foreach @fields;
    # Numeric comparison: skip lines that produced no fields
    push @arr1, \@fields if ( $num_fields != 0 );
}
close(FIL);
# b.txt -> @arr2
my $file_b_name = "poster_b.txt";
open(FIL,$file_b_name) or die("$!");
while (my $b_line = <FIL>)
{
    chomp($b_line);
    my @fields = (split /,/,$b_line);
    my $num_fields = scalar(@fields);
    s{^\s+|\s+$}{}g foreach @fields;
    push @arr2, \@fields if ( $num_fields != 0 );
}
close(FIL);
#print "\n",@arr2, "\n";
my @match_array;
my $file_a_line_ctr = 1;
foreach my $file_a_line_fields (@arr1)
{
    my $file_a_column_ctr = 1;
    foreach my $file_a_line_field (@{$file_a_line_fields})
    {
        my $file_b_line_ctr = 1;
        foreach my $file_b_line_fields(@arr2)
        {
            my $file_b_column_ctr = 1;
            foreach my $file_b_field (@{$file_b_line_fields})
            {
                if ( $file_b_field eq $file_a_line_field )
                {
                    my $match_info =
                        "$file_a_name line $file_a_line_ctr column $file_a_column_ctr" .
                        " (${file_a_line_field}) matches: " .
                        "$file_b_name line $file_b_line_ctr column $file_b_column_ctr ";
                    push(@match_array, $match_info);
                    print "$match_info \n";
                }
                $file_b_column_ctr = $file_b_column_ctr + 1;
            }
            $file_b_line_ctr = $file_b_line_ctr + 1;
        }
        $file_a_column_ctr = $file_a_column_ctr + 1;
    }
    $file_a_line_ctr = $file_a_line_ctr + 1;
}
print "there were ", scalar(@match_array)," matches\n";
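The four nested loops compare every field of file A with every field of file B, which may become slow as the files grow. A possible alternative, sketched below, first indexes file B's fields in a hash keyed by field value, so each field of file A needs only one lookup. The helper names parse_rows and find_matches and the in-memory sample rows are assumptions for illustration, and the sketch likewise assumes simple comma-separated data without quoted fields.

```perl
use strict;
use warnings;

# Hypothetical helper: split already-chomped lines on commas and
# trim whitespace around each field; lines with no fields are skipped.
sub parse_rows {
    my @rows;
    for my $line (@_) {
        my @fields = split /,/, $line;
        s/^\s+|\s+$//g for @fields;
        push @rows, \@fields if @fields;
    }
    return @rows;
}

# Hypothetical helper: return one [a_line, a_col, value, b_line, b_col]
# record per matching pair of fields (all positions are 1-based).
sub find_matches {
    my ( $rows_a, $rows_b ) = @_;

    # Index: field value => list of [line, column] positions in B.
    my %index_b;
    for my $r ( 0 .. $#$rows_b ) {
        for my $c ( 0 .. $#{ $rows_b->[$r] } ) {
            push @{ $index_b{ $rows_b->[$r][$c] } }, [ $r + 1, $c + 1 ];
        }
    }

    # One hash lookup per field of A instead of a scan over all of B.
    my @matches;
    for my $r ( 0 .. $#$rows_a ) {
        for my $c ( 0 .. $#{ $rows_a->[$r] } ) {
            my $value = $rows_a->[$r][$c];
            for my $pos ( @{ $index_b{$value} || [] } ) {
                push @matches, [ $r + 1, $c + 1, $value, @$pos ];
            }
        }
    }
    return @matches;
}

# Made-up sample rows standing in for the two files.
my @rows_a = parse_rows( "red, green", "blue" );
my @rows_b = parse_rows( "green,blue", "red" );
for my $m ( find_matches( \@rows_a, \@rows_b ) ) {
    printf "A line %d column %d (%s) matches B line %d column %d\n", @$m;
}
```

The trade-off is extra memory for the index in exchange for fewer comparisons; for small files the nested loops are likely fine as they are.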