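I have two CSV files, file1.csv and file2.csv. For each row in file1 I need to take the value in column 3 and search column 3 of file2 for a match. When a match is found, the complete matching row from file2.csv (columns 1, 2, and 3) should be displayed. So far my code only extracts column 3 from both CSV files. How can I match column 3 of the two files and display the matching rows? Please help.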
File1:
Comp_Name,Date,Files
Component1,2013/04/01,/Com/src/folder1/folder2/newfile.txt;
Component1,2013/04/24,/Com/src/folder1/folder2/testfile24;
Component1,2013/04/24,/Com/src/folder1/folder2/testfile25;
Component1,2013/04/24,/Com/src/folder1/folder2/testfile26;
Component1,2013/04/25,/Com/src2;
File2:
Comp_name,Date,Files
Component1,2013/04/07,/Com/src/folder1/folder2/newfile.txt;
Component1,2013/04/24,/Com/src/folder1/folder2/testfile24;
Component1,2013/04/24,/Com/src/folder1/folder2/testfile25;
Component2,2013/04/23,/Com/src/folder1/folder2/newfile.txt;
Component3,2013/04/27,/Com/src/folder1/folder2/testfile24;
Component1,2013/04/25,/Com/src2;
Output format:
Comp_Name,Date,Files
Component1,2013/04/07,/Com/src/folder1/folder2/newfile.txt;
Component2,2013/04/23,/Com/src/folder1/folder2/newfile.txt;
Component1,2013/04/24,/Com/src/folder1/folder2/testfile24;
Component3,2013/04/27,/Com/src/folder1/folder2/testfile24;
Component1,2013/04/24,/Com/src/folder1/folder2/testfile25;
Component1,2013/04/25,/Com/src2;
Code:
use strict;
use warnings;
my $file1 = "C:\\pick\\file1.csv";
my $file2 = "C:\\pick\\file2.csv";
my $file3 = "C:\\pick\\file3.csv";
my $type;
my $type1;
my @fields;
my @fields2;
open(my $fh, '<:encoding(UTF-8)', $file1) or die "Could not open file '$file1' $!"; #Throw error if file doesn't open
while (my $row = <$fh>) # reading each row till end of file
{
    chomp $row;
    @fields = split ",",$row;
    $type = $fields[2];
    print"\n$type";
}
open(my $fh2, '<:encoding(UTF-8)', $file2) or die "Could not open file '$file2' $!"; #Throw error if file doesn't open
while (my $row2 = <$fh2>) # reading each row till end of file
{
    chomp $row2;
    @fields2 = split ",",$row2;
    $type1 = $fields2[2];
    print"\n$type1";
    foreach($type)
    {
        if ($type eq $type1)
        {
            print $row2;
        }
    }
}
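Answer 0 (score: 0)
This is a job for a hash (for example, my %hash1).
Rather than repeatedly reopening the files, read the contents of file1 into a hash keyed on the third column:
@fields = split ",",$row;
$type = $fields[2];
$hash1{$type} = $row;
I see that your data also has duplicates, so a plain assignment would overwrite the earlier entry whenever a key repeats.
Instead, you can store an array of rows under each key:
$hash1{$type} = [] unless $hash1{$type};
push @{$hash1{$type}}, $row;
Your next problem is then how to iterate over an array stored in a hash.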
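Iterating over a hash whose values are array references could be sketched as follows. This is only an illustration: group_by_third_column is a hypothetical helper name, and the sample lines are a small in-memory stand-in for one of the CSV files rather than a real file read.

```perl
use strict;
use warnings;

# Group CSV lines by their third column: key => array ref of full lines.
# (group_by_third_column is a hypothetical name used for illustration.)
sub group_by_third_column {
    my @lines = @_;
    my %rows_for;
    for my $line (@lines) {
        chomp $line;
        my @fields = split /,/, $line;
        next unless defined $fields[2];            # skip short or blank lines
        push @{ $rows_for{ $fields[2] } }, $line;  # the array is created on first push
    }
    return \%rows_for;
}

# Sample data standing in for a CSV file (in-memory, for illustration only).
my @sample = (
    "Component1,2013/04/07,/Com/src/folder1/folder2/newfile.txt;\n",
    "Component2,2013/04/23,/Com/src/folder1/folder2/newfile.txt;\n",
    "Component1,2013/04/25,/Com/src2;\n",
);
my $rows_for = group_by_third_column(@sample);

# Outer loop walks the keys; inner loop walks the rows stored under each key.
for my $key (sort keys %$rows_for) {
    for my $row (@{ $rows_for->{$key} }) {
        print "$key => $row\n";
    }
}
```

Because Perl autovivifies the array reference on the first push, the explicit "= [] unless ..." initialisation is optional here.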
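Answer 1 (score: 0)
Here is an example using my Tie::Array::CSV module. It uses some clever Perl tricks to represent each CSV file as a Perl array of arrayrefs. I use it to build an index of the first file, then loop over the second file, and finally write to a third file.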
#!/usr/bin/env perl
use strict;
use warnings;
use Tie::Array::CSV;
tie my @file1, 'Tie::Array::CSV', 'file1' or die 'Cannot tie file1';
tie my @file2, 'Tie::Array::CSV', 'file2' or die 'Cannot tie file2';
tie my @output, 'Tie::Array::CSV', 'output' or die 'Cannot tie output';
# set up a match table from file1 (keyed on the last column)
my %match = map { ( $_->[-1] => 1 ) } @file1[1..$#file1];
#header
push @output, $file2[0];
# iterate over file2
for my $row ( @file2[1..$#file2] ) {
    next unless $match{$row->[-1]}; # check for match
    push @output, $row; # print to output if match
}
I get different output from yours, but I can't figure out why your output doesn't include testfile25 and src2.
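Answer 2 (score: 0)
This is not an overly complicated problem. Personally I would use the Text::CSV_XS module, or the already mentioned Tie::Array::CSV, to do this.
If you have trouble using modules, I guess this could be an alternative. You can adapt it to your needs; I used the data you provided and got the result you want.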
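use strict;
use warnings;
open my $fh1, '<', 'file1.csv' or die "failed open: $!";
open my $fh2, '<', 'file2.csv' or die "failed open: $!";
open my $out, '>', 'file3.csv' or die "failed open: $!";
# Index file1 by its third column, skipping the header line.
scalar <$fh1>;
my %seen;
while (my $line = <$fh1>) {
    $line =~ s/\s*$//;
    my @fields = split /,/, $line;
    $seen{ $fields[2] } = 1 if defined $fields[2];
}
close $fh1;
# Keep file2's header aside, then split the remaining rows into fields.
my $header = <$fh2>;
$header =~ s/\s*$//;
my @rows = map { s/\s*$//; [ split /,/ ] } <$fh2>;
close $fh2;
# Keep rows whose third column occurs in file1; sort by path, date, name.
my @result =
    map  { join ',', @$_ }
    sort { $a->[2] cmp $b->[2] || $a->[1] cmp $b->[1] || $a->[0] cmp $b->[0] }
    grep { defined $_->[2] && $seen{ $_->[2] } } @rows;
print $out "$_\n" for $header, @result;
close $out;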
__OUTPUT__
Comp_name,Date,Files
Component1,2013/04/07,/Com/src/folder1/folder2/newfile.txt;
Component2,2013/04/23,/Com/src/folder1/folder2/newfile.txt;
Component1,2013/04/24,/Com/src/folder1/folder2/testfile24;
Component3,2013/04/27,/Com/src/folder1/folder2/testfile24;
Component1,2013/04/24,/Com/src/folder1/folder2/testfile25;
Component1,2013/04/25,/Com/src2;
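For comparison, the module-based route recommended at the start of this answer might look roughly like the sketch below. It assumes Text::CSV_XS is installed from CPAN. The helper name matching_rows is hypothetical, and the in-memory strings (including the Component9 row) are made-up stand-ins for file1.csv and file2.csv.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Return the header of the second file plus every row whose third column
# also occurs in the first file. Both arguments are open filehandles.
# (matching_rows is a hypothetical name used for illustration.)
sub matching_rows {
    my ($fh1, $fh2) = @_;
    my $csv = Text::CSV_XS->new({ binary => 1, auto_diag => 1 });

    my %seen;
    $csv->getline($fh1);                        # skip the first file's header
    while (my $row = $csv->getline($fh1)) {
        $seen{ $row->[2] } = 1 if defined $row->[2];
    }

    my @kept = ($csv->getline($fh2));           # keep the second file's header
    while (my $row = $csv->getline($fh2)) {
        push @kept, $row if defined $row->[2] && $seen{ $row->[2] };
    }
    return @kept;
}

# In-memory stand-ins for the two CSV files (sample data only).
my $data1 = "Comp_Name,Date,Files\nComponent1,2013/04/25,/Com/src2;\n";
my $data2 = "Comp_name,Date,Files\n"
          . "Component1,2013/04/25,/Com/src2;\n"
          . "Component9,2013/04/26,/Com/other;\n";
open my $in1, '<', \$data1 or die "failed open: $!";
open my $in2, '<', \$data2 or die "failed open: $!";

# Each returned row is an array ref of fields.
print join(',', @$_), "\n" for matching_rows($in1, $in2);
```

The plain join is enough for this data; for fields that contain commas or quotes, the module's own print method would be the safer way to write rows back out.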