Comparing CSV files in Perl

Time: 2014-11-19 00:15:34

Tags: perl csv

I have two large CSV files that I need to compare on a daily basis:

File 1:

"Record number";"€ price"
"000001";"€ 19,95"
"000002";"€ 20,50"

File 2:

record number;price;date
000001;18,95;01-01-2014
000002;21,50;02-02-2014

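Each record in file1 ends with a line feed; each record in file2 ends with a carriage return and a line feed.

I'm looking for a way in Perl to compare file1 and file2 by record number and print the differences in the price column to a separate file (record number, price, and date).

I hope someone can offer some advice on how to approach this.

Thanks in advance!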

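1 answer:

Answer 0: (score: 0)

I'm sure there are faster and more efficient ways to do this, but the code below should do what you describe. I compare the two files by building hashes keyed on the record number, so I'm assuming record numbers are unique. I've commented the code to explain each step.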

#!/usr/bin/perl  

use strict;
use warnings;

#declare hashes and arrays
my %file1_price;
my %file2_price;
my %file2_date;
my @file1_records;

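#open the two files with the data, stopping with a message if either cannot be read
open (FILE1, '<', '/path/to/file1.txt') or die "Cannot open file1: $!";
open (FILE2, '<', '/path/to/file2.txt') or die "Cannot open file2: $!";

#open a third file to write the data out to
open (FILE3, '>', '/write/to/new_file.txt') or die "Cannot open output file: $!";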

#get rid of the headers in both files
my @file1 = (<FILE1>);
my $file1_header = shift(@file1);

my @file2 = (<FILE2>);
my $file2_header = shift(@file2);

#start a loop in file 1 and change the data format to compare to file 2
foreach my $file1_data (@file1){
chomp $file1_data;

my ($file1_record, $file1_price) = split(/;/,$file1_data);

# get rid of the quotes in both variables
$file1_record =~s/"//g;
$file1_price =~s/"//g;

#get rid of the € and the space after it, and substitute the comma for a decimal point
$file1_price =~s/€\s*//g;
$file1_price =~s/,/./g;

#create a hash for the price for later comparison
$file1_price{$file1_record} = $file1_price;

push @file1_records, $file1_record;

}

#start a loop in file 2 similar to the above loop ready to compare to file 1    
foreach my $file2_data (@file2){
#file 2 ends each record with a carriage return and a line feed, so strip both
$file2_data =~s/\r?\n\z//;

my ($file2_record, $file2_price, $file2_date) = split(/;/,$file2_data);

#substitute the comma for a decimal point
$file2_price =~s/,/./g;

#create hashes of the data for final comparison below
$file2_price{$file2_record} = $file2_price;
$file2_date{$file2_record} = $file2_date;

}   

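#final loop to compare both files' information
foreach my $record (@file1_records){

#skip records that do not appear in file 2
next unless exists $file2_price{$record};

#find the difference between the two prices, rounded to two decimals
my $difference = sprintf('%.2f', $file1_price{$record} - $file2_price{$record});

#only report records whose price actually differs
next if $difference == 0;

#print out to file: record, file 1 price, file 2 price, difference, date
print FILE3 "$record, $file1_price{$record}, $file2_price{$record}, $difference, $file2_date{$record}\n";

}

close (FILE3) or die "Cannot close output file: $!";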
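
If floating-point rounding on the prices worries you, one option is to do the comparison in whole cents. The following is only a minimal sketch under that assumption, not a drop-in replacement: price_to_cents and changed_prices are hypothetical helper names I made up for illustration, the sample values are the ones from the question, and negative prices are not handled.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn '"€ 19,95"' or '18,95' into integer cents.
# Returns undef when the cleaned-up value does not look like a price.
sub price_to_cents {
    my ($raw) = @_;
    (my $clean = $raw) =~ s/[^0-9,]//g;    # keep digits and the decimal comma only
    return undef unless $clean =~ /\A(\d+)(?:,(\d{1,2}))?\z/;
    my ($euros, $cents) = ($1, defined $2 ? $2 : '');
    $cents .= '0' while length($cents) < 2;    # '5' means 50 cents
    return $euros * 100 + $cents;
}

# Hypothetical helper: given two hash references (record => cents),
# return one "record;old;new;difference" line per record whose price changed.
sub changed_prices {
    my ($old, $new) = @_;
    my @rows;
    for my $record (sort keys %$old) {
        next unless exists $new->{$record};    # record missing from the second file
        my $diff = $old->{$record} - $new->{$record};
        next if $diff == 0;                    # price unchanged
        push @rows, sprintf('%s;%.2f;%.2f;%.2f',
            $record, $old->{$record} / 100, $new->{$record} / 100, $diff / 100);
    }
    return @rows;
}

# Sample values taken from the two files in the question.
my %file1_cents = (
    '000001' => price_to_cents('"€ 19,95"'),
    '000002' => price_to_cents('"€ 20,50"'),
);
my %file2_cents = (
    '000001' => price_to_cents('18,95'),
    '000002' => price_to_cents('21,50'),
);

print "$_\n" for changed_prices(\%file1_cents, \%file2_cents);
```

In a real script you would fill the two hashes inside the file-reading loops instead of hard-coding them, and decide what to do when price_to_cents returns undef.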