I have two large CSV files that I need to compare on a daily basis:
File 1:
"Record number";"€ price"
"000001";"€ 19,95"
"000002";"€ 20,50"
File 2:
record number;price;date
000001;18,95;01-01-2014
000002;21,50;02-02-2014
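File 1 ends each record with a newline; file 2 ends each record with a newline and a carriage return.
I am looking for a way to compare file 1 and file 2 by record number in Perl and to print the differences in the price column to a separate file (record number, price and date).
Could anyone suggest how to approach this?
Thanks in advance!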
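Answer 0 (score: 0)
I am sure there are faster and more efficient ways to do this, but the script below should do the job you describe. It compares the two files by loading them into hashes keyed on record number, so I am assuming that record numbers are unique. I have commented the code to explain each step.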
#!/usr/bin/perl
use strict;
use warnings;
#declare hashes and arrays
my %file1_price;
my %file2_price;
my %file2_date;
my @file1_records;
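#open the two files with the data and stop with a clear message if that fails
open (FILE1, '<', '/path/to/file1.txt') or die "Cannot open file 1: $!";
open (FILE2, '<', '/path/to/file2.txt') or die "Cannot open file 2: $!";
#open a third file to write the data out to
open (FILE3, '>', '/write/to/new_file.txt') or die "Cannot open output file: $!";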
#get rid of the headers in both files
my @file1 = (<FILE1>);
my $file1_header = shift(@file1);
my @file2 = (<FILE2>);
my $file2_header = shift(@file2);
#start a loop in file 1 and change the data format to compare to file 2
foreach my $file1_data (@file1){
    chomp $file1_data;
    my ($file1_record, $file1_price) = split(/;/,$file1_data);
    # get rid of the quotes in both variables
    $file1_record =~s/"//g;
    $file1_price =~s/"//g;
    #get rid of the € and the space after it, then substitute the comma for a decimal point
    $file1_price =~s/€\s*//g;
    $file1_price =~s/,/./g;
    #create a hash for the price for later comparison
    $file1_price{$file1_record} = $file1_price;
    push @file1_records, $file1_record;
}
#start a loop in file 2 similar to the above loop ready to compare to file 1
foreach my $file2_data (@file2){
    #file 2 ends each line with a carriage return and a newline, so strip both
    $file2_data =~ s/\r?\n\z//;
    my ($file2_record, $file2_price, $file2_date) = split(/;/,$file2_data);
    #substitute the comma for a decimal point
    $file2_price =~s/,/./g;
    #create hashes of the data for final comparison below
    $file2_price{$file2_record} = $file2_price;
    $file2_date{$file2_record} = $file2_date;
}
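#final loop to compare both files' information
foreach my $record (@file1_records){
    #skip records that do not appear in file 2
    next unless exists $file2_price{$record};
    #find the difference between the two prices
    my $difference = $file1_price{$record} - $file2_price{$record};
    #only report records whose price actually differs
    next if $difference == 0;
    #print out to file
    print FILE3 "$record, $file1_price{$record}, $file2_price{$record}, $difference, $file2_date{$record}\n";
}
#close the output file so that everything is written to disk
close (FILE3) or die "Cannot close output file: $!";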
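If you would rather keep the clean-up steps in one place, the same idea can be split into small subroutines. This is only a rough sketch, not a drop-in replacement: the names normalize_price, parse_line and price_differences are made up for illustration, the sample lines are copied from the question and held in memory instead of being read from disk, and it assumes that prices have no thousands separators and that no field contains a semicolon.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: keep only digits, comma, dot and minus sign
# (this drops the quotes, the euro sign and the space after it),
# then turn the decimal comma into a decimal point.
sub normalize_price {
    my ($raw) = @_;
    $raw =~ s/[^0-9,.-]//g;
    $raw =~ s/,/./;
    return $raw;
}

# Hypothetical helper: split one semicolon-separated line into fields.
# Accepts both newline-only and carriage return + newline endings.
sub parse_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    my @fields = split /;/, $line;
    s/^"|"$//g for @fields;    # remove enclosing quotes, if any
    return @fields;
}

# Hypothetical helper: return one "record;old;new;date" row for every
# record that exists in both hashes and whose price differs.
sub price_differences {
    my ($old, $new) = @_;    # record => price, record => [price, date]
    my @rows;
    for my $record (sort keys %$old) {
        next unless exists $new->{$record};
        my ($price, $date) = @{ $new->{$record} };
        next if $old->{$record} == $price;
        push @rows, join ';', $record, $old->{$record}, $price, $date;
    }
    return @rows;
}

# Sample data from the question (headers already removed).
my @file1_lines = (qq{"000001";"€ 19,95"\n}, qq{"000002";"€ 20,50"\n});
my @file2_lines = ("000001;18,95;01-01-2014\r\n", "000002;21,50;02-02-2014\r\n");

my (%old, %new);
for my $line (@file1_lines) {
    my ($record, $price) = parse_line($line);
    $old{$record} = normalize_price($price);
}
for my $line (@file2_lines) {
    my ($record, $price, $date) = parse_line($line);
    $new{$record} = [ normalize_price($price), $date ];
}

print "$_\n" for price_differences(\%old, \%new);
```

Because the price clean-up keeps only digits and separators, it does not depend on how the euro sign is encoded in file 1. For real CSV input with quoted semicolons or embedded line breaks, a proper CSV parser would be a safer choice than split.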