I have a (very large) CSV file in the following format:
key1,val1,val2,val3... ,valn
key2,val2,val5,val1....,valn
...
...
keyn,val7,val9,val11....,valn
key1,val2,val4,val8.....,valn
key2,val10,val12,val14..., valn
...
...
keyn,val2,val4,val8.....,valn
key1,val3,val5,val7... ,valn
key2,val0,val9,val3....,valn
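key1 through keyn (and their values) repeat many times in the CSV file.
The values (val1 ... valn) are doubles (floats).
What I want to print:
1) Starting from the beginning of the file, for each key, I want to compute the difference between its column values (e.g. val2, val4, val6) and the values at the next occurrence of the same key.
So, for example, given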
key1,2,4,6
key2,3,5,7
...
...
key1,4,6,8
key2,4,6,8
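I want to print
key1: difference from the previous record is key1,2,2,2
key2: difference from the previous record is key2,1,1,1
..
keyn: difference from the previous record is ...........
2) Repeat this for every subsequent occurrence of each key.
This is what I have so far (storing the values in a hash):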
#!/usr/bin/perl
my %hash;
open my $fh, '<', 'file1.csv' or die "Cannot open: $!";
while (my $line = <$fh>) {
    $line =~ s/\s*\z//;
    my @array = split /,/, $line;
    my $key = shift @array;
    $hash{$key} = \@array;
}
close $fh;
Answer 0 (score: 2)
You can try this inside the loop:
# get the key.
my $key = shift @array;
# see if the key is already seen.
if (exists $hash{$key}) {
    # get ref to previous record of this key.
    my $ref = $hash{$key};
    # print key.
    print "$key,";
    # a new array.
    my @new_array;
    # populate the new array with (current - previous) for each column.
    for (my $i = 0; $i <= $#array; $i++) {
        $new_array[$i] = $array[$i] - $ref->[$i];
    }
    # join the array elements with comma.
    print join(",", @new_array);
    print "\n";
}
# add/replace the current array as value for the current key.
$hash{$key} = \@array;
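For completeness, here is a minimal self-contained sketch of how the snippet above could sit inside the read loop. The helper name `key_diffs` and the in-memory sample are my own assumptions for illustration; they are not part of the question's code. Because only the most recent record per key is kept, memory use depends on the number of distinct keys rather than on the file size (apart from the collected output lines, which you could print directly instead).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): reads "key,v1,v2,..." lines
# from a filehandle and returns one "key,d1,d2,..." string for every
# repeated key, where each d is (current value - previous value).
sub key_diffs {
    my ($fh) = @_;
    my %prev;    # key => array ref holding the most recent values for that key
    my @out;
    while (my $line = <$fh>) {
        $line =~ s/\s+\z//;              # strip trailing newline/whitespace
        next if $line eq '';             # skip blank lines
        my ($key, @vals) = split /\s*,\s*/, $line;
        if (my $old = $prev{$key}) {
            # Only compare the columns present in both records.
            my $last = $#vals < $#$old ? $#vals : $#$old;
            push @out, join ',', $key, map { $vals[$_] - $old->[$_] } 0 .. $last;
        }
        $prev{$key} = \@vals;            # remember this record for next time
    }
    return @out;
}

# Demo on the small example from the question, via an in-memory filehandle.
my $sample = "key1,2,4,6\nkey2,3,5,7\nkey1,4,6,8\nkey2,4,6,8\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
print "$_\n" for key_diffs($fh);
close $fh;
```

For the sample above this prints `key1,2,2,2` and `key2,1,1,1`. For the real file you would open 'file1.csv' instead of the in-memory string.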
Answer 1 (score: 0)
My attempt:
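use strict;
use warnings;
use Text::CSV_XS;
use Math::Matrix;
my $csv = Text::CSV_XS->new({ binary => 1 });
my %hash;     # key => most recent record seen for that key
my @results;  # one [key, diff1, diff2, ...] row per repeated key
open my $fh, '<', 'file1.csv' or die "Cannot open: $!";
while (my $line = <$fh>) {
    if ($csv->parse($line)) {
        my @array = $csv->fields;
        my $key = shift @array;
        # first occurrence of this key: just remember it.
        if (!exists $hash{$key}) {
            $hash{$key} = \@array;
            next;
        }
        my $previous_record = Math::Matrix->new($hash{$key});
        my $current_record = Math::Matrix->new(\@array);
        # current - previous, so 2,4,6 followed by 4,6,8 gives 2,2,2.
        my $new_record = $current_record->add($previous_record->negative);
        # keep the key together with the single row of differences.
        push @results, [ $key, @{ $new_record->[0] } ];
        $hash{$key} = \@array;
    }
    else {
        my $err = $csv->error_input;
        print "error parsing: $err\n";
    }
}
close $fh;
print join(',', @$_), "\n" for @results;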