I have a file in the following format.
DATE Time, v1,v2,v3
05:33:25,n1,n2,n3
05:34:25,n4,n5,n5
05:35:24,n6,n7,n8
and so on up to 05:42:25.
I want to sum the values v1, v2 and v3 over every 5-minute interval. I wrote the sample code below.
while (<STDIN>) {
my ($dateTime, $v1, $v2, $v3) = split /,/, $_;
my ($date, $time) = split / /, $dateTime;
}
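I can read all the values, but I need help summing them over each 5-minute interval. Can anyone suggest code that adds up the values for every 5 minutes of timestamps?
Required output
05:33 v1(sum 05:33 to 05:37) v2(sum 05:33 to 05:37) v3(sum 05:33 to 05:37)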
05:38 v1(sum 05:38 to 05:42) v2(sum 05:38 to 05:42) v3(sum 05:38 to 05:42)
and so on..
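Answer 0 (score: 1)
This code is a variation on Sinan Ünür's previous answer below, except that:
(1) The timelocal function lets you read in the day, month and year as well, so you can sum over any five-minute gap.
(2) It should handle a final time gap of < 5 minutes.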
#!/usr/bin/perl -w
use strict;
use warnings;
use Time::Local;
use POSIX qw(strftime);
my ( $start_time, $end_time, $current_time );
my ( $totV1, $totV2, $totV3 ); #totals in time bands
while (<DATA>) {
my ( $hour, $min, $sec, $v1, $v2, $v3 ) =
( $_ =~ /(\d+)\:(\d+)\:(\d+)\,(\d+),(\d+),(\d+)/ );
#convert time to epoch seconds
$current_time =
timelocal( $sec, $min, $hour, (localtime)[ 3, 4, 5 ] ); #sec,min,hr
if ( !$end_time ) {
$start_time = $current_time;
$end_time = $start_time + 5 * 60; #plus 5 min
}
if ( $current_time <= $end_time ) {
$totV1 += $v1;
$totV2 += $v2;
$totV3 += $v3;
}
else {
print strftime( "%H:%M:%S", localtime($start_time) ),
" $totV1,$totV2,$totV3\n";
$start_time = $current_time;
$end_time = $start_time + 5 * 60; #plus 5 min
( $totV1, $totV2, $totV3 ) = ( $v1, $v2, $v3 );
}
}
#Print results of final loop (if required)
if ( $current_time <= $end_time ) {
print strftime( "%H:%M:%S", localtime($start_time) ),
" $totV1,$totV2,$totV3\n";
}
__DATA__
05:33:25,29,74,96
05:34:25,41,69,95
05:35:25,24,38,55
05:36:25,96,63,70
05:37:25,84,65,74
05:38:25,78,58,93
05:39:25,51,38,19
05:40:25,86,40,64
05:41:25,80,68,65
05:42:25,4,93,81
Output:
05:33:25 352,367,483
05:39:25 221,239,229
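Because the comparison above uses <=, each band is closed at both ends, so the first band in this output sums six readings (05:33:25 through 05:38:25). If the bands should instead be half-open, as the required output in the question suggests, a minimal sketch follows. The helper name sum_bands and its list-of-rows input are assumptions made for illustration and are not part of the answer's code; seconds since midnight stand in for epoch seconds.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the answer above): sums rows into
# half-open bands [start, start + $width), with $width in seconds.
# Each row is [ $seconds, $v1, $v2, $v3 ] and rows are in time order.
sub sum_bands {
    my ( $width, @rows ) = @_;
    my @bands;
    for my $row (@rows) {
        my ( $t, @vals ) = @$row;

        # Start a new band at the first row, or once $t reaches the band's end.
        if ( !@bands || $t >= $bands[-1]{start} + $width ) {
            push @bands, { start => $t, totals => [ (0) x @vals ] };
        }
        $bands[-1]{totals}[$_] += $vals[$_] for 0 .. $#vals;
    }
    return @bands;
}

# First six rows of the sample data, converted to seconds since midnight.
my @rows = map {
    my ( $ts, @v ) = split /,/;
    my ( $h, $m, $s ) = split /:/, $ts;
    [ $h * 3600 + $m * 60 + $s, @v ];
} qw(05:33:25,29,74,96 05:34:25,41,69,95 05:35:25,24,38,55
     05:36:25,96,63,70 05:37:25,84,65,74 05:38:25,78,58,93);

for my $band ( sum_bands( 5 * 60, @rows ) ) {
    my $t = $band->{start};
    printf "%02d:%02d:%02d %s\n", int( $t / 3600 ), int( $t % 3600 / 60 ),
        $t % 60, join ',', @{ $band->{totals} };
}
```

With these six rows the 05:38:25 reading opens a second band instead of being added to the first.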
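Answer 1 (score: 0)
Obviously not tested much, for lack of sample data. To parse CSV, use Text::CSV_XS or Text::xSV rather than the naive split below.
Notes:
This code does not ensure that the output covers consecutive five-minute intervals if there are gaps in the input data.
You will have problems if the timestamps span multiple days. In fact, if the timestamps are not in 24-hour format, you will have problems even if the data come from a single day.
With those caveats, it should still give you a starting point.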
#!/usr/bin/perl
use strict;
use warnings;
my $split_re = qr/ ?, ?/;
my @header = split $split_re, scalar <DATA>;
my @data;
my $time_block = 0;
while ( my $data = <DATA> ) {
last unless $data =~ /\S/;
chomp $data;
my ($ts, @vals) = split $split_re, $data;
my ($hr, $min, $sec) = split /:/, $ts;
my $secs = 3600*$hr + 60*$min + $sec;
if ( $secs > $time_block + 300 ) {
$time_block = $secs;
push @data, [ $time_block ];
}
for my $i (1 .. @vals) {
$data[-1]->[$i] += $vals[$i - 1];
}
}
print join(', ', @header);
for my $row ( @data ) {
my $ts = shift @$row;
print join(', ',
sprintf('%02d:%02d', (gmtime($ts))[2,1])    # $ts is seconds since midnight, so use gmtime; localtime would apply the local UTC offset
, @$row
), "\n";
}
__DATA__
DATE Time, v1,v2,v3
05:33:25,1,3,5
05:34:25,2,4,6
05:35:24,7,8,9
05:55:24,7,8,9
05:57:24,7,8,9
Output:
DATE Time, v1, v2, v3
05:33, 10, 15, 20
05:55, 14, 16, 18
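The first note above (no output rows for gaps in the input) could be addressed by indexing each block from the first timestamp and emitting zeros for blocks that received no data. This is only a sketch under stated assumptions: the helper name sum_blocks_with_gaps is hypothetical, rows are assumed to be sorted by time within a single day, and blocks are anchored at the first timestamp rather than restarted after each gap as in the code above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: rows are [ $secs_since_midnight, @vals ], sorted by time.
# Returns one [ $block_start, @sums ] per $width-second block, counted from the
# first row, including all-zero rows for blocks with no input.
sub sum_blocks_with_gaps {
    my ( $width, @rows ) = @_;
    return unless @rows;
    my $origin = $rows[0][0];
    my $ncols  = @{ $rows[0] } - 1;
    my @blocks;
    for my $row (@rows) {
        my ( $secs, @vals ) = @$row;
        my $i = int( ( $secs - $origin ) / $width );    # block index
        $blocks[$i][$_] += $vals[$_] for 0 .. $#vals;
    }
    return map {
        my $i    = $_;
        my $sums = $blocks[$i] || [];                   # empty block
        [ $origin + $i * $width, map { $sums->[$_] // 0 } 0 .. $ncols - 1 ];
    } 0 .. $#blocks;
}

my @rows = map {
    my ( $ts, @vals ) = split /,/;
    my ( $hr, $min, $sec ) = split /:/, $ts;
    [ 3600 * $hr + 60 * $min + $sec, @vals ];
} qw(05:33:25,1,3,5 05:34:25,2,4,6 05:35:24,7,8,9 05:55:24,7,8,9 05:57:24,7,8,9);

for my $row ( sum_blocks_with_gaps( 300, @rows ) ) {
    my ( $start, @sums ) = @$row;
    my $label = sprintf '%02d:%02d', int( $start / 3600 ), int( $start % 3600 / 60 );
    print join( ', ', $label, @sums ), "\n";
}
```

With the sample data this prints five rows, three of them all zeros. Because blocks are anchored at the first timestamp, the last block is labelled 05:53 rather than 05:55.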
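Answer 2 (score: 0)
This is a nice problem for Perl to solve. The hardest part is taking the value from the datetime field and working out which 5-minute bucket it belongs to. The rest is just hashes.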
my (%v1,%v2,%v3);
while (<STDIN>) {
my ($datetime,$v1,$v2,$v3) = split /,/, $_;
my ($date,$time) = split / /, $datetime;
my $bucket = &get_bucket_for($time);
$v1{$bucket} += $v1;
$v2{$bucket} += $v2;
$v3{$bucket} += $v3;
}
foreach my $bucket (sort keys %v1) {
print "$bucket $v1{$bucket} $v2{$bucket} $v3{$bucket}\n";
}
Here is one way you could implement &get_bucket_for:
my $first_hhmm;
sub get_bucket_for {
my ($time) = @_;
my ($hh,$mm) = split /:/, $time; # looks like seconds are not important
# buckets are five minutes apart, but not necessarily at multiples of 5 min
# (i.e., buckets could go 05:33,05:38,... instead of 05:30,05:35,...)
# Use the value from the first time this function is called to decide
# what the starting point of the buckets is.
if (!defined $first_hhmm) {
$first_hhmm = $hh * 60 + $mm;
}
my $bucket_index = int(($hh * 60 + $mm - $first_hhmm) / 5);
my $bucket_start = $first_hhmm + 5 * $bucket_index;
return sprintf "%02d:%02d", $bucket_start / 60, $bucket_start % 60;
}
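The same bucket arithmetic can be wrapped in a closure so that the origin is not held in a file-level variable and the bucket width is a parameter. This is a sketch only; make_bucketer is a hypothetical name, not something from the answer above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical variant: returns a closure that maps "HH:MM[:SS]" to the
# label of its bucket. The origin is captured on the closure's first call.
sub make_bucketer {
    my ($width_min) = @_;
    my $origin;    # minutes since midnight of the first time seen
    return sub {
        my ($time) = @_;
        my ( $hh, $mm ) = split /:/, $time;
        my $minutes = $hh * 60 + $mm;
        $origin = $minutes unless defined $origin;
        my $index = int( ( $minutes - $origin ) / $width_min );
        my $start = $origin + $width_min * $index;
        return sprintf '%02d:%02d', int( $start / 60 ), $start % 60;
    };
}

my $bucket_for = make_bucketer(5);
print join( ' ',
    map { $bucket_for->($_) } qw(05:33:25 05:37:25 05:38:25 05:42:25 05:43:25) ),
    "\n";
# prints: 05:33 05:33 05:38 05:38 05:43
```

Each call to make_bucketer yields an independent bucketer, so two input files can be processed in the same program without sharing an origin.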
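Answer 3 (score: 0)
I'm not sure why you would count the intervals from the first timestamp rather than use fixed five-minute intervals (00-05, 05-10, etc.), but here is a quick and dirty way to do it: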
my %output;
my $last_min = -10; # -10 + 5 is less than any positive int.
while (<STDIN>) {
my ($dt, $v1, $v2, $v3) = split(/,/, $_);
my ($h, $m, $s) = split(/:/, $dt);
my $ts = $m + ($h * 60);
if (($last_min + 5) <= $ts) {    # new block once 5 minutes have passed (05:33..05:37, then 05:38)
$last_min = $ts;
}
$output{$last_min}{1} += $v1;
$output{$last_min}{2} += $v2;
$output{$last_min}{3} += $v3;
}
foreach my $ts (sort {$a <=> $b} keys %output) {
my $hour = int($ts / 60);
my $minute = $ts % 60;
printf("%02d:%02d v1(%i) v2(%i) v3(%i)\n", (
$hour,
$minute,
$output{$ts}{1},
$output{$ts}{2},
$output{$ts}{3},
));
}
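Not sure why you would want to do it this way, but there you go: procedural Perl, as an example. If you need more on printf formats, see perldoc -f sprintf.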
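For comparison, if fixed clock-aligned intervals (00-05, 05-10, and so on) are acceptable, the bucket is just the minute count rounded down to a multiple of five. A minimal sketch, with aligned_bucket as a hypothetical helper name and the input assumed to be time-only rows like the sample data:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: label of the clock-aligned bucket that contains
# "HH:MM[:SS]", for buckets $width_min minutes wide.
sub aligned_bucket {
    my ( $time, $width_min ) = @_;
    my ( $h, $m ) = split /:/, $time;
    my $total = $h * 60 + $m;                    # minutes since midnight
    my $start = $total - $total % $width_min;    # round down to bucket start
    return sprintf '%02d:%02d', int( $start / 60 ), $start % 60;
}

my %sum;
for my $line (qw(05:33:25,29,74,96 05:34:25,41,69,95 05:35:25,24,38,55)) {
    my ( $ts, @vals ) = split /,/, $line;
    my $bucket = aligned_bucket( $ts, 5 );
    $sum{$bucket}[$_] += $vals[$_] for 0 .. $#vals;
}

# Zero-padded HH:MM labels sort correctly as strings.
for my $bucket ( sort keys %sum ) {
    print "$bucket @{ $sum{$bucket} }\n";
}
```

Here the first two readings fall into the 05:30 bucket and the third into 05:35, independent of which timestamp appears first in the file.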