Searching a log file using DateTime

Time: 2015-07-20 17:48:27

Tags: perl datetime

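I am reading from a log file and want an option to limit the search to a specific date range. Lines in the log file use the format May 27 09:33:33. I have already separated the date from the rest of the text on each line. I just want to write a statement like this: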

if(the date falls between June 10th and June 20th)

So, as an example, I just want to get the current time:

use DateTime;

my $dt   = DateTime->now;
my $date = $dt->md;  
my $time = $dt->hms;   

But won't that put it in mm-dd format?

1 Answer:

Answer 0 (score: 6)

You should compare timestamps as epoch seconds. Here is an example:

#!/usr/bin/env perl

use strict;
use warnings;

use DateTime::Format::Strptime;
use DateTime;

my $year = DateTime->now->year;

my $date_parser = DateTime::Format::Strptime->new(
    pattern => '%Y %B %d', # YYYY Month DD
);

my $start_date = 'June 10';
my $end_date   = 'June 20';
my $start_epoch = $date_parser->parse_datetime("$year $start_date")
                              ->epoch();
my $end_epoch   = $date_parser->parse_datetime("$year $end_date")
                              ->add( days => 1 )
                              ->epoch(); # Midnight after the end date (exclusive upper bound)

my $parser = DateTime::Format::Strptime->new(
    pattern => '%Y %b %d %T', # YYYY Mon DD HH:MM:SS                                        
);

print "Start Epoch : $start_epoch [ $start_date ]\n";
print "End   Epoch : $end_epoch [ $end_date ]\n";

for my $log_date ('May 27 09:33:33',
                  'Jun 05 09:33:33',
                  'Jun 10 09:33:33',
                  'Jun 20 09:33:33',
                  'Jun 30 09:33:33',) {
    my $epoch = $parser->parse_datetime("$year $log_date")->epoch();
    print "Log   Epoch : $epoch [ $log_date ]\n";
    if ( $start_epoch <= $epoch and $epoch < $end_epoch) {
        # Strictly less than $end_epoch (midnight after the end date), so the whole end day matches
        print "==> Log Epoch is in range\n";
    }
}

This outputs the following:

Start Epoch : 1433894400 [ June 10 ]
End   Epoch : 1434844800 [ June 20 ]
Log   Epoch : 1432719213 [ May 27 09:33:33 ]
Log   Epoch : 1433496813 [ Jun 05 09:33:33 ]
Log   Epoch : 1433928813 [ Jun 10 09:33:33 ]
==> Log Epoch is in range
Log   Epoch : 1434792813 [ Jun 20 09:33:33 ]
==> Log Epoch is in range
Log   Epoch : 1435656813 [ Jun 30 09:33:33 ]
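
If installing DateTime from CPAN is not an option, the same epoch comparison can be sketched with Time::Piece, which has shipped with Perl since 5.10. This is only a minimal sketch and not part of the original answer: the helper names to_epoch and in_range are made up for illustration, the year is hard-coded to 2015 to match the output above, and timestamps are assumed to be UTC.

```perl
#!/usr/bin/env perl

use strict;
use warnings;
use Time::Piece;

my $year = 2015;    # assumption: the log lines carry no year of their own

# Hypothetical helper: parse "YYYY Mon DD HH:MM:SS" and return epoch
# seconds, interpreting the timestamp as UTC.
sub to_epoch {
    my ($datestr) = @_;
    return Time::Piece->strptime($datestr, '%Y %b %d %H:%M:%S')->epoch;
}

# Hypothetical helper: half-open range check, from the start of the first
# day up to (but not including) midnight after the last day.
sub in_range {
    my ($log_date, $start_epoch, $end_epoch) = @_;
    my $epoch = to_epoch("$year $log_date");
    return $start_epoch <= $epoch && $epoch < $end_epoch;
}

my $start_epoch = to_epoch("$year Jun 10 00:00:00");
my $end_epoch   = to_epoch("$year Jun 20 00:00:00") + 24 * 60 * 60;

for my $log_date ('May 27 09:33:33', 'Jun 10 09:33:33', 'Jun 30 09:33:33') {
    printf "%s : %s\n", $log_date,
        in_range($log_date, $start_epoch, $end_epoch) ? 'in range' : 'out of range';
}
```

Keeping the upper bound exclusive mirrors the DateTime version above: any time on June 20 matches without having to spell out 23:59:59.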

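Calculating epoch values without a date library is unwise, because you then have to worry about the number of days since the Unix epoch (January 1, 1970), leap days, and leap seconds, and you will run into enough edge cases to spoil your fun. There are many ways to get this wrong. But there is an alternative: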

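If for some reason you are opposed to using library modules, you can search the log file by converting each date to a canonical form and then selecting only the dates that fall within the range.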

Here is the same example without any modules, using canonicalized dates instead:

#!/usr/bin/env perl

use strict;
use warnings;

my %months = ( jan => 1, feb => 2,  mar => 3,  apr => 4,
               may => 5, jun => 6,  jul => 7,  aug => 8,
               sep => 9, oct => 10, nov => 11, dec => 12 );

my $year = 2015; # TODO: what year is it? Need to worry about Dec/Jan rollover

my @log_dates = (
    'May 27 09:33:33',
    'Jun 05 09:33:33',
    'Jun 10 09:33:33',
    'Jun 20 09:33:33',
    'Jun 30 09:33:33',
);

my $start_date = 'June 10';
my $end_date   = 'June 20';
my $start_canonical = canonical_date_for_mmmdd_hhmmss("$year $start_date 00:00:00");
my $end_canonical   = canonical_date_for_mmmdd_hhmmss("$year $end_date 23:59:59");

for my $log_date (@log_dates) {
    my $canonical_date = canonical_date_for_mmmdd_hhmmss("$year $log_date");
    print "Log Canonical Date : $canonical_date [ $log_date ]\n";
    if ($start_canonical <= $canonical_date and
        $canonical_date  <= $end_canonical) {
        print "===> Date in range\n";
    }
}

sub canonical_date_for_mmmdd_hhmmss {
    my ($datestr) = @_;
    my ($year, $mon, $day, $hr, $min, $sec) =
        $datestr =~ m|^(\d+)\s+(\w+)\s+(\d+)\s+(\d+):(\d+):(\d+)$|; # YYYY Month DD HH:MM:SS
    $year > 1900
        or die "Unable to handle year '$year'";
    my $month_first_three = lc( substr($mon,0,3) );
    my $month_num = $months{$month_first_three};
    defined $month_num
        or die "Unable to handle month '$mon'";
    (1 <= $day and $day <= 31)
        or die "Unable to handle day '$day'";
    (0 <= $hr and $hr <= 23)
        or die "Unable to handle hour '$hr'";
    (0 <= $min and $min <= 59)
        or die "Unable to handle minute '$min'";
    (0 <= $sec and $sec <= 59)
        or die "Unable to handle second '$sec'";
    my $fmt = "%04d%02d%02d%02d%02d%02d"; # YYYYMMDDHHMMSS
    return sprintf($fmt, $year, $month_num, $day, $hr, $min, $sec);
}

This outputs the following:

Log Canonical Date : 20150527093333 [ May 27 09:33:33 ]
Log Canonical Date : 20150605093333 [ Jun 05 09:33:33 ]
Log Canonical Date : 20150610093333 [ Jun 10 09:33:33 ]
===> Date in range
Log Canonical Date : 20150620093333 [ Jun 20 09:33:33 ]
===> Date in range
Log Canonical Date : 20150630093333 [ Jun 30 09:33:33 ]
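
The TODO about the year in the script above can be handled in several ways. One possible approach, sketched here under loudly stated assumptions, is to guess the year from the current date: if log entries are never in the future and are less than a year old, a month later than the current month must belong to the previous year. The helper guess_year is hypothetical and ignores the ambiguous case of a later day within the current month.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

my %months = ( jan => 1, feb => 2,  mar => 3,  apr => 4,
               may => 5, jun => 6,  jul => 7,  aug => 8,
               sep => 9, oct => 10, nov => 11, dec => 12 );

# Hypothetical helper: guess the year for a log line that only says "Mon DD".
# Assumption: entries are never in the future and are under a year old, so a
# month later than the current month belongs to the previous year.
sub guess_year {
    my ($log_mon, $now_year, $now_mon) = @_;
    my $mon_num = $months{ lc( substr($log_mon, 0, 3) ) }
        or die "Unable to handle month '$log_mon'";
    return $mon_num > $now_mon ? $now_year - 1 : $now_year;
}

# Current year and month (localtime months are 0-based, years count from 1900).
my @now = localtime;
my ($now_year, $now_mon) = ($now[5] + 1900, $now[4] + 1);

for my $log_date ('Dec 31 23:59:59', 'Jan 01 00:00:01') {
    my ($mon) = $log_date =~ /^(\w+)/;
    printf "%s -> %d\n", $log_date, guess_year($mon, $now_year, $now_mon);
}
```

The guessed year can then be passed to canonical_date_for_mmmdd_hhmmss in place of the hard-coded value.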

For other properties of canonical timestamps, see also ISO 8601 (Data elements and interchange formats).
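
One such property is that fixed-width, zero-padded ISO 8601 timestamps sort lexicographically in chronological order, so plain string comparison is enough for a range check. The following is a minimal sketch of that idea; the helper iso_in_range is hypothetical, and it assumes every timestamp uses the same layout and the same time zone.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Hypothetical helper: inclusive range check using string comparison.
# Assumption: all three values look like YYYY-MM-DDTHH:MM:SS, are
# zero-padded, and share one time zone.
sub iso_in_range {
    my ($stamp, $start, $end) = @_;
    return ($start le $stamp and $stamp le $end);
}

my ($start, $end) = ('2015-06-10T00:00:00', '2015-06-20T23:59:59');

for my $stamp ('2015-05-27T09:33:33',
               '2015-06-10T09:33:33',
               '2015-06-30T09:33:33') {
    printf "%s : %s\n", $stamp,
        iso_in_range($stamp, $start, $end) ? 'in range' : 'out of range';
}
```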