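I have web log files and I'm having a lot of trouble with them; I'm new to Perl. I just need a script that counts each type of image found. I can list them, but I'm not sure how to get a count, for example "there were x jpgs and x gifs viewed".
My code so far looks like this: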
use warnings;
open FILE, "jan28.log";
while ($line = <FILE>) {
if ($line =~ /.jpg/) {
print $line;
}
elsif ($line =~ /.gif/) {
print $line;
}
elsif ($line =~ /tiff/) {
print $line;
}
}
The web log looks like this:
24.131.83.162 - - [28/Jan/2007:00:00:00 -0500] "GET /~taler/images/index_09.jpg HTTP/1.1" 200 1563
207.46.98.53 - - [28/Jan/2007:00:00:04 -0500] "GET /%7Edist/programs/PhD/PhDGuide/guideA.htm HTTP/1.0" 200 19090
74.6.74.184 - - [28/Jan/2007:00:00:12 -0500] "GET /%7Embsclass/hall_of_fame/myicon.ico HTTP/1.0" 200 760
58.68.24.3 - - [28/Jan/2007:00:00:16 -0500] "GET /~dtipper/tipper.html HTTP/1.1" 200 5896
58.68.24.3 - - [28/Jan/2007:00:00:16 -0500] "GET /~dtipper/gifs/head.jpg HTTP/1.1" 200 18318
Answer 0: (score: 2)
use strict;
use warnings;
use feature qw( say );
use URI qw( );
my $jpegs = 0;
my $gifs = 0;
while (<>) {
    chomp;
    my ($req, $code) = /^(?:\S+\s+){3}\[[^\]]*\] "([^"]*)"\s*(\S+)/
        or next;
    $code >= 200 && $code < 300
        or next;
    my ($meth, $url) = split(' ', $req);
    $url = URI->new($url, 'http');
    my $path = $url->path;
    if    ($path =~ /\.jpe?g\z/i) { ++$jpegs; }
    elsif ($path =~ /\.gif\z/i  ) { ++$gifs;  }
}
say "There were $jpegs jpgs and $gifs gifs viewed";
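As a quick check of why the patterns above escape the dot and anchor at the end of the path, here is a minimal sketch. The helper names is_gif_loose and is_gif_strict are hypothetical, introduced only for illustration; the paths come from the sample log in the question.

```perl
use strict;
use warnings;

# Hypothetical helpers, for illustration only.
# Loose: an unescaped dot matches any character, anywhere in the string.
sub is_gif_loose  { my ($path) = @_; return $path =~ /.gif/     ? 1 : 0 }
# Strict: a literal dot, the extension, then the end of the string.
sub is_gif_strict { my ($path) = @_; return $path =~ /\.gif\z/i ? 1 : 0 }

for my $path ('/~dtipper/gifs/head.jpg', '/~taler/images/index_09.jpg') {
    printf "%-30s loose=%d strict=%d\n",
        $path, is_gif_loose($path), is_gif_strict($path);
}
```

The first path is a JPEG, yet the loose pattern accepts it because of the gifs directory name; the strict pattern does not.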
Answer 1: (score: 0)
Try this (in a shell):
perl -wane '
END{
print "there\047s was $hash{$_} items for $_\n" for sort keys %hash;
}
$key = $1 if m!.*\.(jpe?g|gif|ico)\b!i;
$hash{$key}++
' filename.txt
If you want a real script with the same logic, the Deparse module (B::Deparse) can help:
$ perl -MO=Deparse -wane '
END{
print "there\047s was $hash{$_} items for $_\n" for sort keys %hash;
}
$key = $1 if m!.*\.(jpe?g|gif|ico)\b!i;
$hash{$key}++
' filename.txt
The "deparsed" resulting script:
BEGIN { $^W = 1; }
LINE: while (defined($_ = <ARGV>)) {
our(@F) = split(' ', $_, 0);
sub END {
print "there's was $hash{$_} items for $_\n" foreach (sort keys %hash);
}
$key = $1 if /.*\.(jpe?g|gif|ico)\b/i;
++$hash{$key};
}
-e syntax OK
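One thing to watch in the one-liner: $key is a package variable that keeps its value between lines, so a line with no image extension still increments the count of the previous match. Below is a minimal sketch of a variant that counts only when the current line matches. The helper name count_extensions is hypothetical, and the in-memory log simply reuses the sample lines from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: count image extensions seen on a filehandle.
sub count_extensions {
    my ($fh) = @_;
    my %count;
    while (my $line = <$fh>) {
        # Increment only when this line actually matches,
        # so no stale key carries over from an earlier line.
        $count{ lc $1 }++ if $line =~ m!.*\.(jpe?g|gif|ico)\b!i;
    }
    return \%count;
}

# Sample lines from the question, read through an in-memory filehandle.
my $log = <<'END_LOG';
24.131.83.162 - - [28/Jan/2007:00:00:00 -0500] "GET /~taler/images/index_09.jpg HTTP/1.1" 200 1563
207.46.98.53 - - [28/Jan/2007:00:00:04 -0500] "GET /%7Edist/programs/PhD/PhDGuide/guideA.htm HTTP/1.0" 200 19090
74.6.74.184 - - [28/Jan/2007:00:00:12 -0500] "GET /%7Embsclass/hall_of_fame/myicon.ico HTTP/1.0" 200 760
58.68.24.3 - - [28/Jan/2007:00:00:16 -0500] "GET /~dtipper/tipper.html HTTP/1.1" 200 5896
58.68.24.3 - - [28/Jan/2007:00:00:16 -0500] "GET /~dtipper/gifs/head.jpg HTTP/1.1" 200 18318
END_LOG

open my $fh, '<', \$log or die "Cannot open in-memory log: $!";
my $count = count_extensions($fh);
close $fh;

print "there were $count->{$_} items for $_\n" for sort keys %$count;
```

On the five sample lines this counts two jpg requests and one ico request; the .htm and .html lines are not counted.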
Answer 2: (score: 0)
Here is a basic example, though there are probably some log parser modules on CPAN.
use File::Open::OOP qw(oopen);
use Data::Dump qw(dump);
my $fh = oopen 'log';
my %hash;
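while ( my $row = $fh->readline ) {
    # Capture the extension of the requested path; skip lines that do not
    # match, so a whole log line never ends up as a hash key.
    my ($ext) = $row =~ m{"GET /\S*\.(\w+) } or next;
    $hash{$ext} += 1;
}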
dump(%hash);
Output for the sample:
$ perl script.pl
("html", 1, "ico", 1, "jpg", 2, "htm", 1)
$
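To turn a hash of counts like this into the sentence the question asked for, something along these lines might work. This is only a sketch: the helper name summarize is hypothetical, the counts are the ones from the sample output above, and the // operator assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: build a one-line summary for the chosen extensions.
# $count is a hash reference of extension => number of requests.
sub summarize {
    my ($count, @exts) = @_;
    # Extensions that never appeared are reported as 0.
    my @parts = map { ($count->{$_} // 0) . " ${_}s" } @exts;
    return 'There were ' . join(' and ', @parts) . ' viewed';
}

my %hash = (html => 1, ico => 1, jpg => 2, htm => 1);
print summarize(\%hash, qw(jpg gif)), "\n";
```

With the sample counts this prints "There were 2 jpgs and 0 gifs viewed".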