Parsing Juniper firewall logs into a text file in Perl

Time: 2014-12-03 18:41:37

Tags: perl logging

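I'm new to Perl and to programming. I have limited exposure to writing shell scripts in Unix, and I have been working from the Camel book, Programming Perl, 3rd Edition, along with various Perl tutorials I've come across online. I'm trying to take the log file our Juniper firewall creates each night and produce a report on VPN sessions for research purposes. I'm writing and revising a script that reads the log file, parses several variables out of each line of the log, and outputs a report to a text file formatted like this: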

UserID DHCP          Logon    Timeout  Maxsession Logout   Closed   Duration
User1  xxx.xx.xxx.xx 06:23:47                     06:20:45 06:20:45 00:14:33
User2  xxx.xx.xxx.xx 08:01:59          16:01:59            16:01:59 00:57:27
User3  xxx.xx.xxx.xx 09:04:20 09:14:20                     09:14:24 00:10:00
User1  xxx.xx.xxx.xx 17:01:01                     18:05:01 18:05:01 01:04:00

The three cases I am interested in capturing are:
 1. User logs in, user logs out
 2. User logs in, user times out
 3. User logs in, max session reached user times out

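I'm not sure how to handle the timestamps to get the duration for events where it isn't provided. The session duration is sometimes provided, but for the events where it isn't, I need to figure out how to normalize the applicable timestamps and do the calculation to get it. Any ideas or suggestions are greatly appreciated. Thanks!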

Actions

When a user logs in, the following line is generated in the log file:

Nov 30 09:02:45 100.10.10.100 Juniper: 2014-12-08 09:02:02 - ive - [101.10.100.10] DOMAIN\user(myRealm)[myRole] - VPN Tunneling: Session started for user with IPv4 address 100.11.11.123, hostname userHostName

When a user logs out, the following lines are generated in the log file:

Nov 30 14:30:52 100.10.10.100 Juniper: 2014-11-30 14:30:22 - ive - [10.1.100.100] user1(vpn1)[vpn1] - Logout from 100.10.10.100 (session:12345678)

Nov 30 14:30:52 100.10.10.100 Juniper: 2014-11-30 14:30:22 - ive - [10.1.100.100] user1(vpn1)[] - Closed connection to 100.10.10.1 after 1234 seconds, with 1234567 bytes read and 123456789 bytes written

When a user times out, the following lines are generated in the log file:

Nov 30 14:30:52 100.10.10.100 Juniper: 2014-11-30 14:30:22 - ive - [10.1.100.100] user1(vpn1)[] - Closed connection to 100.10.10.1 after 1234 seconds, with 1234567 bytes read and 123456789 bytes written

Nov 14 14:30:52 100.10.10.100 Juniper: 2014-11-30 14:30:22 - ive - [10.1.100.100] user1(vpn1)[vpn1] - Session timed out for user/vpn1 (session:00000000) due to inactivity (last access at 13:43:20 2014/11/30).

When a user reaches the max session timeout, the following lines are generated in the log file:

Nov 30 14:30:52 100.10.10.100 Juniper: 2014-11-30 14:30:22 - ive - [10.1.100.100] user1(vpn1)[vpn1] - Max session timeout for user/vpn1 (session:00000000).

Nov 30 14:30:52 100.10.10.100 Juniper: 2014-11-30 14:30:22 - ive - [10.1.100.100] user1(vpn1)[] - Closed connection to 100.10.10.1 after 1234 seconds, with 1234567 bytes read and 123456789 bytes written

My code so far:

#!/usr/bin/perl
use warnings;
use strict;

#This script convert the specified log file to a report showing each user's ID, DHCP Address, Logon time,
#Logout time, Timeout time, and Maxtimout time. 

#Arrays needed for script
my @fields;
my @user;
my @dhcp;
my @login;
my @logout;
my @close;
my @timeout;
my @maxtimeout;

#Scalars needed for script
my $localtime = localtime();
my $input = '/home/user/bin/Temp/log.txt';
my $output = '>/home/user/bin/Temp/vpnreport.txt';
my $line;
my $fields;
my $userid;
my $jdate;
my $jtime;
my $dhcpaddr;
my $srcaddr;
my $sessionid;
my $sessiondur;
my $lastacctime;
my $lastaccdate;
my $bytesr;
my $bytesw;
my $timestamp;
my $maxrow = 0;
my $currow = 0;
my $i = 0;

#Open the log file
open (VPNLOG, $input) or die "Unable to open the input file:$!\n";

#Open the file(s) to be written to in clobber mode
open (VPNREPORT, $output) or die "Unable to open the output file:$!\n";

#Setup to while loop to process each line
while ($line = <VPNLOG>) {
chomp $line; #Remove the line breaks

#Strip the log's timestamp and IP
$line =~ s/.*Juniper:\s(.*)$/$1/;

#If line contains "Administrators" or "(Admin Users)" ignore it and move on to the next line
unless ($line =~ m/Administrators|(Admin Users)|System()/) {
#Split the line into the @fields array on every " " encountered
@fields = split (/ /, $line);
$jdate = $fields[0];                     #Juniper datestamp
$jdate =~ s/-//g;                        #Remove any occurance of "-" from the date stamp
$jtime = $fields[1];                     #Juniper timestamp
$userid = $fields[6];                    #User ID
$userid =~ s/XXXXXXX.|\(.*\)\[(.*)\]//g; #Remove the "XXXXXXX\" preceding the username and the "(Realm)[Role]"
                                         #trailing the username
#Normalize and recombine jtime and jdate here:
$timestamp = "$jdate $jtime";
#Check to see if line contains string "VPN Tunneling: Session started for user"

if ($line =~ m/VPN Tunneling: Session started for user/) {
    ++$maxrow;
    $dhcpaddr = $fields[17];          #Destination IP address
    $dhcpaddr =~ s/,//g;              #Remove "," trailing the IP address
    $user[$maxrow] = $userid;
    $dhcp[$maxrow] = $dhcpaddr;
    $login[$maxrow] = $timestamp;
    $logout[$maxrow] = "--";
    $close[$maxrow] = "--";
    $timeout[$maxrow] = "--";
    $maxtimeout[$maxrow] = "--";
    }

elsif ($line =~m/Logout/) {
    $dhcpaddr = $fields[10];           #DHCP IP address
    $sessionid = $fields[11];          #Session ID
    $sessionid =~ s/\(session:|\)//g;   #Remove the "(session:" and ")" from the session ID
    for ($currow = $maxrow; $currow >= 1; $currow--) {
        if ($user[$currow] eq $userid and $logout[$currow] eq "--") {
            $logout[$currow] = $timestamp;
            last;
        }
    }
}

elsif ($line =~m/Closed connection/) {
    $dhcpaddr = $fields[11];         #DHCP IP Address
    $sessiondur = $fields[13];       #Duration of session in seconds
    $bytesr = $fields[16];           #Bytes read
    $bytesw = $fields[20];           #Bytes written
    for ($currow = $maxrow; $currow >= 1; $currow--) {
         if ($user[$currow] eq $userid and $close[$currow] eq "--") {
             $close[$currow] = $timestamp;
             last;
         }
     }
}

elsif ($line =~m/Session timed out/) {
    $sessionid = $fields[13];          #Session ID
    $sessionid =~ s/\(session:|\)//g;  #Remove the "(session:" and ")" from the session ID
    $lastacctime = $fields[20];        #Last accessed time
    $lastaccdate = $fields[21];        #Last accessed date
    $lastaccdate =~ s/\).//g;          #Remove the ")" from the last access date
    for ($currow = $maxrow; $currow >= 1; $currow--) {
       if ($user[$currow] eq $userid and $timeout[$currow] eq "--") {
           $timeout[$currow] = $timestamp;
           last;
         }
     }
}

elsif ($line =~m/Max session timeout/) {
    $sessionid = $fields[13];          #Session ID
    $sessionid =~ s/\(session:|\).//g; #Remove the "(session:" and ")" from the session ID
    for ($currow = $maxrow; $currow >= 1; $currow--) {
         if ($user[$currow] eq $userid and $maxtimeout[$currow] eq "--") {
             $maxtimeout[$currow] = $timestamp;
             last;
        }
    }
}

    }
}

#Define the format then output file(s) using printf
#Print the Column headers: UserID, Logon, Logout, Timeout, Maxtimout, Close, Duration
printf VPNREPORT ("%-12s %-12s %-18s %-18s %-18s %-18s %-18s\n", "UserID", "DHCP", "Logon ", "Logout", "Timeout", "Maxtimout", "Close stamp");
print VPNREPORT '-' x 119, "\n";

#Newest record at top of report
#for ($i = $maxrow; $i >= 1; $i--) {

#Oldest record at top of report
for ($i = 0; $i <= $maxrow; $i++) {
printf VPNREPORT ("%-12s %-12s %-18s %-18s %-18s %-18s %-18s\n", $user[$i], $dhcp[$i], $login[$i], $logout[$i], $timeout[$i], $maxtimeout[$i], $close[$i]);
}

#Close the input and output files
close (VPNLOG);
close (VPNREPORT);

4 Answers:

Answer 0 (score: 0)

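Rather than invoking the regex matching engine several times right off the bat (which could become a performance problem if you are going through a lot of logs), it seems there is mileage to be had in first splitting each log line using the hyphen as the delimiter. split is a good choice here. Running regex matches only on the relevant substrings from that point on should hopefully be more efficient.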
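As a rough sketch of that idea, assuming the line layout shown in the question: splitting on " - " (hyphen with surrounding spaces) avoids breaking the hyphens inside the date. The sub name split_log_line and the hash keys are made up for illustration.

```perl
use strict;
use warnings;

# Split one log line on " - " into: syslog prefix, source, "[ip] user", message.
# Assumes the layout from the question; returns nothing for lines that do not fit.
sub split_log_line {
    my ($line) = @_;
    # A limit of 4 keeps any later " - " inside the message text intact.
    my ($stamp, $source, $who, $message) = split / - /, $line, 4;
    return unless defined $message;
    return unless $who =~ /^\[([^\]]*)\]\s+(\S+)/;
    my ($client_ip, $user) = ($1, $2);
    $user =~ s/\(.*//;    # drop the trailing "(Realm)[Role]"
    return { client_ip => $client_ip, user => $user, message => $message };
}

my $sample = 'Nov 30 14:30:52 100.10.10.100 Juniper: 2014-11-30 14:30:22 - ive - '
           . '[10.1.100.100] user1(vpn1)[vpn1] - Logout from 100.10.10.100 (session:12345678)';
my $parts = split_log_line($sample);
print "$parts->{user}: $parts->{message}\n";
```

Only the short message substring then needs to be matched against the event phrases.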

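As for maintaining context between lines that are not necessarily adjacent, I assume you would use the IP address as the identifier for a session. A hash would probably suffice as a first option for keeping track of which IPs you have already seen messages for.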
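A minimal sketch of that bookkeeping, assuming the bracketed address identifies the session and that the message text has already been separated from the rest of the line. The names %open_session and track_event are hypothetical, and the login time in the usage lines is invented for illustration.

```perl
use strict;
use warnings;

my %open_session;    # client IP => timestamp of the session start

# Remember session starts; on an ending event return [ip, start, end].
# Returns undef for other messages or when no start was seen for that IP.
sub track_event {
    my ($ip, $time, $message) = @_;
    if ($message =~ /^VPN Tunneling: Session started/) {
        $open_session{$ip} = $time;
        return;
    }
    if ($message =~ /^(?:Logout|Session timed out|Max session timeout)/) {
        my $start = delete $open_session{$ip};
        return defined $start ? [$ip, $start, $time] : undef;
    }
    return;
}

track_event('10.1.100.100', '2014-11-30 14:09:48',
    'VPN Tunneling: Session started for user with IPv4 address 100.11.11.123, hostname userHostName');
my $done = track_event('10.1.100.100', '2014-11-30 14:30:22',
    'Logout from 100.10.10.100 (session:12345678)');
print join(' ', @$done), "\n" if $done;
```

Deleting the entry on the ending event means a later login from the same address starts a fresh record.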

Answer 1 (score: 0)

Use a hash to hold your "start" information, then look it up when the "end" line comes in.

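For example, as you read in each line, I would parse the date and check for "Juniper:" (immediately discarding every line that doesn't meet the general criteria). You can check for the key phrases at the same time, then do the specific processing each case needs inside the loop: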

my %starts;

while (defined(my $line = <LOGREPORT>)) {
    if ($line =~ s/^... (?:[\d ]\d) \d\d:\d\d:\d\d \S+ Juniper: (\d{4}-\d\d-\d\d \d\d:\d\d:\d\d) - \S+ - \S+ (\S+) - (Primary authentication successful|Logout|Closed connection|Session timed out|Max session timeout)//) {
        my($time, $vpn, $state) = ($1, $2, $3);

        # TODO Normalize $time here if you wish

        if ($state eq 'Primary authentication successful') {
            $starts{$vpn} = $time;

        } elsif (defined(my $start = delete $starts{$vpn})) {
            # TODO Process other information needed from the line and output one line...
            # TODO Also you can use the normalized ($time - $start) for your duration if it isn't available on the rest of the line.

        } else {
            warn "No 'Primary authentication successful' found for: $vpn\n";
        }
    }
}

If you can get by with one massive regex and keep the rest to simple comparisons, it will be fast. Then again, is efficiency really critical here?
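The duration TODO in the loop could be filled in along these lines when the line carries the session length itself. This is only a sketch: closed_duration and format_hms are made-up helper names, and the message wording is taken from the sample "Closed connection" line in the question.

```perl
use strict;
use warnings;

# Pull the session length in seconds out of a "Closed connection" message.
sub closed_duration {
    my ($message) = @_;
    return unless $message =~ /after (\d+) seconds/;
    return $1;
}

# Render a number of seconds as HH:MM:SS for the report's Duration column.
sub format_hms {
    my ($seconds) = @_;
    return sprintf '%02d:%02d:%02d',
        int($seconds / 3600), int(($seconds % 3600) / 60), $seconds % 60;
}

my $message = 'Closed connection to 100.10.10.1 after 1234 seconds, '
            . 'with 1234567 bytes read and 123456789 bytes written';
my $seconds = closed_duration($message);
print format_hms($seconds), "\n" if defined $seconds;    # prints 00:20:34
```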

Answer 2 (score: 0)

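For coding efficiency, I would use regular expressions to isolate the fields needed from each type of line. (I'm assuming the report needs to be generated a few times a day, so execution speed isn't an issue.)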

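I would use a hash of hashes as my data structure. The keys of the first hash are the userIds, and each value is a reference to a second hash. The keys of the second hash are the actions (authentication, logout, sessionClose, timeOut, maxSession, etc.), and each value is the timestamp of that action. (This consolidates all of the data for a particular user into one data structure. I'm also assuming the computer has enough RAM to hold all of the data in memory, and that only the timestamps of the actions are needed.)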

With all of the data captured, I would generate the report by walking the hash in sorted userId order, producing one report line per user.
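A small sketch of that structure, with the caveat that the names %sessions, record_action, and report_lines are invented here and the column order is only a guess at what the report wants. The sample values are taken from the table in the question.

```perl
use strict;
use warnings;

my %sessions;    # userId => { action name => timestamp }

# Store the timestamp of one action for one user.
sub record_action {
    my ($user, $action, $timestamp) = @_;
    $sessions{$user}{$action} = $timestamp;
    return;
}

# Build one report line per user, in sorted userId order.
sub report_lines {
    my @actions = qw(authentication timeOut maxSession logout sessionClose);
    my @lines;
    for my $user (sort keys %sessions) {
        my @cells = ($user);
        # Actions never seen for this user show up as "--".
        push @cells, $sessions{$user}{$_} // '--' for @actions;
        push @lines, join ' ', map { sprintf '%-8s', $_ } @cells;
    }
    return @lines;
}

record_action('User3', 'authentication', '09:04:20');
record_action('User3', 'timeOut',        '09:14:20');
record_action('User3', 'sessionClose',   '09:14:24');
record_action('User2', 'authentication', '08:01:59');
record_action('User2', 'maxSession',     '16:01:59');
record_action('User2', 'sessionClose',   '16:01:59');
print "$_\n" for report_lines();
```

One thing to keep in mind with this shape is that a user who connects twice overwrites the earlier timestamps, so storing an array of such hashes per user may be needed if every session has to appear in the report.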

Two other thoughts:

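I would consider capturing the date as well as the time, to handle users who authenticate before midnight and end their session after midnight. I would store the date-time information as seconds since the epoch, which makes the duration calculations easier.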
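One way to sketch this uses the core Time::Piece module. The helper name to_epoch is made up, the timestamp format is the one shown after "Juniper:" in the sample lines, and the two timestamps are invented to show a session that crosses midnight.

```perl
use strict;
use warnings;
use Time::Piece;

# Convert a "YYYY-MM-DD HH:MM:SS" stamp to seconds since the epoch.
# strptime reads the stamp as UTC, which is fine for differences
# as long as both stamps are converted the same way.
sub to_epoch {
    my ($stamp) = @_;
    return Time::Piece->strptime($stamp, '%Y-%m-%d %H:%M:%S')->epoch;
}

my $login    = to_epoch('2014-11-30 23:50:00');
my $logout   = to_epoch('2014-12-01 00:10:00');
my $duration = $logout - $login;
print "$duration\n";    # prints 1200
```

Because the date is part of the value, the subtraction stays correct across the day boundary without any special casing.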

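Also, I would determine whether the log file contains GMT times or local times. You need this information to handle daylight saving time changes correctly.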
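To illustrate why the distinction matters, here is a sketch using the core Time::Local module, where timegm reads the fields as GMT and timelocal reads them in the machine's time zone. The helper stamp_to_epoch and its $is_gmt flag are hypothetical; which reading is right for your logs is something only the firewall configuration can tell you.

```perl
use strict;
use warnings;
use Time::Local qw(timegm timelocal);

# Convert "YYYY-MM-DD HH:MM:SS" to epoch seconds, reading it either
# as GMT or as local time depending on the $is_gmt flag.
sub stamp_to_epoch {
    my ($stamp, $is_gmt) = @_;
    my ($year, $mon, $day, $hour, $min, $sec) =
        $stamp =~ /^(\d{4})-(\d\d)-(\d\d) (\d\d):(\d\d):(\d\d)$/
        or die "Unrecognized timestamp: $stamp\n";
    my @parts = ($sec, $min, $hour, $day, $mon - 1, $year);   # month is 0-based
    return $is_gmt ? timegm(@parts) : timelocal(@parts);
}

printf "GMT reading:   %d\n", stamp_to_epoch('2014-11-30 14:30:22', 1);
printf "Local reading: %d\n", stamp_to_epoch('2014-11-30 14:30:22', 0);
```

The two values differ by the local UTC offset, and that offset changes on the days daylight saving time starts or ends, which is where durations computed from local times can come out an hour off.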

Hope this helps.

Answer 3 (score: 0)

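The code below does what I was originally trying to do with the script. I learned a lot about Perl doing this, and I appreciate everyone's feedback, even where I didn't incorporate it. Thanks very much!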

#!/usr/bin/perl
use warnings;
use strict;

#This script convert the specified log file to a report showing each user's ID, DHCP Address, Logon time,
#Logout time, Timeout time, and Maxtimout time. 

#Arrays needed for script
my @fields;
my @user;
my @dhcp;
my @login;
my @logout;
my @close;
my @timeout;
my @maxtimeout;

#Scalars needed for script
my $localtime = localtime();
my $input = '/home/user/bin/Temp/log.txt';
my $output = '>/home/user/bin/Temp/vpnreport.txt';
my $line;
my $fields;
my $userid;
my $jdate;
my $jtime;
my $dhcpaddr;
my $srcaddr;
my $sessionid;
my $sessiondur;
my $lastacctime;
my $lastaccdate;
my $bytesr;
my $bytesw;
my $timestamp;
my $maxrow = 0;
my $currow = 0;
my $i = 0;

#Open the log file
open (VPNLOG, $input) or die "Unable to open the input file:$!\n";

#Open the file(s) to be written to in clobber mode
open (VPNREPORT, $output) or die "Unable to open the output file:$!\n";

#Setup to while loop to process each line
while ($line = <VPNLOG>) {
chomp $line; #Remove the line breaks

#Strip the log's timestamp and IP
$line =~ s/.*Juniper:\s(.*)$/$1/;

#If line contains "Administrators" or "(Admin Users)" ignore it and move on to the next line
unless ($line =~ m/Administrators|(Admin Users)|System()/) {
#Split the line into the @fields array on every " " encountered
@fields = split (/ /, $line);
$jdate = $fields[0];                     #Juniper datestamp
$jdate =~ s/-//g;                        #Remove any occurance of "-" from the date stamp
$jtime = $fields[1];                     #Juniper timestamp
$userid = $fields[6];                    #User ID
$userid =~ s/XXXXXXX.|\(.*\)\[(.*)\]//g; #Remove the "XXXXXXX\" preceding the username and the "(Realm)[Role]"
                                         #trailing the username
#Normalize and recombine jtime and jdate here:
$timestamp = "$jdate $jtime";
#Check to see if line contains string "VPN Tunneling: Session started for user"

if ($line =~ m/VPN Tunneling: Session started for user/) {
++$maxrow;
$dhcpaddr = $fields[17];          #Destination IP address
$dhcpaddr =~ s/,//g;              #Remove "," trailing the IP address
$user[$maxrow] = $userid;
$dhcp[$maxrow] = $dhcpaddr;
$login[$maxrow] = $timestamp;
$logout[$maxrow] = "--";
$close[$maxrow] = "--";
$timeout[$maxrow] = "--";
$maxtimeout[$maxrow] = "--";
}

elsif ($line =~m/Logout/) {
$dhcpaddr = $fields[10];           #DHCP IP address
$sessionid = $fields[11];          #Session ID
$sessionid =~ s/\(session:|\)//g;   #Remove the "(session:" and ")" from the session ID
for ($currow = $maxrow; $currow >= 1; $currow--) {
    if ($user[$currow] eq $userid and $logout[$currow] eq "--") {
        $logout[$currow] = $timestamp;
        last;
    }
  }
}

elsif ($line =~m/Closed connection/) {
$dhcpaddr = $fields[11];         #DHCP IP Address
$sessiondur = $fields[13];       #Duration of session in seconds
$bytesr = $fields[16];           #Bytes read
$bytesw = $fields[20];           #Bytes written
for ($currow = $maxrow; $currow >= 1; $currow--) {
     if ($user[$currow] eq $userid and $close[$currow] eq "--") {
         $close[$currow] = $timestamp;
         last;
     }
   }
}

elsif ($line =~m/Session timed out/) {
$sessionid = $fields[13];          #Session ID
$sessionid =~ s/\(session:|\)//g;  #Remove the "(session:" and ")" from the session ID
$lastacctime = $fields[20];        #Last accessed time
$lastaccdate = $fields[21];        #Last accessed date
$lastaccdate =~ s/\).//g;          #Remove the ")" from the last access date
for ($currow = $maxrow; $currow >= 1; $currow--) {
   if ($user[$currow] eq $userid and $timeout[$currow] eq "--") {
       $timeout[$currow] = $timestamp;
       last;
     }
   }
}

elsif ($line =~m/Max session timeout/) {
$sessionid = $fields[13];          #Session ID
$sessionid =~ s/\(session:|\).//g; #Remove the "(session:" and ")" from the session ID
for ($currow = $maxrow; $currow >= 1; $currow--) {
     if ($user[$currow] eq $userid and $maxtimeout[$currow] eq "--") {
         $maxtimeout[$currow] = $timestamp;
         last;
    }
  }
}

}
}

#Define the format then output file(s) using printf
#Print the Column headers: UserID, Logon, Logout, Timeout, Maxtimout, Close, Duration
printf VPNREPORT ("%-12s %-12s %-18s %-18s %-18s %-18s %-18s\n", "UserID", "DHCP", "Logon ", "Logout", "Timeout", "Maxtimout", "Close stamp");
print VPNREPORT '-' x 119, "\n";

#Newest record at top of report
#for ($i = $maxrow; $i >= 1; $i--) {

#Oldest record at top of report
for ($i = 1; $i <= $maxrow; $i++) {   #Rows are stored starting at index 1
printf VPNREPORT ("%-12s %-12s %-18s %-18s %-18s %-18s %-18s\n", $user[$i], $dhcp[$i], $login[$i], $logout[$i], $timeout[$i], $maxtimeout[$i], $close[$i]);
}

#Close the input and output files
close (VPNLOG);
close (VPNREPORT);