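I'm new to Perl and to programming. I have limited experience writing shell scripts in Unix, and I have been working from the Camel book (Programming Perl, 3rd Edition) and various Perl tutorials I've come across online. I'm trying to take the log file that our Juniper firewall creates every night and produce a report on VPN sessions for research purposes. I'm writing and modifying a script that reads the log file, parses several variables out of each line of the log, and outputs a report to a text file formatted like this: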
UserID DHCP Logon Timeout Maxsession Logout Closed Duration
User1 xxx.xx.xxx.xx 06:23:47 06:20:45 06:20:45 00:14:33
User2 xxx.xx.xxx.xx 08:01:59 16:01:59 16:01:59 00:57:27
User3 xxx.xx.xxx.xx 09:04:20 09:14:20 09:14:24 00:10:00
User1 xxx.xx.xxx.xx 17:01:01 18:05:01 18:05:01 01:04:00
The three cases I am interested in capturing are:
1. User logs in, user logs out
2. User logs in, user times out
3. User logs in, max session is reached, user times out
Nov 30 09:02:45 Juniper: 2014-12-08 09:02:02 - ive - [] DOMAIN\user(myRealm)[myRole] - VPN Tunneling: Session started for user with IPv4 address 100.11.11.123, hostname userHostName
Nov 30 14:30:52 Juniper: 2014-11-30 14:30:22 - ive - [] user1(vpn1)[vpn1] - Logout from 100.10.10.100 (session:12345678)
Nov 30 14:30:52 Juniper: 2014-11-30 14:30:22 - ive - [] user1(vpn1)[] - Closed connection to 100.10.10.1 after 1234 seconds, with 1234567 bytes read and 123456789 bytes written
Nov 30 14:30:52 Juniper: 2014-11-30 14:30:22 - ive - [] user1(vpn1)[] - Closed connection to 100.10.10.1 after 1234 seconds, with 1234567 bytes read and 123456789 bytes written
Nov 14 14:30:52 Juniper: 2014-11-30 14:30:22 - ive - [] user1(vpn1)[vpn1] - Session timed out for user/vpn1 (session:00000000) due to inactivity (last access at 13:43:20 2014/11/30).
Nov 30 14:30:52 Juniper: 2014-11-30 14:30:22 - ive - [] user1(vpn1)[vpn1] - Max session timeout for user/vpn1 (session:00000000)
Nov 30 14:30:52 Juniper: 2014-11-30 14:30:22 - ive - [] user1(vpn1)[] - Closed connection to 100.10.10.1 after 1234 seconds, with 1234567 bytes read and 123456789 bytes written
use warnings;
use strict;

# This script converts the specified log file to a report showing each user's ID, DHCP address, logon time,
# logout time, timeout time, and max-timeout time.

# Arrays needed for script
my @fields;
my @user;
my @dhcp;
my @login;
my @logout;
my @close;
my @timeout;
my @maxtimeout;

# Scalars needed for script
my $localtime = localtime();
my $input = '/home/user/bin/Temp/log.txt';
my $output = '>/home/user/bin/Temp/vpnreport.txt';
my $line;
my $fields;
my $userid;
my $jdate;
my $jtime;
my $dhcpaddr;
my $srcaddr;
my $sessionid;
my $sessiondur;
my $lastacctime;
my $lastaccdate;
my $bytesr;
my $bytesw;
my $timestamp;
my $maxrow = 0; # Number of session rows stored so far
my $currow = 0;
my $i = 0;

# Open the log file
open (VPNLOG, $input) or die "Unable to open the input file: $!\n";
# Open the file(s) to be written to in clobber mode
open (VPNREPORT, $output) or die "Unable to open the output file: $!\n";

# Set up the while loop to process each line
while ($line = <VPNLOG>) {
    chomp $line; # Remove the line breaks
    # Strip the log's timestamp and IP
    $line =~ s/.*Juniper:\s(.*)$/$1/;
    # If the line contains "Administrators", "(Admin Users)" or "System()", ignore it and move on to the next line
    unless ($line =~ m/Administrators|\(Admin Users\)|System\(\)/) {
        # Split the line into the @fields array on every " " encountered
        @fields = split (/ /, $line);
        $jdate = $fields[0]; # Juniper datestamp
        $jdate =~ s/-//g; # Remove any occurrence of "-" from the date stamp
        $jtime = $fields[1]; # Juniper timestamp
        $userid = $fields[6]; # User ID
        # Remove the "XXXXXXX\" preceding the username and the "(Realm)[Role]" trailing the username
        $userid =~ s/XXXXXXX.|\(.*\)\[(.*)\]//g;
        # Normalize and recombine jtime and jdate here:
        $timestamp = "$jdate $jtime";
        # Check to see if line contains string "VPN Tunneling: Session started for user"
        if ($line =~ m/VPN Tunneling: Session started for user/) {
            $dhcpaddr = $fields[17]; # Destination IP address
            $dhcpaddr =~ s/,//g; # Remove "," trailing the IP address
            $user[$maxrow] = $userid;
            $dhcp[$maxrow] = $dhcpaddr;
            $login[$maxrow] = $timestamp;
            $logout[$maxrow] = "--";
            $close[$maxrow] = "--";
            $timeout[$maxrow] = "--";
            $maxtimeout[$maxrow] = "--";
            $maxrow++; # Move on to the next free row
        }
        elsif ($line =~ m/Logout/) {
            $dhcpaddr = $fields[10]; # DHCP IP address
            $sessionid = $fields[11]; # Session ID
            $sessionid =~ s/\(session:|\)//g; # Remove the "(session:" and ")" from the session ID
            # Search backwards for this user's most recent row that has no logout yet
            for ($currow = $maxrow - 1; $currow >= 0; $currow--) {
                if ($user[$currow] eq $userid and $logout[$currow] eq "--") {
                    $logout[$currow] = $timestamp;
                    last; # Update only the most recent matching session
                }
            }
        }
        elsif ($line =~ m/Closed connection/) {
            $dhcpaddr = $fields[11]; # DHCP IP address
            $sessiondur = $fields[13]; # Duration of session in seconds
            $bytesr = $fields[16]; # Bytes read
            $bytesw = $fields[20]; # Bytes written
            for ($currow = $maxrow - 1; $currow >= 0; $currow--) {
                if ($user[$currow] eq $userid and $close[$currow] eq "--") {
                    $close[$currow] = $timestamp;
                    last;
                }
            }
        }
        elsif ($line =~ m/Session timed out/) {
            $sessionid = $fields[13]; # Session ID
            $sessionid =~ s/\(session:|\)//g; # Remove the "(session:" and ")" from the session ID
            $lastacctime = $fields[20]; # Last accessed time
            $lastaccdate = $fields[21]; # Last accessed date
            $lastaccdate =~ s/\).//g; # Remove the ")." from the last access date
            for ($currow = $maxrow - 1; $currow >= 0; $currow--) {
                if ($user[$currow] eq $userid and $timeout[$currow] eq "--") {
                    $timeout[$currow] = $timestamp;
                    last;
                }
            }
        }
        elsif ($line =~ m/Max session timeout/) {
            $sessionid = $fields[13]; # Session ID
            $sessionid =~ s/\(session:|\)//g; # Remove the "(session:" and ")" from the session ID
            for ($currow = $maxrow - 1; $currow >= 0; $currow--) {
                if ($user[$currow] eq $userid and $maxtimeout[$currow] eq "--") {
                    $maxtimeout[$currow] = $timestamp;
                    last;
                }
            }
        }
    }
}

# Define the format, then write the output file(s) using printf
# Print the column headers: UserID, DHCP, Logon, Logout, Timeout, Maxtimeout, Close stamp
printf VPNREPORT ("%-12s %-12s %-18s %-18s %-18s %-18s %-18s\n", "UserID", "DHCP", "Logon", "Logout", "Timeout", "Maxtimeout", "Close stamp");
print VPNREPORT "-" x 120, "\n";
# Newest record at top of report
#for ($i = $maxrow - 1; $i >= 0; $i--) {
# Oldest record at top of report
for ($i = 0; $i < $maxrow; $i++) {
    printf VPNREPORT ("%-12s %-12s %-18s %-18s %-18s %-18s %-18s\n", $user[$i], $dhcp[$i], $login[$i], $logout[$i], $timeout[$i], $maxtimeout[$i], $close[$i]);
}

# Close the input and output files
close (VPNLOG);
close (VPNREPORT);
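Answer 0: (score: 0)
Rather than invoking the regex matching engine several times per line (which could become a performance problem if you are going through a lot of logs), it seems there is mileage to be had in first splitting each log line with split, using the hyphen as the separator.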
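A minimal sketch of that idea, assuming the fields are separated by " - " (hyphen with a space on each side) as in the sample lines, so the hyphens inside the date are left alone. The sub name split_log_line and the hash keys are made up for illustration, and the sample line uses the English wording that the question's script matches on.

```perl
use strict;
use warnings;

# Split one log line on " - " into at most four parts:
# prefix (syslog stamp plus Juniper stamp), node, user field, message.
# The limit of 4 keeps any " - " inside the message itself intact.
sub split_log_line {
    my ($line) = @_;
    chomp $line;
    my ($prefix, $node, $who, $message) = split / - /, $line, 4;
    return unless defined $message;    # line is not in the expected shape

    # The user field looks like "[] user1(vpn1)[vpn1]": source, then user.
    my ($source, $user) = split / /, $who, 2;
    return {
        prefix  => $prefix,
        node    => $node,
        source  => $source,
        user    => $user,
        message => $message,
    };
}

my $rec = split_log_line(
    'Nov 30 14:30:52 Juniper: 2014-11-30 14:30:22 - ive - [] user1(vpn1)[vpn1] - Logout from 100.10.10.100 (session:12345678)'
);
print "$rec->{user}: $rec->{message}\n" if $rec;
```

After the split, only the short message part needs to be tested against the key phrases, which is cheaper than running several patterns over the whole line.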
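Answer 1: (score: 0)
For example, as you read each line, I would parse the date and check for "Juniper:" (immediately discarding every line that does not meet the general criteria). You can check for the key phrases at the same time, and then do the specific processing each case needs inside the loop: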
my %starts;
while (defined(my $line = <LOGREPORT>)) {
    # One pass: check the syslog prefix, capture the Juniper timestamp, the user field and the key phrase.
    # The host name before "Juniper:" is optional because the sample lines do not show one.
    if ($line =~ s/^... (?:[\d ]\d) \d\d:\d\d:\d\d (?:\S+ )?Juniper: (\d{4}-\d\d-\d\d \d\d:\d\d:\d\d) - \S+ - \S+ (\S+) - (Primary authentication successful|Logout|Closed connection|Session timed out|Max session timeout)//) {
        my ($time, $vpn, $state) = ($1, $2, $3);
        # TODO Normalize $time here if you wish
        if ($state eq 'Primary authentication successful') {
            $starts{$vpn} = $time;
        } elsif (defined(my $start = delete $starts{$vpn})) {
            # TODO Process other information needed from the line and output one line...
            # TODO Also you can use the normalized ($time - $start) for your duration if it isn't available on the rest of the line.
        } else {
            warn "No 'Primary authentication successful' found for: $vpn\n";
        }
    }
}
If you can do it with one big regex and keep the rest to simple comparisons, it will be fast. Then again, is efficiency really critical here?
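The two TODOs about normalizing the time and computing a duration could be handled along these lines. This is only a sketch: the sub names to_epoch and format_duration are invented here, the Juniper stamp is assumed to be in "YYYY-MM-DD HH:MM:SS" form, and timegm treats it as UTC, which is good enough for differences between stamps from the same clock as long as no daylight-saving change falls in between.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Convert "YYYY-MM-DD HH:MM:SS" to epoch seconds (treated as UTC here).
# Returns undef if the stamp is not in that form.
sub to_epoch {
    my ($stamp) = @_;
    my ($y, $mo, $d, $h, $mi, $s) =
        $stamp =~ /^(\d{4})-(\d\d)-(\d\d) (\d\d):(\d\d):(\d\d)$/
        or return;
    return timegm($s, $mi, $h, $d, $mo - 1, $y);    # month is 0-based
}

# Format a number of seconds as HH:MM:SS.
sub format_duration {
    my ($seconds) = @_;
    return sprintf '%02d:%02d:%02d',
        int($seconds / 3600), int($seconds % 3600 / 60), $seconds % 60;
}

my $start = to_epoch('2014-11-30 14:09:48');
my $end   = to_epoch('2014-11-30 14:30:22');
print format_duration($end - $start), "\n";    # prints 00:20:34
```

Storing the epoch value in %starts instead of the raw string makes the subtraction in the elsif branch a plain numeric operation.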
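Answer 2: (score: 0)
For coding efficiency, I would use regular expressions to isolate the fields needed from each type of line. (I am assuming the report only needs to be generated a few times a day, so execution speed is not a concern.)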
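One way to organize that is a small table with one pattern per line type, each capturing only the fields that type needs. This is a sketch under assumptions: the message wording is taken from the phrases the question's script matches on, and the type names and the sub classify are invented for illustration.

```perl
use strict;
use warnings;

# One pattern per line type; each captures the fields that type needs.
my @patterns = (
    [ start   => qr/VPN Tunneling: Session started for user with IPv4 address ([\d.]+)/ ],
    [ logout  => qr/Logout from ([\d.]+) \(session:(\w+)\)/ ],
    [ close   => qr/Closed connection to ([\d.]+) after (\d+) seconds/ ],
    [ timeout => qr/Session timed out for \S+ \(session:(\w+)\)/ ],
    [ maxsess => qr/Max session timeout for \S+ \(session:(\w+)\)/ ],
);

# Return the line type followed by its captured fields,
# or an empty list when no pattern matches.
sub classify {
    my ($message) = @_;
    for my $entry (@patterns) {
        my ($type, $re) = @$entry;
        if (my @captures = $message =~ $re) {
            return ($type, @captures);
        }
    }
    return;
}

my ($type, @fields) = classify(
    'Closed connection to 100.10.10.1 after 1234 seconds, with 1234567 bytes read and 123456789 bytes written'
);
print "$type: @fields\n";    # prints close: 100.10.10.1 1234
```

Compared with counting space-separated fields, named patterns keep working if an extra word appears earlier in the line.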
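I would use a hash of hashes as my data structure. The keys of the first hash are the user IDs, and its values are references to a second hash. The keys of the second hash are the actions (authentication, logout, sessionClose, timeOut, maxSession, etc.), and each value is the timestamp of that action. (This consolidates all the data for a particular user into one data structure. I am also assuming that the machine has enough RAM to hold all the data in memory, and that only the timestamps of the actions are needed.)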
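A minimal sketch of such a hash of hashes, using the action names from the paragraph above. The sub record_action and the sample data are made up for illustration, and the `//` operator needs Perl 5.10 or later.

```perl
use strict;
use warnings;

my %sessions;    # user ID => { action => timestamp }

# Record the timestamp of one action for one user.
sub record_action {
    my ($sessions, $user, $action, $timestamp) = @_;
    $sessions->{$user}{$action} = $timestamp;    # inner hash autovivifies
    return;
}

record_action(\%sessions, 'user1', 'authentication', '2014-11-30 14:09:48');
record_action(\%sessions, 'user1', 'logout',         '2014-11-30 14:30:22');
record_action(\%sessions, 'user2', 'authentication', '2014-11-30 08:01:59');
record_action(\%sessions, 'user2', 'timeOut',        '2014-11-30 16:01:59');

# Print one report row per user; actions that never happened show "--".
for my $user (sort keys %sessions) {
    my $actions = $sessions{$user};
    printf "%-8s %-20s %-20s %-20s\n", $user,
        map { $actions->{$_} // '--' } qw(authentication logout timeOut);
}
```

Note that with one inner hash per user, a second session by the same user on the same day would overwrite the first. If that matters, the inner level could be keyed by session ID, or each user could hold an array of such hashes.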