我想根据时间范围使用shell脚本(bash)从日志文件中提取信息。日志文件中的一行如下所示:
172.16.0.3 - - [31/Mar/2002:19:30:41 +0200] "GET / HTTP/1.1" 200 123 "" "Mozilla/5.0 (compatible; Konqueror/2.2.2-2; Linux)"
我想提取特定数据的间隔。例如,我只需要查看上次记录的数据中最后X分钟或X天前发生的事件。我是shell脚本的新手,但我尝试使用grep命令。
答案 0 :(得分:38)
您可以使用sed
。例如:
$ sed -n '/Feb 23 13:55/,/Feb 23 14:00/p' /var/log/mail.log
Feb 23 13:55:01 messagerie postfix/smtpd[20964]: connect from localhost[127.0.0.1]
Feb 23 13:55:01 messagerie postfix/smtpd[20964]: lost connection after CONNECT from localhost[127.0.0.1]
Feb 23 13:55:01 messagerie postfix/smtpd[20964]: disconnect from localhost[127.0.0.1]
Feb 23 13:55:01 messagerie pop3d: Connection, ip=[::ffff:127.0.0.1]
...
-n
开关告诉sed不输出它读取的文件的每一行(默认行为)。
正则表达式之后的最后一个p
告诉它打印与前一个表达式匹配的行。
表达式'/pattern1/,/pattern2/'
将打印第一个模式和第二个模式之间的所有内容。在这种情况下,它将打印它在字符串Feb 23 13:55
和字符串Feb 23 14:00
之间找到的每一行。
答案 1 :(得分:28)
使用grep和正则表达式,例如,如果您需要4分钟的日志间隔:
grep "31/Mar/2002:19:3[1-5]" logfile
将在2002年3月31日19:31和19:35之间返回所有日志行。 假设您需要从2011年9月27日开始的最后5天,您可以使用以下内容:
grep "2[3-7]/Sep/2011" logfile
答案 2 :(得分:7)
让我们看一个示例文件(名为 logFile ),我做得有点短。 比如,你想在这个文件中获得最后5分钟的登录信息:
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
### lines below are what you want (5 mins till the last record)
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:30:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:30:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:30:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:30:41 +0200] "GET
这是解决方案:
# this variable you could customize, important is convert to seconds.
# e.g 5days=$((5*24*3600))
x=$((5*60)) #here we take 5 mins as example
# this line get the timestamp in seconds of last line of your logfile
last=$(tail -n1 logFile|awk -F'[][]' '{ gsub(/\//," ",$2); sub(/:/," ",$2); "date +%s -d \""$2"\""|getline d; print d;}' )
#this awk will give you lines you needs:
awk -F'[][]' -v last=$last -v x=$x '{ gsub(/\//," ",$2); sub(/:/," ",$2); "date +%s -d \""$2"\""|getline d; if (last-d<=x)print $0 }' logFile
输出
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:30:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:30:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:30:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:30:41 +0200 "GET
修改强>
您可能会注意到在输出中[和]消失了。如果您确实需要它们,可以更改上一个awk行print $0
- &gt; print $1 "[" $2 "]" $3
答案 3 :(得分:3)
我使用此命令查找特定事件“DHCPACK
”的最后5分钟日志,请尝试以下操作:
$ grep "DHCPACK" /var/log/messages | grep "$(date +%h\ %d) [$(date --date='5 min ago' %H)-$(date +%H)]:*:*"
答案 4 :(得分:-1)
您可以使用它来获取当前和日志时间:
#!/bin/bash
log="log_file_name"
while read line
do
current_hours=`date | awk 'BEGIN{FS="[ :]+"}; {print $4}'`
current_minutes=`date | awk 'BEGIN{FS="[ :]+"}; {print $5}'`
current_seconds=`date | awk 'BEGIN{FS="[ :]+"}; {print $6}'`
log_file_hours=`echo $line | awk 'BEGIN{FS="[ [/:]+"}; {print $7}'`
log_file_minutes=`echo $line | awk 'BEGIN{FS="[ [/:]+"}; {print $8}'`
log_file_seconds=`echo $line | awk 'BEGIN{FS="[ [/:]+"}; {print $9}'`
done < $log
并比较log_file_*
和current_*
变量。