I am trying to write a Python script that outputs a specific time range from a log file (similar to the sed command listed below):
sed -n '/2017-01-26 18:00/ , /2017-01-26 18:02/p' /logfile.log
2017-01-26 18:00:00
2017-01-26 18:01:01
2017-01-26 18:01:02
2017-01-26 18:01:09
2017-01-26 18:01:09
2017-01-26 18:01:11
2017-01-26 18:02:01
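My Python script searches for fixed strings instead of a range, unlike the sed command above (I suspect I am doing something wrong, but I cannot find the error; please check the code below):
Please point out where I should change the code, and suggest improvements. Thanks!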
#!/usr/bin/python
import datetime, time, os, sys, re
from datetime import timedelta
counter = 0
avgtime = 0
now = datetime.datetime.utcnow()
pasttime = now - datetime.timedelta(minutes=5)
timestamp = now.strftime("%y%m%d")
fiveago = now - timedelta(minutes=5,seconds=now.second)
current = now.strftime("%Y-%m-%d %H:%M")
pasttime = fiveago.strftime("%Y-%m-%d %H:%M")
pattern = str(current + "|" + pasttime)
f = open('/logs/' + sys.argv[1] + '/' + 'u_ex' + timestamp + '.log', 'r')
for line in f:
    if "POST" in line:
        if re.search(pattern, line, re.IGNORECASE):
            date = line.split(' ')[1]
            time = line.split(' ')[14]
            avgtime += int(time)
            counter += 1
            print(date,time)
f.close()
print(pattern)
print("Total amount of time: ",counter)
print("Total scan time: ",avgtime)
print("Average scan time: ",avgtime / counter)
Answer 0 (score: 0)
I don't see what the problem is, but you asked for an equivalent of your sed command, so here is a direct translation into Python:
import sys, re
use = False
for line in open('/logfile.log'):
    if re.search('2017-01-26 18:00', line): use = True
    if use: sys.stdout.write(line)
    if re.search('2017-01-26 18:02', line): use = False
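The same idea can be wrapped in a small reusable function, which makes it easier to try on a few lines before pointing it at a real log. This is only a sketch: the name `select_range` and the sample lines are made up for illustration. It also mimics one detail of sed address ranges, namely that the end pattern is not tested on the line that opened the range.

```python
import re


def select_range(lines, start, end):
    """Yield lines from a match of `start` through the next match of `end`, inclusive."""
    use = False
    for line in lines:
        if not use and re.search(start, line):
            use = True      # start marker seen: begin emitting
            yield line
            continue        # like sed, do not test the end pattern on the start line
        if use:
            yield line
            if re.search(end, line):
                use = False  # end marker seen: stop after this line


# Hypothetical sample lines, not taken from a real log.
sample = [
    "2017-01-26 17:59:58 a\n",
    "2017-01-26 18:00:00 b\n",
    "2017-01-26 18:01:09 c\n",
    "2017-01-26 18:02:01 d\n",
    "2017-01-26 18:02:05 e\n",
]
selected = list(select_range(sample, "2017-01-26 18:00", "2017-01-26 18:02"))
print("".join(selected), end="")
```

Note that the range closes at the first line containing 18:02, so the later 18:02:05 line is not selected. That is how a pattern range behaves, and it may or may not be what you want for a log.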
Answer 1 (score: 0)
IIUC, you need to select the log entries that fall between two timestamps.
import datetime, time, os, sys, re
from datetime import timedelta
counter = 0
avgtime = 0
now = datetime.datetime.utcnow()
timestamp = now.strftime("%y%m%d")
fiveago = now - timedelta(minutes=5,seconds=now.second)
current = now.strftime("%Y-%m-%d %H:%M")
pasttime = fiveago.strftime("%Y-%m-%d %H:%M")
pattern = str(current + "|" + pasttime)
print("Start time: ", pasttime, "End time: ", current, "\n\n")
filename = '/logs/' + sys.argv[1] + '/' + 'u_ex' + timestamp + '.log'
with open(filename, 'r') as f:
    contents = f.readlines()
for line in contents:
    if "POST" in line:
        # Field positions follow the sample input below: POST <date> <time> <text>.
        # Adjust the indexes to match your real log format.
        date = line.split(' ')[1]
        time = line.split(' ')[2]
        logdatetime = date + " " + time
        # Zero-padded timestamps sort correctly when compared as plain strings.
        if logdatetime <= current and logdatetime >= pasttime:
            print("yes, within the interval : ", logdatetime)
Output:
Start time: 2017-01-26 20:23 End time: 2017-01-26 20:28
yes, within the interval : 2017-01-26 20:23:20
yes, within the interval : 2017-01-26 20:23:01
yes, within the interval : 2017-01-26 20:23:02
for this input:
POST 2017-01-26 20:23:20 XX
POST 2017-01-26 20:23:01 XC
POST 2017-01-26 20:23:02 CV
POST 2017-01-26 20:20:09 DAF
POST 2017-01-26 20:20:09 fASF
POST 2017-01-26 20:20:11 Sfas
POST 2017-01-26 20:20:01 fsAf
POST 2017-01-26 20:20:02 asf
POST 2017-01-26 20:20:03 asf
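Comparing timestamps as strings works only while both sides use the same zero-padded format. A variant that parses them into `datetime` objects first may be more robust. The following is a minimal sketch, assuming the `POST <date> <time> <text>` layout of the sample input above; the function name `posts_in_window` is hypothetical, and the end of the window is fixed so the example is reproducible.

```python
from datetime import datetime, timedelta


def posts_in_window(lines, start, end):
    """Return (timestamp, line) pairs for POST lines with start <= timestamp <= end."""
    hits = []
    for line in lines:
        fields = line.split()
        # Assumed layout: POST <date> <time> <anything else>
        if len(fields) < 3 or fields[0] != "POST":
            continue
        try:
            stamp = datetime.strptime(fields[1] + " " + fields[2], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # skip lines whose date/time fields do not parse
        if start <= stamp <= end:
            hits.append((stamp, line))
    return hits


end = datetime(2017, 1, 26, 20, 28)   # fixed instead of utcnow() for reproducibility
start = end - timedelta(minutes=5)
sample = [
    "POST 2017-01-26 20:23:20 XX",
    "POST 2017-01-26 20:23:01 XC",
    "POST 2017-01-26 20:20:09 DAF",
]
for stamp, _ in posts_in_window(sample, start, end):
    print("within the interval:", stamp)
```

In a real script, `end` would come from `datetime.utcnow()` (or whichever clock the log uses), and the field indexes would need to match the actual log format.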
Answer 2 (score: 0)
The problem with your solution is that you only look for the two "edge times". In your 3-minute time range example, these are 18:00 and 18:02.
What the sed command does is:
sed -n '/2017-01-26 18:00/ , /2017-01-26 18:02/p' /logfile.log
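- It prints nothing by default (-n).
- As soon as it finds a line matching 2017-01-26 18:00, it starts printing all lines.
- As soon as it finds a line matching 2017-01-26 18:02, it stops printing (that line is still printed).
In your example, your regex pattern is: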
2017-01-26 18:00|2017-01-26 18:02
This only finds lines stamped 18:00 or 18:02. So you can do one of the following:
Extend your regex so that it also matches the minutes in between:
pattern = "|".join([(now-timedelta(minutes=i)).strftime("%Y-%m-%d %H:%M") for i in range(6)])
This will produce, for example:
'2016-01-26 18:00|2016-01-26 17:59|2016-01-26 17:58|2016-01-26 17:57|2016-01-26 17:56|2016-01-26 17:55'
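To see which lines such a pattern actually selects, here is a small self-contained sketch. The value of `now` is fixed and the sample lines are invented purely for illustration; `re.escape` is added as a precaution and is not part of the one-liner above.

```python
import re
from datetime import datetime, timedelta

# Fixed instead of utcnow() so the result is reproducible (illustrative value).
now = datetime(2017, 1, 26, 18, 0)

# One alternative per minute: now, now-1, ..., now-5.
minutes = [(now - timedelta(minutes=i)).strftime("%Y-%m-%d %H:%M") for i in range(6)]
pattern = re.compile("|".join(re.escape(m) for m in minutes))

# Hypothetical log lines.
sample = [
    "2017-01-26 17:54:59 POST /a",
    "2017-01-26 17:55:00 POST /b",
    "2017-01-26 17:58:30 POST /c",
    "2017-01-26 18:00:59 POST /d",
    "2017-01-26 18:01:00 POST /e",
]
matched = [line for line in sample if pattern.search(line)]
print(matched)
```

With these values the pattern covers the six minutes from 17:55 through 18:00, so only the /b, /c and /d lines are kept. Because each alternative is a minute prefix, every second within those minutes matches.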