How to extract and print the date, time, etc. in Python 2.7

Time: 2016-05-02 09:36:17

Tags: python python-2.7 logfiles

Here is the code:

#!/usr/bin/env python

# Import datetime and re
from datetime import datetime
import re

# Create two time objects, dt1 and dt2, for the lower and upper limits
dt1 = datetime.strptime("01:00:00", "%H:%M:%S").time()

dt2 = datetime.strptime("04:59:59", "%H:%M:%S").time()

# Compile the regular expressions
init_re = re.compile(r'(INIT)')

time_re = re.compile(r'(\d+:\d+:\d+)')

# Read test.log line by line
for line in open("test.log", "r"):

    match = time_re.search(line)  # Search for a time in each line
    if match:
        matchtime = match.group(1)
        dt_match = datetime.strptime(matchtime, '%H:%M:%S').time()
        # Check that the time falls within the limits
        if dt_match >= dt1 and dt_match <= dt2:
            match1 = init_re.search(line)  # Search for INIT
            if match1:
                matchinit = match1.group(0)
                print match.string.strip()

Here is part of the log file:

  

2015-12-15 00:51:01,904 INFO restser.py 113 [INIT] [netkv_restser:peek] [req_id:f0aa7ab5-6192-4231-93cd-82a53936a072] [request:{u'key_space': u'martech_user_index', u'table_name': u'nettopic', u'key': 9569}]

I want output like this:

[Date: ] [Time: ] [INIT] [netkv_restser: ] [req_id: ] 

Example:

[Date:2015-12-15 ] [Time:00:51:01,904 ] [INIT] [netkv_restser:peek ] [req_id:f0aa7ab5-6192-4231-93cd-82a53936a072 ] 

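Note: If you are kind enough to edit the code, please be kind enough to provide a solution as well. I don't mean to sound rude, but it bothers me.

Note: I am using Python 2.7.6.

2 answers:

Answer 0: (score: 0)

You can use a regular expression to parse each log line. How well this works depends on how fixed the structure of the log file is, but it works for the input line you provided.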

import re

# s holds one line of the log file
template = '[Date:%s] [Time:%s] [%s] [netkv_restser:%s] [req_id:%s]'
details = re.search(r'(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2},\d{3}).*\[(INIT)\].*\[netkv_restser: *(.*?)\].*\[req_id: *(.*?)\]', s).groups()
output = template % details

Output

'[Date:2015-12-15] [Time:00:51:01,904] [INIT] [netkv_restser:peek] [req_id:f0aa7ab5-6192-4231-93cd-82a53936a072]'
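
The same pattern can also be written with named groups, which may make it easier to see which bracketed field each capture corresponds to. This is only a minimal sketch, assuming the log line format shown in the question; the names LOG_PATTERN, TEMPLATE and format_line are hypothetical and do not come from the original answer.

```python
import re

# Hypothetical names, chosen for illustration only.
# Assumes each line starts with "YYYY-MM-DD HH:MM:SS,mmm" as in the sample.
LOG_PATTERN = re.compile(
    r'(?P<date>\d{4}-\d{2}-\d{2}) '
    r'(?P<time>\d{2}:\d{2}:\d{2},\d{3})'
    r'.*\[(?P<tag>INIT)\]'
    r'.*\[netkv_restser: *(?P<action>.*?)\]'
    r'.*\[req_id: *(?P<req_id>.*?)\]'
)

TEMPLATE = ('[Date:%(date)s] [Time:%(time)s] [%(tag)s] '
            '[netkv_restser:%(action)s] [req_id:%(req_id)s]')


def format_line(line):
    """Return the reformatted line, or None if the line does not match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    # groupdict() maps each group name to its captured text
    return TEMPLATE % match.groupdict()
```

Returning None for lines that do not match lets the caller decide whether to skip them or report them.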

Of course, you can compile the regular expression and use it inside the loop over the lines of the file

pattern = re.compile(r'(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2},\d{3}).*\[(INIT)\].*\[netkv_restser: *(.*?)\].*\[req_id: *(.*?)\]')
template = '[Date:%s] [Time:%s] [%s] [netkv_restser:%s] [req_id:%s]'
for line in open("test.log", "r"):
    match = pattern.search(line)
    if match:  # skip lines that do not match the pattern
        print template % match.groups()

Answer 1: (score: 0)

I am posting this answer because it works well!! Special thanks to Mr. Francisco and Mr. Ebrahim.

#!/usr/bin/env python

#Import the datetime
from datetime import datetime
import re

#Create two time objects, dt1 and dt2, for limit 1 and limit 2 respectively
dt1 = datetime.strptime("00:00:00", "%H:%M:%S").time()

dt2 = datetime.strptime("03:59:59", "%H:%M:%S").time()

#Create a compiler for regular expression
time_re = re.compile(r'(\d+:\d+:\d+)')

pattern = re.compile(r'(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2},\d{3}).*\[(INIT)\].*\[netkv_restser: *(.*?)\].*\[req_id: *(.*?)\]')

#Set a template for the output
template = '[Date:%s] [Time:%s] [%s] [netkv_restser:%s] [req_id:%s]'

for line in open("sample.log", "r"):

    match = time_re.search(line) #Search time format for each line
    if not match:
        continue #Skip lines that contain no time
    matchtime = match.group(1)
    #Time format match
    dt_match = datetime.strptime(matchtime, '%H:%M:%S').time()
    if dt_match >= dt1 and dt_match <= dt2: #time limit is set
        #print value
        detail = re.match(pattern, line)
        if detail:
            print template % detail.groups()
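
The script above runs two regular expressions on every line. As a possible variation, the time window check could reuse the time already captured by the main pattern. The sketch below assumes the same log format; the function name filter_lines and its start and end parameters are hypothetical. Unlike the script above, it uses an exclusive upper bound of 04:00:00 so that times such as 03:59:59,500 are still included.

```python
from datetime import datetime, time
import re

PATTERN = re.compile(
    r'(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2},\d{3})'
    r'.*\[(INIT)\].*\[netkv_restser: *(.*?)\].*\[req_id: *(.*?)\]'
)
TEMPLATE = '[Date:%s] [Time:%s] [%s] [netkv_restser:%s] [req_id:%s]'


def filter_lines(lines, start=time(0, 0, 0), end=time(4, 0, 0)):
    """Yield formatted INIT lines whose time satisfies start <= t < end.

    filter_lines, start and end are illustrative names, not part of
    the original answer.
    """
    for line in lines:
        match = PATTERN.match(line)
        if match is None:
            continue  # not an INIT line in the expected format
        # %f parses the three millisecond digits after the comma
        stamp = datetime.strptime(match.group(2), '%H:%M:%S,%f').time()
        if start <= stamp < end:
            yield TEMPLATE % match.groups()
```

It could be driven by iterating over an open file, for example by passing open("sample.log", "r") as the lines argument and printing each yielded string.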