Parsing a custom log file in Python

Time: 2017-03-27 14:17:48

Tags: python python-2.7 python-3.x

Hi, I have a large custom IIS log file that I need to parse and store in JSON format.

 ----------------------------------------
3/27/2017 5:32:54 AM host1

Message: Membership Service Initialization Time: 2296 milliseconds

Severity: Error

----------------------------------------
----------------------------------------
3/27/2017 5:33:00 AM host1

Message: <TraceRecord xmlns="http://schemas.microsoft.com/2004/10/E2ETraceEvent/TraceRecord" Severity="Warning"><TraceIdentifier>http://msdn.microsoft.com/en-US/library/System.ServiceModel.EvaluationContextNotFound.aspx</TraceIdentifier><Description>Configuration evaluation context not found.</Description><AppDomain>/LM/W3SVC/2/ROOT/services.membership2.0-1-131350663703668815</AppDomain></TraceRecord>

Severity: Warning

----------------------------------------
----------------------------------------
3/27/2017 5:33:00 AM host2

Message: <TraceRecord xmlns="http://schemas.microsoft.com/2004/10/E2ETraceEvent/TraceRecord" Severity="Warning"><TraceIdentifier>http://msdn.microsoft.com/en-</TraceRecord>

Severity: Warning

----------------------------------------
----------------------------------------
3/27/2017 5:33:01 AM host2

Message: Oracle.DataAccess.Client.OracleException ORA-06550: line 1, column 45:
PLS-00302: component 'SP_GET_MEMBER_AND_ROLES' must be declared
ORA-06550: line 1, column 7:


Severity: Error

----------------------------------------
----------------------------------------
3/27/2017 5:45:26 AM host

Message: Membership Service Initialization Time: 1742 milliseconds

Severity: Error

----------------------------------------
----------------------------------------

Basically, I want to create JSON in this format:

{date:'',time:'',host:'',err_msg:'',severity:''}

How do I parse the file?

1 answer:

Answer 0: (score: 0)

The code below splits the file into a list of dictionaries, each of which can be converted to a JSON object.

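lst = []  # one dictionary per log record

with open('input.txt') as f:

    d = {}
    for line in f:
        # Skip separator lines and blank lines.
        if '---------' in line or line == '\n':
            continue
        line = line.rstrip('\n')

        if line.startswith('Message:'):
            d['err_msg'] = line[len('Message: '):]
            continue

        # Severity is the last field of a record: store it and start a new one.
        if line.startswith('Severity:'):
            d['severity'] = line[len('Severity: '):]
            lst.append(d)
            d = {}
            continue

        # Anything else is treated as the header line: date, time, AM/PM, host.
        parts = line.split()
        d['date'] = parts[0]
        d['time'] = parts[1] + ' ' + parts[2]
        d['host'] = parts[3]


# print() with a single argument works in both Python 2 and Python 3.
for record in lst:
    print(record)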

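Output

{'date': '3/27/2017', 'host': 'host1', 'err_msg': 'Membership Service Initialization Time: 2296 milliseconds', 'severity': 'Error', 'time': '5:32:54 AM'}

{'date': '3/27/2017', 'host': 'host1', 'err_msg': '<TraceRecord xmlns="http://schemas.microsoft.com/2004/10/E2ETraceEvent/TraceRecord" Severity="Warning"><TraceIdentifier>http://msdn.microsoft.com/en-US/library/System.ServiceModel.EvaluationContextNotFound.aspx</TraceIdentifier><Description>Configuration evaluation context not found.</Description><AppDomain>/LM/W3SVC/2/ROOT/services.membership2.0-1-131350663703668815</AppDomain></TraceRecord>', 'severity': 'Warning', 'time': '5:33:00 AM'}

{'date': '3/27/2017', 'host': 'host2', 'err_msg': '<TraceRecord xmlns="http://schemas.microsoft.com/2004/10/E2ETraceEvent/TraceRecord" Severity="Warning"><TraceIdentifier>http://msdn.microsoft.com/en-</TraceRecord>', 'severity': 'Warning', 'time': '5:33:00 AM'}

{'date': '3/27/2017', 'host': 'host2', 'err_msg': "Oracle.DataAccess.Client.OracleException ORA-06550: line 1, column 45: PLS-00302: component 'SP_GET_MEMBER_AND_ROLES' must be declared ORA-06550: line 1, column 7:", 'severity': 'Error', 'time': '5:33:01 AM'}

{'date': '3/27/2017', 'host': 'host', 'err_msg': 'Membership Service Initialization Time: 1742 milliseconds', 'severity': 'Error', 'time': '5:45:26 AM'}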
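Since the goal is to store the records as JSON, here is a minimal sketch of the conversion step using the standard json module. The helper name to_json_lines and the one-object-per-line layout are assumptions for illustration, not part of the original answer; the sample record reuses the first entry of the log.

```python
import json

def to_json_lines(records):
    """Serialize each record dict as one JSON object per line."""
    return "\n".join(json.dumps(r, sort_keys=True) for r in records)

# Sample record with the same keys as the dictionaries printed above.
sample = [{
    'date': '3/27/2017',
    'time': '5:32:54 AM',
    'host': 'host1',
    'err_msg': 'Membership Service Initialization Time: 2296 milliseconds',
    'severity': 'Error',
}]

print(to_json_lines(sample))
```

Writing one object per line keeps a large file easy to append to and to read back record by record; json.dump(records, f) would instead write the whole list as a single JSON array.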

Note: this assumes each message is on a single line.
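If messages can span several lines, as the Oracle entry in the sample log does, one possible approach is to treat every line between "Message:" and "Severity:" as part of the message. The sketch below is not from the original answer: the function name parse_records and the header regular expression are assumptions based on the sample log, and continuation lines are joined with single spaces.

```python
import re

# Header line such as "3/27/2017 5:32:54 AM host1" (format assumed from the sample log).
HEADER = re.compile(r'^(\d{1,2}/\d{1,2}/\d{4}) (\d{1,2}:\d{2}:\d{2} [AP]M) (\S+)$')

def parse_records(lines):
    """Yield one dict per record, joining multi-line messages with spaces."""
    record, in_message = {}, False
    for raw in lines:
        line = raw.strip()
        if not line or set(line) == {'-'}:
            continue  # skip blank lines and separator lines
        header = HEADER.match(line)
        if header and not in_message:
            record = dict(zip(('date', 'time', 'host'), header.groups()))
        elif line.startswith('Message:'):
            record['err_msg'] = line[len('Message:'):].strip()
            in_message = True
        elif line.startswith('Severity:'):
            record['severity'] = line[len('Severity:'):].strip()
            yield record
            record, in_message = {}, False
        elif in_message:
            record['err_msg'] += ' ' + line  # continuation of the message

sample = """\
----------------------------------------
3/27/2017 5:33:01 AM host2

Message: Oracle.DataAccess.Client.OracleException ORA-06550: line 1, column 45:
PLS-00302: component 'SP_GET_MEMBER_AND_ROLES' must be declared
ORA-06550: line 1, column 7:


Severity: Error

----------------------------------------
"""

records = list(parse_records(sample.splitlines()))
print(records[0]['err_msg'])
```

Because parse_records is a generator, it can be fed an open file object directly, so a large log is processed line by line rather than loaded into memory at once. A message line that itself begins with "Severity:" would still end the record early, so this remains a heuristic.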