Hi, I have a large custom IIS log file that I need to parse and store in JSON format.
----------------------------------------
3/27/2017 5:32:54 AM host1
Message: Membership Service Initialization Time: 2296 milliseconds
Severity: Error
----------------------------------------
----------------------------------------
3/27/2017 5:33:00 AM host1
Message: <TraceRecord xmlns="http://schemas.microsoft.com/2004/10/E2ETraceEvent/TraceRecord" Severity="Warning"><TraceIdentifier>http://msdn.microsoft.com/en-US/library/System.ServiceModel.EvaluationContextNotFound.aspx</TraceIdentifier><Description>Configuration evaluation context not found.</Description><AppDomain>/LM/W3SVC/2/ROOT/services.membership2.0-1-131350663703668815</AppDomain></TraceRecord>
Severity: Warning
----------------------------------------
----------------------------------------
3/27/2017 5:33:00 AM host2
Message: <TraceRecord xmlns="http://schemas.microsoft.com/2004/10/E2ETraceEvent/TraceRecord" Severity="Warning"><TraceIdentifier>http://msdn.microsoft.com/en-</TraceRecord>
Severity: Warning
----------------------------------------
----------------------------------------
3/27/2017 5:33:01 AM host2
Message: Oracle.DataAccess.Client.OracleException ORA-06550: line 1, column 45:
PLS-00302: component 'SP_GET_MEMBER_AND_ROLES' must be declared
ORA-06550: line 1, column 7:
Severity: Error
----------------------------------------
----------------------------------------
3/27/2017 5:45:26 AM host
Message: Membership Service Initialization Time: 1742 milliseconds
Severity: Error
----------------------------------------
----------------------------------------
Basically, I want to create JSON in this format:
{date:'',time:'',host:'',err_msg:'',severity:''}
How can I split up the file?
Answer 0 (score: 0)
The code below splits the file into a list of dictionaries, each of which can be converted to a JSON object.
lst = []
with open('input.txt') as f:
    d = {}
    for line in f:
        # Skip separator lines and blank lines.
        if '---------' in line or '\n' == line:
            continue
        line = line.rstrip('\n')
        if line.startswith('Message:'):
            d['err_msg'] = line[len('Message: '):]
            continue
        if line.startswith('Severity:'):
            # Severity is the last field of an entry: store it and start a new dict.
            d['severity'] = line[len('Severity: '):]
            lst.append(d)
            d = {}
            continue
        # Anything else is the header line: date, time, AM/PM, host.
        line = line.split()
        d['date'] = line[0]
        d['time'] = line[1] + ' ' + line[2]
        d['host'] = line[3]

for l in lst:
    print(l)
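Output (key order may vary with the Python version):
{'date': '3/27/2017', 'host': 'host1', 'err_msg': 'Membership Service Initialization Time: 2296 milliseconds', 'severity': 'Error', 'time': '5:32:54 AM'}
{'date': '3/27/2017', 'host': 'host1', 'err_msg': '<TraceRecord xmlns="http://schemas.microsoft.com/2004/10/E2ETraceEvent/TraceRecord" Severity="Warning"><TraceIdentifier>http://msdn.microsoft.com/en-US/library/System.ServiceModel.EvaluationContextNotFound.aspx</TraceIdentifier><Description>Configuration evaluation context not found.</Description><AppDomain>/LM/W3SVC/2/ROOT/services.membership2.0-1-131350663703668815</AppDomain></TraceRecord>', 'severity': 'Warning', 'time': '5:33:00 AM'}
{'date': '3/27/2017', 'host': 'host2', 'err_msg': '<TraceRecord xmlns="http://schemas.microsoft.com/2004/10/E2ETraceEvent/TraceRecord" Severity="Warning"><TraceIdentifier>http://msdn.microsoft.com/en-</TraceRecord>', 'severity': 'Warning', 'time': '5:33:00 AM'}
{'date': '3/27/2017', 'host': 'host2', 'err_msg': "Oracle.DataAccess.Client.OracleException ORA-06550: line 1, column 45: PLS-00302: component 'SP_GET_MEMBER_AND_ROLES' must be declared ORA-06550: line 1, column 7:", 'severity': 'Error', 'time': '5:33:01 AM'}
{'date': '3/27/2017', 'host': 'host', 'err_msg': 'Membership Service Initialization Time: 1742 milliseconds', 'severity': 'Error', 'time': '5:45:26 AM'}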
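Since the goal is JSON, each dictionary can be handed to json.dumps from the standard library. A minimal sketch, assuming lst has the shape built by the parsing loop; the single record here is copied from the first log entry for illustration, and the name json_lines is just a placeholder:

```python
import json

# One record shaped like the dictionaries built by the parsing loop.
lst = [
    {
        'date': '3/27/2017',
        'time': '5:32:54 AM',
        'host': 'host1',
        'err_msg': 'Membership Service Initialization Time: 2296 milliseconds',
        'severity': 'Error',
    },
]

# Serialize each dict as one JSON object per line.
json_lines = [json.dumps(record) for record in lst]
for text in json_lines:
    print(text)
```

Writing one object per line keeps the output easy to append to and to read back line by line, which may help with a large log; json.dump(lst, some_file) would instead write the whole list as a single JSON array.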
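Note: the parsing code above assumes each message is on a single line. If a message wraps across several lines in the file, the continuation lines would be treated as a date/time/host line and overwrite those fields.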
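If messages can span several lines, one possible variation is to treat every line between "Message:" and "Severity:" as part of the message. The sketch below is an illustration, not a tested solution for the real file: the function name parse_log and the SEPARATOR constant are made up here, and it assumes that no message line itself starts with "Severity:" or with a run of dashes.

```python
import json

SEPARATOR = '-' * 10  # separator lines in the sample are runs of dashes


def parse_log(lines):
    """Yield one dict per log entry; continuation lines extend err_msg."""
    entry = None
    for raw in lines:
        line = raw.rstrip('\n')
        if not line.strip() or line.startswith(SEPARATOR):
            continue
        if line.startswith('Severity:'):
            # Severity closes the entry.
            if entry is not None:
                entry['severity'] = line[len('Severity:'):].strip()
                yield entry
            entry = None
        elif line.startswith('Message:'):
            if entry is not None:
                entry['err_msg'] = line[len('Message:'):].strip()
        elif entry is not None and 'err_msg' in entry:
            # Still inside a message: join the continuation line with a space.
            entry['err_msg'] += ' ' + line.strip()
        else:
            # Header line: "<date> <time> <AM/PM> <host>"
            parts = line.split()
            if len(parts) >= 4:
                entry = {
                    'date': parts[0],
                    'time': parts[1] + ' ' + parts[2],
                    'host': parts[3],
                }


sample = """\
----------------------------------------
3/27/2017 5:33:01 AM host2
Message: Oracle.DataAccess.Client.OracleException ORA-06550: line 1, column 45:
PLS-00302: component 'SP_GET_MEMBER_AND_ROLES' must be declared
ORA-06550: line 1, column 7:
Severity: Error
----------------------------------------
"""

records = list(parse_log(sample.splitlines()))
for record in records:
    print(json.dumps(record))
```

Because parse_log is a generator that accepts any iterable of lines, an open file object can be passed to it directly, so a large log is read one line at a time rather than loaded into memory.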