Python regex solution

Time: 2018-05-17 10:01:00

Tags: python regex

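I am trying to extract a required string from the actual log set below using Python code, but I am getting the current output shown further down instead. Any suggestions would be greatly appreciated. Thanks in advance!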

Actual log set

[{'logStreamName': '2018/05/15/[$LATEST]e9c838560b4a43a8beab55c09b8cff61', 'timestamp': 1526397147847, 'message': 'START RequestId: 614b56e6-5852-11e8-a3d4-850c17ee5197 Version: $LATEST\n', 'ingestionTime': 1526397148010, 'eventId': '34039793865899822940373036294062893091972474685247389696'}, {'logStreamName': '2018/05/15/[$LATEST]e9c838560b4a43a8beab55c09b8cff61', 'timestamp': 1526397148227, 'message': "Ec2 Instances which are stopped:  Instance ID:  i-006690f105487930f Instance state:  {'Code': 80, 'Name': 'stopped'} Instance type:  t2.micro\n", 'ingestionTime': 1526397148215, 'eventId': '34039793874374106115814673088093836532895101688702042112'}]

Python code:

import re

regex1 = r"Ec2 Instances.*micro"
Strres = str(logset)  # logset is the list of log entries shown above
matches1 = re.findall(regex1, str(Strres))
print(matches1)

Required string

"Ec2 Instances which are stopped:  Instance ID:  i-006690f105487930f Instance state:  {'Code': 80, 'Name': 'stopped'} Instance type:  t2.micro" 

When I run the code above, I get the following output. I am not sure why it contains so many backslashes after the regex operation.

Current output

['Ec2 Instances which are stopped:  Instance ID:  i-0ab4e087422860879 Instance state:  {\'Code\': 80, \'Name\': \'stopped\'} Instance type:  t2.micro\\n", "Ec2 Instances which are stopped:  Instance ID:  i-03849720b1537c31c Instance state:  {\'Code\': 80, \'Name\': \'stopped\'} Instance type:  t2.micro\\n", "Ec2 Instances which are running:  Instance ID:  i-006690f105487930f Instance state:  {\'Code\': 16, \'Name\': \'running\'} Instance type:  t2.micro\\n", \'END RequestId: 7fcacec8-59aa-11e8-9ce2-fbf81c0889df\\n\', \'REPORT RequestId: 7fcacec8-59aa-11e8-9ce2-fbf81c0889df\\tDuration: 717.44 ms\\tBilled Duration: 800 ms \\tMemory Size: 128 MB\\tMax Memory Used: 39 MB\\t\\n\', \'START RequestId: 27dc0e69-59ac-11e8-805d-7134bbe0f1d1 Version: $LATEST\\n\', "Ec2 Instances which are stopped:  Instance ID:  i-006690f105487930f Instance state:  {\'Code\': 80, \'Name\': \'stopped\'} Instance type:  t2.micro\\n", "Ec2 Instances which are stopped:  Instance ID:  i-0ab4e087422860879 Instance state:  {\'Code\': 80, \'Name\': \'stopped\'} Instance type:  t2.micro\\n", "Ec2 Instances which are stopped:  Instance ID:  i-03849720b1537c31c Instance state:  {\'Code\': 80, \'Name\': \'stopped\'} Instance type:  t2.micro']

1 answer:

Answer 0: (score: 1)

As @zwer suggested, you can treat the log set as a list of dicts:

import re
logset = [{'logStreamName': '2018/05/15/[$LATEST]e9c838560b4a43a8beab55c09b8cff61', 'timestamp': 1526397147847, 'message': 'START RequestId: 614b56e6-5852-11e8-a3d4-850c17ee5197 Version: $LATEST\n', 'ingestionTime': 1526397148010, 'eventId': '34039793865899822940373036294062893091972474685247389696'}, {'logStreamName': '2018/05/15/[$LATEST]e9c838560b4a43a8beab55c09b8cff61', 'timestamp': 1526397148227, 'message': "Ec2 Instances which are stopped:  Instance ID:  i-006690f105487930f Instance state:  {'Code': 80, 'Name': 'stopped'} Instance type:  t2.micro\n", 'ingestionTime': 1526397148215, 'eventId': '34039793874374106115814673088093836532895101688702042112'}]

regex = "Ec2 Instances.*micro"
res = [e["message"] for e in logset if re.match(regex, e["message"])]
print(res)

Output:

  

["Ec2 Instances which are stopped:  Instance ID:  i-006690f105487930f Instance state:  {'Code': 80, 'Name': 'stopped'} Instance type:  t2.micro\n"]
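
As for where the backslashes come from: this is most likely an effect of calling str() on the whole list rather than of the regex itself. str() on a list uses repr() for each element, so real newlines turn into the two characters backslash and n, and the greedy .* then runs across every entry in that single flattened line. Printing the resulting list applies repr() once more, which adds the extra escaping. A minimal sketch of this, assuming a trimmed-down two-entry logset (hypothetical sample data containing only the 'message' key) and using the matched text rather than the whole message so that the trailing newline is dropped:

```python
import re

# Hypothetical two-entry sample shaped like the log set in the question;
# only the 'message' key is kept for brevity.
logset = [
    {"message": "START RequestId: 614b56e6-5852-11e8-a3d4-850c17ee5197 Version: $LATEST\n"},
    {"message": "Ec2 Instances which are stopped:  Instance ID:  i-006690f105487930f "
                "Instance state:  {'Code': 80, 'Name': 'stopped'} Instance type:  t2.micro\n"},
]

pattern = re.compile(r"Ec2 Instances.*micro")

# str() of a list uses repr() for each element, so a real newline becomes the
# two characters backslash + n and the whole log set ends up on one line.
flattened = str(logset)
print("\\n" in flattened, "\n" in flattened)

# Matching each message separately avoids that. group(0) keeps only the
# matched text; "." does not match a newline, so the trailing "\n" is dropped.
wanted = [m.group(0) for m in (pattern.match(e["message"]) for e in logset) if m]

# Printing the strings one by one (instead of the list) avoids repr() escaping.
for line in wanted:
    print(line)
```

Printing each element on its own shows the string exactly as stored, which should correspond to the required string in the question. If the list itself is printed, Python will still show quotes and escape sequences around each element.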