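I am trying to extract a required string from an actual log set using Python, but I am getting the output shown under "Current output" below. Any suggestions would be much appreciated. Thanks in advance!
Actual log set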
[{'logStreamName': '2018/05/15/[$LATEST]e9c838560b4a43a8beab55c09b8cff61', 'timestamp': 1526397147847, 'message': 'START RequestId: 614b56e6-5852-11e8-a3d4-850c17ee5197 Version: $LATEST\n', 'ingestionTime': 1526397148010, 'eventId': '34039793865899822940373036294062893091972474685247389696'}, {'logStreamName': '2018/05/15/[$LATEST]e9c838560b4a43a8beab55c09b8cff61', 'timestamp': 1526397148227, 'message': "Ec2 Instances which are stopped: Instance ID: i-006690f105487930f Instance state: {'Code': 80, 'Name': 'stopped'} Instance type: t2.micro\n", 'ingestionTime': 1526397148215, 'eventId': '34039793874374106115814673088093836532895101688702042112'}]
Python code:
regex1 = r"Ec2 Instances.*micro"
Strres = str(logset) # Logset is a list which has logs
matches1 = re.findall(regex1,str(Strres))
print(matches1)
Required string
"Ec2 Instances which are stopped: Instance ID: i-006690f105487930f Instance state: {'Code': 80, 'Name': 'stopped'} Instance type: t2.micro"
When I run the code above, I get the following output. I am not sure why it contains so many backslashes after the regex operation.
Current output
['Ec2 Instances which are stopped: Instance ID: i-0ab4e087422860879 Instance state: {\'Code\': 80, \'Name\': \'stopped\'} Instance type: t2.micro\\n", "Ec2 Instances which are stopped: Instance ID: i-03849720b1537c31c Instance state: {\'Code\': 80, \'Name\': \'stopped\'} Instance type: t2.micro\\n", "Ec2 Instances which are running: Instance ID: i-006690f105487930f Instance state: {\'Code\': 16, \'Name\': \'running\'} Instance type: t2.micro\\n", \'END RequestId: 7fcacec8-59aa-11e8-9ce2-fbf81c0889df\\n\', \'REPORT RequestId: 7fcacec8-59aa-11e8-9ce2-fbf81c0889df\\tDuration: 717.44 ms\\tBilled Duration: 800 ms \\tMemory Size: 128 MB\\tMax Memory Used: 39 MB\\t\\n\', \'START RequestId: 27dc0e69-59ac-11e8-805d-7134bbe0f1d1 Version: $LATEST\\n\', "Ec2 Instances which are stopped: Instance ID: i-006690f105487930f Instance state: {\'Code\': 80, \'Name\': \'stopped\'} Instance type: t2.micro\\n", "Ec2 Instances which are stopped: Instance ID: i-0ab4e087422860879 Instance state: {\'Code\': 80, \'Name\': \'stopped\'} Instance type: t2.micro\\n", "Ec2 Instances which are stopped: Instance ID: i-03849720b1537c31c Instance state: {\'Code\': 80, \'Name\': \'stopped\'} Instance type: t2.micro']
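Answer 0 (score: 1)
As @zwer suggested, you can treat the log set as a list of dicts and match each "message" value on its own instead of converting the whole list to a string: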
import re
logset = [{'logStreamName': '2018/05/15/[$LATEST]e9c838560b4a43a8beab55c09b8cff61', 'timestamp': 1526397147847, 'message': 'START RequestId: 614b56e6-5852-11e8-a3d4-850c17ee5197 Version: $LATEST\n', 'ingestionTime': 1526397148010, 'eventId': '34039793865899822940373036294062893091972474685247389696'}, {'logStreamName': '2018/05/15/[$LATEST]e9c838560b4a43a8beab55c09b8cff61', 'timestamp': 1526397148227, 'message': "Ec2 Instances which are stopped: Instance ID: i-006690f105487930f Instance state: {'Code': 80, 'Name': 'stopped'} Instance type: t2.micro\n", 'ingestionTime': 1526397148215, 'eventId': '34039793874374106115814673088093836532895101688702042112'}]
regex = "Ec2 Instances.*micro"
res = [e["message"] for e in logset if re.match(regex, e["message"])]
print(res)
Output:
["Ec2 Instances which are stopped: Instance ID: i-006690f105487930f Instance state: {'Code': 80, 'Name': 'stopped'} Instance type: t2.micro\n"]
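If the goal is exactly the required string, without the trailing "\n", one option is to keep only the matched text from each message rather than the whole message. The sketch below is a rough illustration, not the only way to do it: it uses a shortened stand-in for logset (only the "message" key, with instance IDs taken from the question), and the names pattern, matches, flattened and greedy are mine. Its second half reproduces the original str(logset) approach to show where the backslashes come from.

```python
import re

# Shortened stand-in for the log set from the question (assumption: only
# the "message" key matters here; the other keys are left out for brevity).
logset = [
    {"message": "START RequestId: 614b56e6-5852-11e8-a3d4-850c17ee5197 Version: $LATEST\n"},
    {"message": "Ec2 Instances which are stopped: Instance ID: i-006690f105487930f "
                "Instance state: {'Code': 80, 'Name': 'stopped'} Instance type: t2.micro\n"},
    {"message": "Ec2 Instances which are stopped: Instance ID: i-0ab4e087422860879 "
                "Instance state: {'Code': 80, 'Name': 'stopped'} Instance type: t2.micro\n"},
]

pattern = re.compile(r"Ec2 Instances.*micro")

# Search each message separately and keep only the matched text,
# which stops at "micro" and so leaves out the trailing newline.
matches = []
for event in logset:
    m = pattern.search(event["message"])
    if m:
        matches.append(m.group(0))

# Printing an element shows the text itself; printing the whole list
# would show each element's repr() instead.
for line in matches:
    print(line)

# The original approach: str(logset) is itself a repr, so each newline
# becomes the two characters backslash + "n" and the quotes become part
# of the text. The greedy ".*" then runs from the first "Ec2 Instances"
# to the last "micro", giving one long match.
flattened = str(logset)
greedy = re.findall(r"Ec2 Instances.*micro", flattened)

# Printing that list escapes the backslashes and quotes a second time.
print(greedy)
```

The first loop prints one clean line per matching log event. The final print shows a single element full of \' and \\n, which is the same effect as in the question: nothing is wrong with the regex engine, the escapes come from turning the list into a string and then printing a list of strings.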