Thanks in advance for your help.
In my scenario, CloudWatch multiline logs need to be shipped to the Amazon Elasticsearch Service: ECS --awslogs--> CloudWatch --Lambda--> ES domain (this is the basic flow, though I am very open to changing how data is shipped from CW to ES).
I was able to solve the multi-line issue using multi_line_start_pattern, BUT the main issue I am experiencing now is that my logs are in ODL format (the following format):
[yyyy-mm-ddThh:mm:ss.SSS-Z][ProductName-Version][Log Level]
[Message ID][LoggerName][Key Value Pairs][[
Message]]
AND I would like to parse and tokenize log events before storing them in ES (rather than storing the complete log line). For example, this event:
[2018-05-31T11:08:49.148-0400] [glassfish 4.1] [INFO] [] [] [tid: _ThreadID=43 _ThreadName=Thread-8] [timeMillis: 1527692929148] [levelValue: 800] [[
[] INFO : (DummyApplicationFunctionJPADAO) EntityManagerFactory located under resource lookup name [null], resource name=AuthorizationPU]]
needs to be parsed and tokenized as:
timestamp 2018-05-31T11:08:49.148-0400
ProductName-Version glassfish 4.1
LogLevel INFO
MessageID
LoggerName
KeyValuePairs tid: _ThreadID=43 _ThreadName=Thread-8
Message [] INFO : (DummyApplicationFunctionJPADAO)
EntityManagerFactory located under resource lookup name
[null], resource name=AuthorizationPU
In the above, the key-value pairs repeat and are variable; for simplicity I can store them all as one long string.
From what I have gathered about CloudWatch, subscription filter pattern regex support is very limited, and I am really not sure how to fit the above pattern into it. As for a Lambda function that pushes the data to ES, I have not seen AWS docs or examples that use Lambda to parse log events before pushing them to ES.
I would appreciate it if someone could advise on the best place to parse CW logs before they get into ES: the subscription filter pattern, the Lambda function, or any other way.
Thank you.
Answer 0 (score: 0)
In my view, the best option is what you suggested: a Lambda triggered by CloudWatch Logs that reformats the logged data into your ES-preferred format and then pushes it to ES.
You will need to subscribe this Lambda to your CloudWatch Logs. You can do that from the Lambda console or the CloudWatch console (https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/Subscriptions.html).
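If you would rather script the subscription, a minimal sketch with the Node AWS SDK is below; the log group name, filter name, and function ARN are placeholders for your own resources, and CloudWatch Logs must also be granted permission to invoke the Lambda:

const AWS = require('aws-sdk');
const cwl = new AWS.CloudWatchLogs({ region: 'us-east-1' });

// All names/ARNs below are placeholders; replace them with your own.
cwl.putSubscriptionFilter({
  logGroupName: '/ecs/my-service',   // the log group your ECS tasks write to
  filterName: 'ship-to-es',
  filterPattern: '',                 // empty pattern forwards every event
  destinationArn: 'arn:aws:lambda:us-east-1:123456789012:function:cw-to-es',
}, (err, data) => {
  if (err) console.error('Failed to create subscription', err);
  else console.log('Subscription created', data);
});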
The Lambda's event payload will be { "awslogs": { "data": "encoded-logs" } }, where encoded-logs is Base64-encoded gzipped JSON.
For example, the sample event (https://docs.aws.amazon.com/lambda/latest/dg/eventsources.html#eventsources-cloudwatch-logs) can be decoded in Node with, for instance:
const zlib = require('zlib');

// event.awslogs.data is a Base64 string wrapping gzipped JSON
const data = event.awslogs.data;
const gzipped = Buffer.from(data, 'base64'); // Base64 -> gzipped bytes
const json = zlib.gunzipSync(gzipped);       // gzipped bytes -> JSON string
const logs = JSON.parse(json);               // JSON string -> object
console.log(logs);
/*
{ messageType: 'DATA_MESSAGE',
owner: '123456789123',
logGroup: 'testLogGroup',
logStream: 'testLogStream',
subscriptionFilters: [ 'testFilter' ],
logEvents:
[ { id: 'eventId1',
timestamp: 1440442987000,
message: '[ERROR] First test message' },
{ id: 'eventId2',
timestamp: 1440442987001,
message: '[ERROR] Second test message' } ] }
*/
Given what you have outlined, you will want to extract the logEvents array and parse each event's message into fields. I would be happy to help with that too if you need it (but I would need to know which language you are writing the Lambda in; there are libraries for tokenizing ODL, so hopefully it will not be too hard).
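As a rough, untested illustration of that parsing step in Node (field names taken from your question; the regex assumes five leading bracketed fields, a variable run of key-value brackets, and a trailing [[ ... ]] message block):

// Sketch of an ODL tokenizer for the layout in the question:
// [ts] [product-version] [level] [msgId] [logger] [kv]... [[ message ]]
function parseOdl(line) {
  const match = line.match(
    /^\[(.*?)\] \[(.*?)\] \[(.*?)\] \[(.*?)\] \[(.*?)\] (.*?)\[\[([\s\S]*?)\]\]$/
  );
  if (!match) return null; // not in the expected ODL shape
  const [, timestamp, productVersion, logLevel, messageId, loggerName, kv, message] = match;
  return {
    timestamp,
    productVersion,
    logLevel,
    messageId,
    loggerName,
    keyValuePairs: kv.trim(), // kept as one long string, as you suggested
    message: message.trim(),
  };
}

Calling parseOdl(logEvent.message) for each entry of logEvents would then give you one structured document per event.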
At that point you can POST those new records directly into your AWS ES domain. The S3-to-ES guide outlines how to do something quite similar in Python: https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/es-aws-integrations.html#es-aws-integrations-s3-lambda-es
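In Node, a bare-bones version of that POST could look like the sketch below; the domain endpoint and index name are placeholders, and it assumes the domain's access policy allows this unsigned request (an IAM-protected domain would need SigV4-signed requests instead, e.g. via the aws4 package):

const https = require('https');

// Placeholder host and index; replace with your ES domain and index name.
// Assumes the domain access policy permits unsigned requests; IAM-protected
// domains require SigV4-signed requests instead.
function postToEs(doc, callback) {
  const body = JSON.stringify(doc);
  const req = https.request({
    host: 'my-domain.us-east-1.es.amazonaws.com',
    path: '/cwl-logs/_doc',
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Content-Length': Buffer.byteLength(body),
    },
  }, (res) => {
    let data = '';
    res.on('data', (chunk) => { data += chunk; });
    res.on('end', () => callback(null, data));
  });
  req.on('error', callback);
  req.end(body);
}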
You can find a complete example Lambda (written by someone else) here: https://github.com/blueimp/aws-lambda/tree/master/cloudwatch-logs-to-elastic-cloud