I have an alarm that looks for error messages in the logs, and it does go into the alarm state. However, it never resets and stays in the In Alarm state. The alarm action is an SNS topic, which in turn triggers an email. So after the first error I don't see any subsequent emails. What is wrong with the following template configuration?
"AppErrorMetric": {
  "Type": "AWS::Logs::MetricFilter",
  "Properties": {
    "LogGroupName": {
      "Ref": "AppServerLG"
    },
    "FilterPattern": "[error]",
    "MetricTransformations": [
      {
        "MetricValue": "1",
        "MetricNamespace": {
          "Fn::Join": [
            "",
            [
              {
                "Ref": "ApplicationEndpoint"
              },
              "/metrics/AppError"
            ]
          ]
        },
        "MetricName": "AppError"
      }
    ]
  }
},
"AppErrorAlarm": {
  "Type": "AWS::CloudWatch::Alarm",
  "Properties": {
    "ActionsEnabled": "true",
    "AlarmName": {
      "Fn::Join": [
        "",
        [
          {
            "Ref": "AppId"
          },
          ",",
          {
            "Ref": "AppServerAG"
          },
          ":",
          "AppError",
          ",",
          "MINOR"
        ]
      ]
    },
    "AlarmDescription": {
      "Fn::Join": [
        "",
        [
          "service is throwing error. Please check logs.",
          {
            "Ref": "AppServerAG"
          },
          "-",
          {
            "Ref": "AppId"
          }
        ]
      ]
    },
    "MetricName": "AppError",
    "Namespace": {
      "Fn::Join": [
        "",
        [
          {
            "Ref": "ApplicationEndpoint"
          },
          "metrics/AppError"
        ]
      ]
    },
    "Statistic": "Sum",
    "Period": "300",
    "EvaluationPeriods": "1",
    "Threshold": "1",
    "AlarmActions": [{
      "Fn::GetAtt": [
        "VPCInfo",
        "SNSTopic"
      ]
    }],
    "ComparisonOperator": "GreaterThanOrEqualToThreshold"
  }
}
Answer 0 (score: 1)
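Your problem is a combination of two factors:
- The metric filter only publishes a data point when a log event matches the pattern, so while there are no errors the metric has no data at all.
- TreatMissingData is not set on the alarm, so it defaults to missing.
The CloudWatch documentation about missing data says: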
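For each alarm, you can specify CloudWatch to treat missing data points as any of the following:
- notBreaching: missing data points are treated as "good" and within the threshold
- breaching: missing data points are treated as "bad" and breaching the threshold
- ignore: the current alarm state is maintained
- missing: the alarm does not consider missing data points when evaluating whether to change state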
Adding the "TreatMissingData": "notBreaching" parameter to your alarm configuration will cause CloudWatch to treat missing data points as not breaching and transition the alarm back to OK:
"AppErrorAlarm": {
  "Type": "AWS::CloudWatch::Alarm",
  "Properties": {
    "ActionsEnabled": "true",
    "AlarmName": {
      "Fn::Join": [
        "",
        [
          {
            "Ref": "AppId"
          },
          ",",
          {
            "Ref": "AppServerAG"
          },
          ":",
          "AppError",
          ",",
          "MINOR"
        ]
      ]
    },
    "AlarmDescription": {
      "Fn::Join": [
        "",
        [
          "service is throwing error. Please check logs.",
          {
            "Ref": "AppServerAG"
          },
          "-",
          {
            "Ref": "AppId"
          }
        ]
      ]
    },
    "MetricName": "AppError",
    "Namespace": {
      "Fn::Join": [
        "",
        [
          {
            "Ref": "ApplicationEndpoint"
          },
          "metrics/AppError"
        ]
      ]
    },
    "Statistic": "Sum",
    "Period": "300",
    "EvaluationPeriods": "1",
    "Threshold": "1",
    "TreatMissingData": "notBreaching",
    "AlarmActions": [{
      "Fn::GetAtt": [
        "VPCInfo",
        "SNSTopic"
      ]
    }],
    "ComparisonOperator": "GreaterThanOrEqualToThreshold"
  }
}
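A possible complement to the alarm setting, sketched here as an assumption and not as part of the fix above: the metric transformation of AWS::Logs::MetricFilter accepts a DefaultValue, which is emitted when log events are ingested but none of them match the filter pattern. Setting it to 0 should give the alarm real zero-valued data points during quiet periods. The resource names (AppServerLG, ApplicationEndpoint) are taken from the question's template; the enclosing "Resources" wrapper is only there to make the fragment valid JSON.

```json
{
  "Resources": {
    "AppErrorMetric": {
      "Type": "AWS::Logs::MetricFilter",
      "Properties": {
        "LogGroupName": { "Ref": "AppServerLG" },
        "FilterPattern": "[error]",
        "MetricTransformations": [
          {
            "MetricValue": "1",
            "DefaultValue": 0,
            "MetricNamespace": {
              "Fn::Join": [
                "",
                [{ "Ref": "ApplicationEndpoint" }, "/metrics/AppError"]
              ]
            },
            "MetricName": "AppError"
          }
        ]
      }
    }
  }
}
```

Note that the default value is only reported while logs are actually being ingested. If the log group receives nothing at all during a period, the metric still has no data point, so keeping "TreatMissingData": "notBreaching" on the alarm is still worthwhile.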