Trying to trigger an SSM:Run Command action when a CloudWatch alarm enters the "ALARM" state

Time: 2019-06-25 17:13:22

Tags: json python-3.x amazon-web-services aws-lambda amazon-cloudwatch

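I am trying to trigger an SSM:Run Command action when my CloudWatch alarm enters the "ALARM" state. My approach is a CloudWatch Events rule whose event pattern matches the API calls logged by AWS CloudTrail.

I set the service to monitoring, the event name to "DescribeAlarms", and the state value to "ALARM". As a check, I added my SNS topic as the target (instead of SSM:RunCommand) so that I would receive an email whenever an alarm enters the ALARM state, but no luck.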

```json
{
  "source": [
    "aws.monitoring"
  ],
  "detail-type": [
    "AWS API Call via CloudTrail"
  ],
  "detail": {
    "eventSource": [
      "monitoring.amazonaws.com"
    ],
    "eventName": [
      "DescribeAlarms"
    ],
    "requestParameters": {
      "stateValue": [
        "ALARM"
      ]
    }
  }
}
```

I expect this pattern to match, so that any alarm entering the ALARM state hits the target, which here is my SNS topic.

  

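Update:

Thanks, @John, for the clarification. As you suggested, I am now trying SNS -> Lambda -> SSM Run Command. However, I cannot get the instance ID out of the SNS event. It says [Records - KeyError]. I have read several of your posts and tried everything in them, but could not get past this. Can you help?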

```
Received event: {
  "Records": [
    {
      "EventSource": "aws:sns",
      "EventVersion": "1.0",
      "EventSubscriptionArn": "arn:aws:sns:eu-west-1:XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
      "Sns": {
        "Type": "Notification",
        "MessageId": "********************c",
        "TopicArn": "arn:aws:sns:eu-west-1:*******************************",
        "Subject": "ALARM: \"!!! Critical Alert !!! Disk Space is going to be full in Automation Testing\" in EU (Ireland)",
        "Message": "{\"AlarmName\":\"!!! Critical Alert !!! Disk Space is going to be full in Automation Testing\",\"AlarmDescription\":\"Disk Space is going to be full in Automation Testing\",\"AWSAccountId\":\"***********\",\"NewStateValue\":\"ALARM\",\"NewStateReason\":\"Threshold Crossed: 1 out of the last 1 datapoints [**********] was less than or equal to the threshold (70.0) (minimum 1 datapoint for OK -> ALARM transition).\",\"StateChangeTime\":\"******************\",\"Region\":\"EU (Ireland)\",\"OldStateValue\":\"OK\",\"Trigger\":{\"MetricName\":\"disk_used_percent\",\"Namespace\":\"CWAgent\",\"StatisticType\":\"Statistic\",\"Statistic\":\"AVERAGE\",\"Unit\":null,\"Dimensions\":[{\"value\":\"/\",\"name\":\"path\"},{\"value\":\"i-****************\",\"name\":\"InstanceId\"},{\"value\":\"ami-****************\",\"name\":\"ImageId\"},{\"value\":\"t2.micro\",\"name\":\"InstanceType\"},{\"value\":\"xvda1\",\"name\":\"device\"},{\"value\":\"xfs\",\"name\":\"fstype\"}],\"Period\":300,\"EvaluationPeriods\":1,\"ComparisonOperator\":\"LessThanOrEqualToThreshold\",\"Threshold\":70.0,\"TreatMissingData\":\"- TreatMissingData: missing\",\"EvaluateLowSampleCountPercentile\":\"\"}}",
        "Timestamp": "2019-06-29T19:23:43.829Z",
        "SignatureVersion": "1",
        "Signature": "XXXXXXXXXXXX",
        "SigningCertUrl": "https://sns.eu-west-1.amazonaws.com/XXXXXXXX.pem",
        "UnsubscribeUrl": "https://sns.eu-west-1.amazonaws.com/?Action=Unsubscribe&SubscriptionArn=arn:aws:sns:eu-west-1XXXXXXXXXXXXXXXXXXXXX",
        "MessageAttributes": {}
      }
    }
  ]
}
```

Here is my Lambda function:

```python
from __future__ import print_function
import boto3
import json
ssm = boto3.client('ssm')
ec2 = boto3.resource('ec2')

print('Loading function')

def lambda_handler(event, context):
    # Dump the event to the log, for debugging purposes
    print("Received event: " + json.dumps(event, indent=2))

    message = event['Records']['Sns']['Message']
    msg = json.loads(message)
    InstanceId = msg['InstanceId']['value']
    print ("Instance: %s" % InstanceId)
```

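2 answers:

Answer 0: (score: 0)

This probably will not work, because AWS CloudTrail only captures API calls made to AWS, and a CloudWatch alarm entering the ALARM state is an internal change that is not caused by an API call.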

I suggest:

  • The Amazon CloudWatch alarm triggers an AWS Lambda function
  • The Lambda function calls SSM Run Command (for example, send_command())
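
The second step could look roughly like the sketch below. It is only an illustration, not tested against a real account: the helper name `send_shell_command` and the example command are made up here, and it assumes a Linux instance that is already managed by SSM, plus a Lambda role that is allowed to call `ssm:SendCommand`. `AWS-RunShellScript` is the AWS-managed document for running shell commands.

```python
def send_shell_command(ssm_client, instance_id, commands):
    """Run shell commands on one instance through SSM Run Command.

    ssm_client is expected to be a boto3 SSM client, e.g. boto3.client('ssm').
    Returns the ID of the command that was started.
    """
    response = ssm_client.send_command(
        InstanceIds=[instance_id],
        # AWS-managed document; assumes the target is a Linux instance
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": commands},
    )
    return response["Command"]["CommandId"]
```

Inside the Lambda handler you would create the client once with `boto3.client('ssm')` and call something like `send_shell_command(ssm, instance_id, ["df -h"])`, where `df -h` is just a placeholder command.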

Answer 1: (score: 0)

This can be achieved with the following changes:

```python
from __future__ import print_function
import boto3
import json

ssm = boto3.client('ssm')
ec2 = boto3.resource('ec2')

print('Loading function')

def lambda_handler(event, context):
    # Dump the event to the log, for debugging purposes
    print("Received event: " + json.dumps(event, indent=2))
    # 'Records' is a list, and 'Message' is itself a JSON-encoded string
    message = json.loads(event['Records'][0]['Sns']['Message'])
    # In the event above, the InstanceId dimension is the second entry of Trigger.Dimensions
    instance_id = message['Trigger']['Dimensions'][1]['value']
    print("Instance: %s" % instance_id)
```
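
Note that index `[1]` only works as long as the InstanceId dimension happens to be the second entry. A slightly more defensive variant, sketched below, looks the dimension up by its `name` instead. The function name `instance_id_from_sns_event` is hypothetical, and the sketch assumes the message has the same shape as the event posted in the question.

```python
import json

def instance_id_from_sns_event(event):
    """Return the InstanceId dimension of a CloudWatch alarm delivered via SNS.

    Returns None if the alarm has no InstanceId dimension.
    """
    # The SNS 'Message' field is a JSON document encoded as a string
    message = json.loads(event["Records"][0]["Sns"]["Message"])
    for dimension in message["Trigger"]["Dimensions"]:
        if dimension["name"] == "InstanceId":
            return dimension["value"]
    return None
```

This way the lookup keeps working if the alarm is defined with its dimensions in a different order.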