CloudWatch alarm goes into ALARM state for a single data point

Date: 2018-01-30 05:14:17

Tags: java amazon-web-services amazon-cloudwatch

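I am trying to create a CloudWatch alarm that fires if the average performance exceeds 5 milliseconds. I have configured the Amazon CloudWatch alarm to check the average. However, the alarm goes into the ALARM state as soon as a single data point exceeds the threshold.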

public static void main(String[] args) {
   AWSExample aws = new AWSExample();
   aws.testMethod();
}

Here is testMethod.

public void testMethod() {
        Instant start = Instant.now();
        try {
            try {
                long myValue = (long) (Math.random() * 10000);
                if (myValue > 8000) {
                    myValue = myValue - 3000;
                }
                Thread.sleep(myValue);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        } catch (Throwable t) {
            t.printStackTrace();
        } finally {
            Instant end = Instant.now();
            Duration nano = Duration.between(start, end);
            long endTime = nano.toMillis();
            createMetricData(endTime);
            createAnAlarm();
        }
    }

The method that creates the metric data:

public void createMetricData(Long metricValue) {
        final AmazonCloudWatch cw = getAmazonCloudWatch();
        Dimension dimension = new Dimension().withName("UNIQUE_METHOD").withValue("testMethod");
        MetricDatum datum = new MetricDatum()
                .withMetricName("Method Execution Performance")
                .withUnit(StandardUnit.Milliseconds).withValue(metricValue.doubleValue())
                .withDimensions(dimension)
                .withTimestamp(new Date());
        PutMetricDataRequest metricDataRequest = new PutMetricDataRequest()
                .withNamespace("METHOD/TRAFFIC").withMetricData(datum);
        PutMetricDataResult response = cw.putMetricData(metricDataRequest);
        System.out.println(response);
        System.out.printf("Successfully put data point %f", metricValue.doubleValue());
    }

And here is the method that creates the alarm.

private void createAnAlarm(){
        final AmazonCloudWatch cw = getAmazonCloudWatch();
        PutMetricAlarmRequest putMetricAlarmRequest = new PutMetricAlarmRequest()
        // The period, in seconds, over which the specified statistic is applied.
        // Valid values are 10, 30, and any multiple of 60.
        .withPeriod(120)
        // The name of the metric associated with the alarm.
        .withMetricName("Method Execution Performance")
        // The namespace of the metric associated with the alarm.
        // https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/aws-namespaces.html
        .withNamespace("METHOD/TRAFFIC")
        // The name of the alarm. This name must be unique within the AWS account.
        .withAlarmName("aws-Method-Performance")
        // The number of periods over which data is compared to the specified threshold.
        // An alarm's total current evaluation period can be no longer than one day, so
        // this number multiplied by the period cannot be more than 86,400 seconds.
        .withEvaluationPeriods(1)
        // Indicates whether actions should be executed during any changes to the alarm state.
        .withActionsEnabled(true)
        // The statistic for the metric associated with the alarm, other than percentile.
        .withStatistic(Statistic.Average)
        // The value against which the specified statistic is compared.
        .withThreshold(5.0)
        .withComparisonOperator(ComparisonOperator.GreaterThanThreshold)
        .withAlarmDescription("Alarm when method execution time exceeds 5 milliseconds")
        // The actions to execute when this alarm transitions to the ALARM state
        // from any other state. Each action is specified as an Amazon Resource Name (ARN).
        .withAlarmActions("arn:aws:sns:eu-west-1:***********")
        // The unit of measure for the statistic.
        .withUnit(StandardUnit.Milliseconds)
        .withDimensions(new Dimension().withName("UNIQUE_METHOD").withValue("testMethod"));
        PutMetricAlarmResult result = cw.putMetricAlarm(putMetricAlarmRequest);
        System.out.println(result);
    }

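I want the alarm to fire only when the method's average performance exceeds 5 milliseconds. What is wrong here, and how can I fix it?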

1 answer:

Answer 0: (score: 1)

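So the question is: why does the alarm go into the ALARM state when only a single data point exceeds the threshold? That is because your alarm definition contains this line: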

.withEvaluationPeriods(1)
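
To make the effect of that setting concrete, here is a minimal sketch of the idea in plain Java, with no AWS calls. It assumes a simplified rule: the alarm is in ALARM only when the average of every one of the last N periods is above the threshold. The class name, method names, and sample values are made up for illustration, and real CloudWatch evaluation also deals with missing data and other settings that this sketch ignores.

```java
import java.util.Arrays;

// Hypothetical, simplified model of alarm evaluation; not the CloudWatch implementation.
final class AlarmEvaluationSketch {

    /** Average of the data points that fall into one period. */
    static double periodAverage(double[] dataPoints) {
        return Arrays.stream(dataPoints).average().orElse(Double.NaN);
    }

    /**
     * Simplified rule: ALARM only if each of the last evaluationPeriods
     * period averages is greater than the threshold.
     */
    static boolean isInAlarm(double[] periodAverages, int evaluationPeriods, double threshold) {
        if (periodAverages.length < evaluationPeriods) {
            return false; // not enough periods to evaluate yet
        }
        for (int i = periodAverages.length - evaluationPeriods; i < periodAverages.length; i++) {
            if (!(periodAverages[i] > threshold)) {
                return false; // one non-breaching period keeps the alarm out of ALARM
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Made-up per-period averages in milliseconds; only the last one is above 5000 ms.
        double[] averages = {4200.0, 4600.0, 6100.0};
        System.out.println(isInAlarm(averages, 1, 5000.0)); // one period is enough
        System.out.println(isInAlarm(averages, 3, 5000.0)); // needs three breaching periods
    }
}
```

Note that when a period contains a single data point, the period average is just that data point, so with one evaluation period a single slow call is enough to satisfy the rule.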

Also, you call createAnAlarm(); every time you publish a data point. That is unnecessary: you can create the alarm once and it will keep monitoring your metric.

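As discussed in the comments, the actual reason this alarm fires is that the threshold is set to 5 milliseconds, while the method's expected execution time is in the range of seconds. The correct threshold in this case is 5 seconds, which is 5000 in the metric's millisecond unit: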

.withThreshold(5000.0)
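
As a rough check of the units, the sketch below computes the mean sleep time implied by the question's testMethod, assuming the random value is uniform over the integers 0 to 9999 and that sleep time dominates the measured duration. The class and method names are hypothetical and exist only for this illustration.

```java
// Hypothetical helper for checking the magnitude of the metric; not part of the question's code.
final class ExpectedSleepSketch {

    /** Mirrors the adjustment in testMethod: values above 8000 are reduced by 3000. */
    static long adjust(long rawMillis) {
        return rawMillis > 8000 ? rawMillis - 3000 : rawMillis;
    }

    /** Exact mean of adjust(v) over every integer v in [0, 10000). */
    static double meanSleepMillis() {
        long sum = 0;
        for (long v = 0; v < 10000; v++) {
            sum += adjust(v);
        }
        return sum / 10000.0;
    }

    public static void main(String[] args) {
        double mean = meanSleepMillis();
        System.out.printf("mean sleep = %.1f ms%n", mean);
        System.out.println("above 5 ms threshold:    " + (mean > 5.0));
        System.out.println("above 5000 ms threshold: " + (mean > 5000.0));
    }
}
```

Under these assumptions the mean comes out at roughly 4400 ms, so an average compared against 5.0 is practically always in breach, while an average compared against 5000.0 breaches only when a period happens to contain unusually slow calls.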