Using regex to retrieve Prometheus metric names from Grafana expressions

Time: 2018-08-04 16:05:04

Tags: java regex grafana prometheus regex-group

I have tried many different regex patterns to achieve this, without much success.

The pattern for this problem:

<method_name(> metric_name <{filter_condition}> <[time_duration]> <)> <by (some members)>
            ^------------------------------------------------------^
                          method_name(...) can be multiple

As you can see, the <...> parts are optional, while metric_name is required; it is what I want to retrieve from this expression.

Case # 1
input: sum(log_search_by_service_total {service_name!~\"\"}) by (service_name, operator)
output: log_search_by_service_total

Case # 2
input: log_request_total
output: log_request_total

Case # 3
input:  sum(delta(log_request_total[5m])) by (args, user_id)
output: log_request_total

Case # 4
input: log_request_total{methodName=~\"getAppDynamicsGraphMetrics|getAppDynamicsMetrics\"}
output: log_request_total

Case # 5
input: sum(delta(log_request_total{className=~\".*ProductDashboardController\",methodName=~\"getDashboardConfig|updateMaintainers|addQuickLink|deleteQuickLink|addDependentMiddleware|addDependentService|updateErrorThreshold\"}[5m])) by (user_id)
output: log_request_total

Case # 6
input: count_scalar(sum(log_query_request_total) by (user_id))
output: log_query_request_total

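Here is a demo of what I tried in Java. However, I cannot seem to find the right pattern to retrieve the exact answer for the pattern described above.

Please share any ideas if you can.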

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void main(String... args) {
    String[] exprs = {"sum(log_query_task_cache_hit_rate_bucket)by(le)",
            "sum(log_search_by_service_total {service_name!~\"\"}) by (service_name, operator)",
            "log_request_total",
            " sum(delta(log_request_total[5m])) by (args, user_id)",
            "log_request_total{methodName=~\"getAppDynamicsGraphMetrics|getAppDynamicsMetrics\"}",
            "sum(delta(log_request_total{className=~\".*ProductDashboardController\",methodName=~\"getDashboardConfig|updateMaintainers|addQuickLink|deleteQuickLink|addDependentMiddleware|addDependentService|updateErrorThreshold\"}[5m])) by (user_id)",
            "sum(log_request_total{methodName=\"getInstanceNames\"}) by (user_id)",
            "sum(log_request_total{methodName=\"getVpcCardInfo\",user_id!~\"${user}\"}) by (envName)",
            "count_scalar(sum(log_query_request_total) by (user_id))",
            "avg(log_waiting_time_average) by (exported_tenant, exported_landscape)",
            "avg(task_processing_time_average{app=\"athena\"})",
            "avg(log_queue_time_average) by (log_type)",
            "sum(delta(product_dashboard_service_sum[2m]))",
            "ceil(delta(product_dashboard_service_count[5m]))]"
    };
    String[] expected = {
            "log_query_task_cache_hit_rate_bucket",
            "log_search_by_service_total",
            "log_request_total",
            "log_request_total",
            "log_request_total",
            "log_request_total",
            "log_request_total",
            "log_request_total",
            "log_query_request_total",
            "log_waiting_time_average",
            "task_processing_time_average",
            "log_queue_time_average",
            "product_dashboard_service_sum",
            "product_dashboard_service_count"
    };
    Pattern pattern = Pattern.compile(".*?\\(?([\\w|_]+)\\{?\\[?.*");
    testPattern(exprs, expected, pattern);
    pattern = Pattern.compile(".*\\(?([\\w|_]+)\\{?\\[?.*");
    testPattern(exprs, expected, pattern);
    pattern = Pattern.compile(".*?\\(?([\\w|_]+)\\{?\\[?.*");
    testPattern(exprs, expected, pattern);
}

private static void testPattern(String[] exprs, String[] expected, Pattern pattern) {
    System.out.println("\n********** Pattern Match Test *********\n");
    for (int i = 0; i < exprs.length; ++i) {
        String expr = exprs[i];
        Matcher matcher = pattern.matcher(expr);
        if (matcher.find()) {
            System.out.println("\nThe Original Expr: " + expr);
            System.out.println(String.format("Expected:\t %-40s Matched:\t %-40s", expected[i], matcher.group(1)));
        } else {
            System.out.println("expected: " + expected[i] + " not matched");
        }
    }
}

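Update 2018-08-06

Thanks to Bohemian for the help, which really benefited me (since I have always believed regex can do magic with a clean solution).

Later, I found that the expr is more complex than I expected, as shown below: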

Case # 7
input: topk(10,autoindex_online_consume_time_total_sum{app=~"$app", DTO_Name=~"$c_class"})
expected: autoindex_online_consume_time_total_sum
// to get the metric name: autoindex_online_consume_time_total_sum
// still I can make it work with small modifications as ^(?:\w+\()*(?:\d+,)*(\w+)
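
The modified pattern from Case # 7 can be tried out with a short program. This is only a sketch: the class name ParamAwareExtractor and the method extract are hypothetical names chosen for illustration, and the regex is the one quoted above, unchanged.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ParamAwareExtractor {
    // "function(" prefixes, then optional numeric arguments such as "10,",
    // then the metric name captured as group 1.
    private static final Pattern METRIC =
            Pattern.compile("^(?:\\w+\\()*(?:\\d+,)*(\\w+)");

    /** Returns the captured name, or null when the expression does not match. */
    static String extract(String expr) {
        Matcher m = METRIC.matcher(expr);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String case7 = "topk(10,autoindex_online_consume_time_total_sum"
                + "{app=~\"$app\", DTO_Name=~\"$c_class\"})";
        System.out.println(extract(case7));
        System.out.println(extract("sum(delta(log_request_total[5m]))"));
    }
}
```

Note that the pattern expects the numeric argument to be followed directly by the metric name; with a space after the comma, as in "topk(10, metric)", the regex backtracks and captures the number instead.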

But the following case, and many more complex combinations like it, made me turn to a more reliable approach:

Case # 8
input: sum(hue_mail_sent_attachment_bytes_total) by (app)  / sum(hue_mail_sent_mails_with_attachment_total) by (app)
Expected: [hue_mail_sent_attachment_bytes_total, hue_mail_sent_mails_with_attachment_total]

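Now it is even more complex, and even unpredictable, since there is no control over the expr that users enter.

So I achieved the same goal with a more reliable and simpler solution:

  1. First, store the distinct metric names in a database;
  2. When an expr arrives, check it in memory using contains(String s);
  3. One issue may remain: if some metric names contain other metric names, this over-matches;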
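
The three steps above might look roughly like the sketch below. It is not the code actually used: KnownMetricMatcher and findMetrics are hypothetical names, the set of known names stands in for whatever is loaded from the database, and the whole-identifier check is an added assumption meant to reduce the over-matching mentioned in step 3. It relies on Prometheus metric names consisting of letters, digits, underscores and colons.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;

final class KnownMetricMatcher {
    // Characters that may appear inside a Prometheus metric name.
    private static final String NAME_CHAR = "[A-Za-z0-9_:]";

    /** Returns the known names that occur in expr as whole identifiers. */
    static List<String> findMetrics(String expr, Set<String> knownNames) {
        List<String> found = new ArrayList<>();
        for (String name : knownNames) {
            if (!expr.contains(name)) {
                continue; // step 2: cheap in-memory contains check
            }
            // Step 3 mitigation (assumption): reject hits that are only a
            // fragment of a longer name, e.g. "log_request" in "log_request_total".
            Pattern whole = Pattern.compile(
                    "(?<!" + NAME_CHAR + ")" + Pattern.quote(name) + "(?!" + NAME_CHAR + ")");
            if (whole.matcher(expr).find()) {
                found.add(name);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Stand-in for the distinct metric names stored in the database (step 1).
        Set<String> known = new TreeSet<>(Arrays.asList(
                "log_request",
                "log_request_total",
                "hue_mail_sent_attachment_bytes_total",
                "hue_mail_sent_mails_with_attachment_total"));
        String case8 = "sum(hue_mail_sent_attachment_bytes_total) by (app)  / "
                + "sum(hue_mail_sent_mails_with_attachment_total) by (app)";
        System.out.println(findMetrics(case8, known));
        System.out.println(findMetrics("sum(delta(log_request_total[5m]))", known));
    }
}
```

A label name or label value that happens to equal a known metric name would still be reported, so this reduces over-matching rather than eliminating it.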

1 Answer:

Answer 0: (score: 2)

This regex captures your target in group 1:

^(?:\w+\()*(\w+)

See the live demo.

In Java, to get the target:

// The trailing .* consumes the rest of the input, so only group 1 remains.
String metricName = input.replaceAll("^(?:\\w+\\()*(\\w+).*", "$1");
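
An alternative to replaceAll is to read group 1 directly from a Matcher, which also makes a non-match explicit. The following is a minimal sketch: MetricNameExtractor and extract are hypothetical names, and the leading \s* is an addition that is not part of the regex above; it is there only to tolerate expressions that start with whitespace.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MetricNameExtractor {
    // Optional leading whitespace (an added assumption), then zero or more
    // "function(" prefixes, then the metric name captured as group 1.
    private static final Pattern METRIC =
            Pattern.compile("^\\s*(?:\\w+\\()*(\\w+)");

    /** Returns the leading metric name, or null when nothing matches. */
    static String extract(String expr) {
        Matcher m = METRIC.matcher(expr);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String[] exprs = {
            "sum(log_search_by_service_total {service_name!~\"\"}) by (service_name, operator)",
            "log_request_total",
            " sum(delta(log_request_total[5m])) by (args, user_id)",
            "count_scalar(sum(log_query_request_total) by (user_id))"
        };
        for (String expr : exprs) {
            System.out.println(extract(expr) + "  <-  " + expr);
        }
    }
}
```

Like the regex it is based on, this only returns the first metric in an expression, so inputs that combine several metrics (such as Case # 8 in the question) need a different approach.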