Prometheus for TensorFlow Serving

Time: 2018-10-24 17:58:26

Tags: docker tensorflow prometheus tensorflow-serving

What are the steps to use the Prometheus exporter with TensorFlow Serving? According to the 1.11 release notes, TF Serving supports Prometheus metrics: https://github.com/tensorflow/serving/releases/tag/1.11.0

I am starting the Docker container from the example at https://www.tensorflow.org/serving/docker, as follows:

docker run -p 8501:8501 -p 8500:8500 \
  --mount type=bind,source=/tmp/tfserving/serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu,target=/models/half_plus_two \
  -e MODEL_NAME=half_plus_two -t tensorflow/serving &

Prometheus config file:

global:
  scrape_interval: 10s
  evaluation_interval: 10s
  external_labels:
    monitor: 'tf-serving-monitor'

scrape_configs:
  - job_name: 'tensorflow'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:8501']

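But Prometheus cannot find the metrics exposed by TF Serving. Should I open a specific port on the Docker container, or is there some parameter I should pass to TF Serving?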

1 Answer:

Answer 0: (score: 2)

According to the release notes you linked to, TensorFlow Serving exports Prometheus metrics at /monitoring/prometheus/metrics (as opposed to Prometheus's default /metrics). So, at a minimum, you need to add metrics_path to your config:

scrape_configs:
  - job_name: 'tensorflow'
    scrape_interval: 5s
    metrics_path: '/monitoring/prometheus/metrics'
    static_configs:
      - targets: ['localhost:8501']

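But first, make sure you can see the metrics exported at http://localhost:8501/monitoring/prometheus/metrics in your browser. If you can't, browse around (using your browser) until you find the right URL, and then reflect it in the config.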
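
If a browser is not handy, the same check can be scripted. The following is only a minimal sketch, assuming the REST port from the question (8501) and the path from the release notes; the helper names `metric_names` and `fetch_metrics` are made up for illustration and are not part of TensorFlow Serving or Prometheus. It fetches the URL and lists the metric names it finds in the Prometheus text format.

```python
import urllib.request

# Assumption: host, port, and path are the ones from the question and answer.
# Change this if your setup differs.
METRICS_URL = "http://localhost:8501/monitoring/prometheus/metrics"


def metric_names(exposition_text):
    """Return the sorted, de-duplicated metric names in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like `name{label="x"} 1` or `name 1`.
        parts = line.split("{", 1)[0].split()
        if parts:
            names.add(parts[0])
    return sorted(names)


def fetch_metrics(url=METRICS_URL, timeout=5):
    """Fetch the endpoint and return (HTTP status, response body as text)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    try:
        status, body = fetch_metrics()
    except OSError as exc:  # covers connection errors, HTTP errors, and timeouts
        print(f"Could not fetch {METRICS_URL}: {exc}")
    else:
        names = metric_names(body)
        print(f"HTTP {status}, {len(names)} metric name(s) found")
        for name in names:
            print(" ", name)
```

If the script reports an error or finds no metric names, Prometheus will not be able to scrape that URL either, so sort out the endpoint before changing the scrape config.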