Configuring the Prometheus JMX exporter for Hadoop2

Time: 2017-11-05 12:21:18

Tags: jmx hadoop2 prometheus

I am trying to use the Prometheus JMX exporter to track metrics from Hadoop2 daemons running on an EC2 instance:

  • hadoop namenode
  • hadoop datanode
  • yarn resourcemanager
  • yarn nodemanager

I am trying to run the JMX exporter as a Java agent attached to all four daemons. To do that, I added extra Java options to hadoop-env.sh and yarn-env.sh:

export HADOOP_NAMENODE_OPTS="$HADOOP_NAMENODE_OPTS -javaagent:/home/ec2-user/jmx_exporter/jmx_prometheus_javaagent-0.10.jar=9102:/home/ec2-user/jmx_exporter/prometheus_config.yml"
export HADOOP_DATANODE_OPTS="$HADOOP_DATANODE_OPTS -javaagent:/home/ec2-user/jmx_exporter/jmx_prometheus_javaagent-0.10.jar=9102:/home/ec2-user/jmx_exporter/prometheus_config.yml"
export YARN_RESOURCEMANAGER_OPTS="$YARN_RESOURCEMANAGER_OPTS -javaagent:/home/ec2-user/jmx_exporter/jmx_prometheus_javaagent-0.10.jar=9102:/home/ec2-user/jmx_exporter/prometheus_config.yml"
export YARN_NODEMANAGER_OPTS="$YARN_NODEMANAGER_OPTS -javaagent:/home/ec2-user/jmx_exporter/jmx_prometheus_javaagent-0.10.jar=9102:/home/ec2-user/jmx_exporter/prometheus_config.yml"

A sample prometheus_config.yml for the ResourceManager metric NumAllSources looks like this:

rules:
 - pattern: Hadoop<service=ResourceManager, name=MetricsSystem, sub=Stats><>NumAllSources
   name: sources
   labels:
    app_id: "hadoop_rm"

When I restart the resourcemanager or any of the other daemons with the new configuration and Java options, I get the following exception:

Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.instrument.InstrumentationImpl.loadClassAndStartAgent(InstrumentationImpl.java:382)
at sun.instrument.InstrumentationImpl.loadClassAndCallPremain(InstrumentationImpl.java:397)
Caused by: java.lang.IllegalArgumentException: Collector already registered that provides name: jmx_scrape_duration_seconds
at io.prometheus.jmx.shaded.io.prometheus.client.CollectorRegistry.register(CollectorRegistry.java:54)
at io.prometheus.jmx.shaded.io.prometheus.client.Collector.register(Collector.java:128)

Any suggestions on how to fix this?

3 answers:

Answer 0 (score: 0)

This happens because when you call /usr/local/hadoop/sbin/hadoop-daemon.sh start datanode, it eventually calls /usr/local/hadoop/bin/hdfs to start the relevant service, and along the way the -javaagent option gets declared multiple times in $HADOOP_OPTS.

During that process hadoop-config.sh is sourced several times by the shell scripts. If you echo $HADOOP_OPTS in /usr/local/hadoop/bin/hdfs, you will find multiple -javaagent entries there.

The workaround is to declare HADOOP_OPTS=$HADOOP_OPTS -javaagent:... in /usr/local/hadoop/bin/hdfs, which ensures that only one -javaagent appears in HADOOP_OPTS.
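The duplication described in this answer can be reproduced in isolation. The following is a minimal bash sketch, not taken from the Hadoop scripts: `append_agent` is a hypothetical stand-in for an env file that appends the agent every time it is sourced, and `count_agents` is a hypothetical helper that counts how many `-javaagent:` entries a string contains.

```bash
#!/usr/bin/env bash
# Agent string copied from the question.
AGENT='-javaagent:/home/ec2-user/jmx_exporter/jmx_prometheus_javaagent-0.10.jar=9102:/home/ec2-user/jmx_exporter/prometheus_config.yml'

# Hypothetical stand-in for an env file that is sourced more than once:
# every call appends the agent again.
append_agent() {
    HADOOP_OPTS="$HADOOP_OPTS $AGENT"
}

# Hypothetical helper: count the -javaagent: entries in the given string.
# "--" ends option parsing so the pattern's leading dash is not read as a flag.
count_agents() {
    grep -o -- '-javaagent:' <<<"$1" | wc -l | tr -d ' '
}

HADOOP_OPTS=""
append_agent   # first "source"
append_agent   # second "source" adds a duplicate
agent_count=$(count_agents "$HADOOP_OPTS")
echo "javaagent occurrences: $agent_count"
```

A count above 1 means the JVM would load the agent more than once, which is consistent with the "Collector already registered" error in the question.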

Answer 1 (score: 0)

I think this is because you are using the same port (9102) for every registration. Changing the ports should help.
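As a sketch of what this suggestion would look like, the exports from the question can be given one port per daemon. The ports 9103 to 9105 are assumptions chosen for illustration; any free ports would do. Separate ports are needed in any case when several daemons run on the same host, since two processes cannot listen on the same port.

```bash
#!/usr/bin/env bash
# Paths copied from the question; the variable names are illustrative.
AGENT_JAR=/home/ec2-user/jmx_exporter/jmx_prometheus_javaagent-0.10.jar
AGENT_CFG=/home/ec2-user/jmx_exporter/prometheus_config.yml

# hadoop-env.sh (ports other than 9102 are assumed values)
export HADOOP_NAMENODE_OPTS="$HADOOP_NAMENODE_OPTS -javaagent:$AGENT_JAR=9102:$AGENT_CFG"
export HADOOP_DATANODE_OPTS="$HADOOP_DATANODE_OPTS -javaagent:$AGENT_JAR=9103:$AGENT_CFG"

# yarn-env.sh
export YARN_RESOURCEMANAGER_OPTS="$YARN_RESOURCEMANAGER_OPTS -javaagent:$AGENT_JAR=9104:$AGENT_CFG"
export YARN_NODEMANAGER_OPTS="$YARN_NODEMANAGER_OPTS -javaagent:$AGENT_JAR=9105:$AGENT_CFG"
```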

Answer 2 (score: 0)

Although @chanhou's solution works, I wanted to keep my edits inside hadoop-env.sh, so I went with:

if ! grep -q <<<"$HADOOP_NAMENODE_OPTS" jmx_prometheus_javaagent; then
        HADOOP_NAMENODE_OPTS="$HADOOP_NAMENODE_OPTS -javaagent:/home/caesarli/platform/jmx_prometheus_javaagent-0.12.0.jar=11099:/home/caesarli/platform/hadoop-2.8.4/etc/hadoop/jmx-name.yaml"
fi

and did the same for HADOOP_DATANODE_OPTS.
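To avoid repeating the guard for every daemon, the same check can be wrapped in a small function. This is only a sketch: `add_jmx_agent` is a hypothetical helper, not part of Hadoop or the JMX exporter, and it relies on bash features (`${!name}` indirection and `printf -v`).

```bash
#!/usr/bin/env bash
# Jar path copied from the answer above.
AGENT_JAR=/home/caesarli/platform/jmx_prometheus_javaagent-0.12.0.jar

# add_jmx_agent VAR_NAME PORT CONFIG (hypothetical helper)
# Appends the agent to the named variable only if it is not already present,
# so sourcing the env file repeatedly stays harmless.
add_jmx_agent() {
    local var_name=$1 port=$2 config=$3
    local current=${!var_name}          # read the variable named by $var_name
    if [[ "$current" != *jmx_prometheus_javaagent* ]]; then
        # printf -v writes the result back into the variable named by $var_name
        printf -v "$var_name" '%s -javaagent:%s=%s:%s' \
            "$current" "$AGENT_JAR" "$port" "$config"
    fi
}

NAME_CFG=/home/caesarli/platform/hadoop-2.8.4/etc/hadoop/jmx-name.yaml
add_jmx_agent HADOOP_NAMENODE_OPTS 11099 "$NAME_CFG"
add_jmx_agent HADOOP_NAMENODE_OPTS 11099 "$NAME_CFG"   # second call is a no-op
echo "$HADOOP_NAMENODE_OPTS"
```

The same call can be made for HADOOP_DATANODE_OPTS with its own port and config file.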