I want to use Ganglia to monitor a multi-node Hadoop cluster (Hadoop version 0.20.2). My Hadoop is working properly. I installed Ganglia after reading the following blogs:
http://hakunamapdata.com/ganglia-configuration-for-a-small-hadoop-cluster-and-some-troubleshooting/
http://hokamblogs.blogspot.in/2013/06/ganglia-overview-and-installation-on.html
I have also studied Monitoring with Ganglia.pdf (Appendix B. Ganglia and Hadoop/HBase).
I have modified only the following lines in **hadoop-metrics.properties** (the same on all Hadoop nodes):
# Configuration of the "dfs" context for ganglia
dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
dfs.period=10
dfs.servers=192.168.1.182:8649
# Configuration of the "mapred" context for ganglia
mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
mapred.period=10
mapred.servers=192.168.1.182:8649
# Configuration of the "jvm" context for ganglia
jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
jvm.period=10
jvm.servers=192.168.1.182:8649
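A malformed `servers` value (for example a stray extra `:port`) is easy to miss in a file like this. The following is a minimal Python sketch that flags such entries. The function names `parse_properties` and `bad_server_entries` are hypothetical, and the parser is a deliberate simplification of the Java properties format (it only understands `key=value` lines and `#` or `!` comments).

```python
import re

# host or host:port; the port is treated as optional here (a simplifying assumption).
SERVER_RE = re.compile(r"^[A-Za-z0-9.\-]+(?::(\d{1,5}))?$")


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def bad_server_entries(props):
    """Return (key, entry) pairs for *.servers entries that are not host[:port]."""
    problems = []
    for key, value in sorted(props.items()):
        if not key.endswith(".servers"):
            continue
        # Entries are assumed to be separated by commas and/or whitespace.
        for entry in re.split(r"[,\s]+", value):
            if not entry:
                continue
            match = SERVER_RE.match(entry)
            port = match.group(1) if match else None
            if match is None or (port is not None and not 0 < int(port) <= 65535):
                problems.append((key, entry))
    return problems
```

Calling `bad_server_entries(parse_properties(open("hadoop-metrics.properties").read()))` on each node should return an empty list when every `servers` line has the expected shape.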
**gmetad.conf** (only on the Hadoop master node)
data_source "Hadoop-slaves" 5 192.168.1.182:8649
# Because I want to analyse one week of data.
RRAs "RRA:AVERAGE:0.5:1:302400"
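As a rough check of how much history an RRA of this form holds: an RRA definition reads `RRA:CF:xff:steps:rows`, so the time covered is `rows × steps × step_seconds`. The sketch below does that arithmetic in Python. It assumes the RRD step equals the polling interval given in `data_source` (5 seconds here); that is an assumption about this setup, not something the post confirms, so pass whatever step your RRD files actually use. The function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def parse_rra(spec):
    """Split 'RRA:CF:xff:steps:rows' into (cf, xff, steps, rows)."""
    kind, cf, xff, steps, rows = spec.split(":")
    if kind != "RRA":
        raise ValueError("not an RRA definition: %r" % spec)
    return cf, float(xff), int(steps), int(rows)


def rra_coverage_days(rows, step_seconds, steps_per_row=1):
    """Days of history held by an RRA with the given geometry."""
    return rows * steps_per_row * step_seconds / SECONDS_PER_DAY


def rows_for_days(days, step_seconds, steps_per_row=1):
    """Rows needed to hold `days` of history (rounded up)."""
    seconds = days * SECONDS_PER_DAY
    per_row = step_seconds * steps_per_row
    return -(-seconds // per_row)  # ceiling division


cf, xff, steps, rows = parse_rra("RRA:AVERAGE:0.5:1:302400")
# step_seconds=5 is the assumed RRD step (the data_source polling interval).
print(rra_coverage_days(rows, step_seconds=5, steps_per_row=steps))
print(rows_for_days(7, step_seconds=5))
```

Under the 5-second assumption, 302400 rows hold 17.5 days, and one week at full resolution needs 120960 rows.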
**gmond.conf** (on all the Hadoop Slave nodes and Hadoop Master)
globals {
daemonize = yes
setuid = yes
user = ganglia
debug_level = 0
max_udp_msg_len = 1472
mute = no
deaf = no
allow_extra_data = yes
host_dmax = 0 /*secs */
cleanup_threshold = 300 /*secs */
gexec = no
send_metadata_interval = 0
}
cluster {
name = "Hadoop-slaves"
owner = "Sandeep Priyank"
latlong = "unspecified"
url = "unspecified"
}
/* The host section describes attributes of the host, like the location */
host {
location = "CASL"
}
/* Feel free to specify as many udp_send_channels as you like. Gmond
used to only support having a single channel */
udp_send_channel {
host = 192.168.1.182
port = 8649
ttl = 1
}
/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
port = 8649
}
/* You can specify as many tcp_accept_channels as you like to share
an xml description of the state of the cluster */
tcp_accept_channel {
port = 8649
}
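Ganglia now shows only system metrics (mem, disk, etc.) for all nodes. It does not show Hadoop metrics (such as the jvm and mapred metrics) on the web interface. How can I fix this?
Answer 0: (score: 0)
I run Hadoop together with Ganglia, and yes, I see many Hadoop metrics in Ganglia (containers, map tasks, vmem). In fact, Hadoop reports many more metrics to Ganglia than these.
The hokamblogs post is sufficient.
I edited hadoop-metrics2.properties on the master node with the following content: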
namenode.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
namenode.sink.ganglia.period=10
namenode.sink.ganglia.servers=gmetad_hostname_or_ip:8649
resourcemanager.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
resourcemanager.sink.ganglia.period=10
resourcemanager.sink.ganglia.servers=gmetad_hostname_or_ip:8649
I also edited the same file on the slave nodes:
datanode.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
datanode.sink.ganglia.period=10
datanode.sink.ganglia.servers=gmetad_hostname_or_ip:8649
nodemanager.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
nodemanager.sink.ganglia.period=10
nodemanager.sink.ganglia.servers=gmetad_hostname_or_ip:8649
Remember to restart Hadoop and Ganglia after changing the files.
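To check whether Hadoop metrics actually reach gmond after the restart, without waiting for the web interface, something like the following Python sketch may help. gmond writes an XML dump of the cluster state to any client that connects to its `tcp_accept_channel` (port 8649 in the configuration above). The metric-name prefixes below are an assumption about how the Hadoop metrics are named, so adjust them to what your cluster emits; the function names are hypothetical.

```python
import socket
import xml.etree.ElementTree as ET

# Assumed prefixes of Hadoop metric names; adjust for your Hadoop version.
HADOOP_PREFIXES = ("dfs.", "mapred.", "jvm.", "rpc.", "yarn.")


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Read the XML dump that gmond sends on its tcp_accept_channel."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # gmond closes the connection after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def hadoop_metrics_by_host(xml_bytes, prefixes=HADOOP_PREFIXES):
    """Map each HOST name to the sorted metric names matching the prefixes."""
    root = ET.fromstring(xml_bytes)
    result = {}
    for host in root.iter("HOST"):
        names = sorted(
            metric.get("NAME")
            for metric in host.iter("METRIC")
            if metric.get("NAME", "").startswith(prefixes)
        )
        result[host.get("NAME")] = names
    return result


if __name__ == "__main__":
    # 192.168.1.182 is the gmond address used in the question.
    dump = fetch_gmond_xml("192.168.1.182")
    for host, names in sorted(hadoop_metrics_by_host(dump).items()):
        print(host, len(names), "Hadoop metrics")
```

If a host shows zero matching metrics while its system metrics are present, the Hadoop daemons on that host are probably not sending to gmond, which points at the metrics properties file rather than at gmetad or the web front end.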
I hope this helps.
Answer 1: (score: 0)
Thanks, everyone. If you are using an older version of Hadoop, take the following files from a newer version of Hadoop:
GangliaContext31.java
GangliaContext.java
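and put them into the path hadoop/src/core/org/apache/hadoop/metrics/ganglia.
Compile your Hadoop using ant (and set the appropriate proxy while compiling). If the build fails with an error about a missing function definition, copy that function definition (from the newer version) into the appropriate Java file and compile Hadoop again. It will work.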