tcollector is not collecting data; TSDB is empty

Time: 2014-04-15 09:56:12

Tags: java logging hbase monitoring opentsdb

I have successfully installed OpenTSDB, and it is up and running at

http://54.72.4.157:4242/

I am running tcollector on one of our servers, and I did set the host in startstop.sh:

TSD_HOST=54.72.4.157

I ran

./startstop start

to start all the stats collectors. I even noticed this in the TSD console log:

[id: 0x5fc4bb31, /54.184.79.13:60203 => /172.31.14.125:4242] CONNECTED: /54.184.79.13:60203

On my tcollector node, I ran

ps axl | grep tcollector

and I can see

0     0 16796 16795  20   0 183712  8000 poll_s Sl   ?          2:17 /usr/bin/python /home/mithralaya/tcollector/tcollector.py -c /home/mithralaya/tcollector/collectors -H 54.72.4.157 -t host=ip-172-31-12-203 -P /var/run/tcollector.pid
4 65534 16806 16796  20   0  39864  3748 poll_s Ss   ?          0:08 /usr/bin/python /home/mithralaya/tcollector/collectors/0/procstats.py
4 65534 16808 16796  39  19  39700  3380 poll_s SNs  ?          0:07 /usr/bin/python /home/mithralaya/tcollector/collectors/0/procnettcp.py
4 65534 16816 16796  20   0  39648  3240 poll_s Ss   ?          0:00 /usr/bin/python /home/mithralaya/tcollector/collectors/0/iostat.py
4 65534 16818 16796  20   0  39648  3400 poll_s Ss   ?          0:01 /usr/bin/python /home/mithralaya/tcollector/collectors/0/ifstat.py
4 65534 16822 16796  20   0  41848  3676 poll_s Ss   ?          0:05 /usr/bin/python /home/mithralaya/tcollector/collectors/0/netstat.py
4 65534 16824 16796  20   0  39648  3524 poll_s Ss   ?          0:00 /usr/bin/python /home/mithralaya/tcollector/collectors/0/dfstat.py
0     0 26617 26171  20   0   8108   940 pipe_w S+   pts/0      0:00 grep --color=auto tcollector

I cannot see any major errors in the tcollector log under /var/log/tcollector. The latest log entries:

2014-04-15 08:59:40,630 tcollector[16796] WARNING: haproxy.py: Error: HAProxy is not running
2014-04-15 08:59:55,090 tcollector[16796] INFO: removing redis-stats.py from the list of collectors (by request)
2014-04-15 08:59:55,091 tcollector[16796] INFO: removing nfsstat.py from the list of collectors (by request)
2014-04-15 08:59:55,091 tcollector[16796] WARNING: collector hbase_master.py terminated after 16 seconds with status code 1, marking dead
2014-04-15 08:59:55,091 tcollector[16796] INFO: removing udp_bridge.py from the list of collectors (by request)
2014-04-15 08:59:55,091 tcollector[16796] INFO: removing elasticsearch.py from the list of collectors (by request)
2014-04-15 08:59:55,092 tcollector[16796] INFO: removing zfsiostats.py from the list of collectors (by request)
2014-04-15 08:59:55,092 tcollector[16796] INFO: removing varnishstat.py from the list of collectors (by request)
2014-04-15 08:59:55,092 tcollector[16796] INFO: removing mongo.py from the list of collectors (by request)
2014-04-15 08:59:55,093 tcollector[16796] INFO: removing couchbase.py from the list of collectors (by request)
2014-04-15 08:59:55,093 tcollector[16796] INFO: removing graphite_bridge.py from the list of collectors (by request)
2014-04-15 08:59:55,093 tcollector[16796] INFO: removing zfskernstats.py from the list of collectors (by request)
2014-04-15 08:59:55,094 tcollector[16796] INFO: removing smart-stats.py from the list of collectors (by request)
2014-04-15 08:59:55,094 tcollector[16796] WARNING: collector mysql.py terminated after 16 seconds with status code 1, marking dead
2014-04-15 08:59:55,094 tcollector[16796] WARNING: collector hbase_regionserver.py terminated after 16 seconds with status code 1, marking dead
2014-04-15 08:59:55,095 tcollector[16796] INFO: removing postgresql.py from the list of collectors (by request)
2014-04-15 08:59:55,095 tcollector[16796] INFO: removing haproxy.py from the list of collectors (by request)
2014-04-15 08:59:55,095 tcollector[16796] INFO: removing riak.py from the list of collectors (by request)
2014-04-15 08:59:55,095 tcollector[16796] INFO: removing zookeeper.py from the list of collectors (by request)
2014-04-15 08:59:55,096 tcollector[16796] INFO: removing opentsdb.sh from the list of collectors (by request)
2014-04-15 09:09:40,651 tcollector[16796] INFO: Heartbeat (6 collectors running)
2014-04-15 09:19:41,217 tcollector[16796] INFO: Heartbeat (6 collectors running)
2014-04-15 09:29:41,794 tcollector[16796] INFO: Heartbeat (6 collectors running)
2014-04-15 09:39:43,586 tcollector[16796] INFO: Heartbeat (6 collectors running)

But no stats are being collected. In HBase, both the tsdb and tsdb-uid tables are empty.

hbase(main):002:0> scan 'tsdb'
ROW                                                          COLUMN+CELL                                                                                                                                                                      
0 row(s) in 0.2890 seconds

hbase(main):003:0> 

You can also see this here:

http://54.72.4.157:60010/

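I would appreciate it if anyone could help me.

All Hadoop-based technologies are hard to install and configure. I have spent a week trying to solve this, I have been running tcollector for 24 hours, and there is still no data in TSDB.

Many thanks,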

KARTHIK

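3 answers:

Answer 0: (score: 0)

From the log file output, it looks as though none of the tcollector plugins is actually running. They are spawned and then removed immediately because of errors.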
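To check which collectors were removed or marked dead, and how many the heartbeat reports as running, a log like the one in the question can be tallied with a short script. This is only a rough sketch: the regular expressions are written against the log lines quoted in the question, and the function name summarize is made up for illustration.

```python
import re

# Patterns written against the log lines quoted in the question (an assumption;
# other tcollector versions may word these messages differently).
REMOVED = re.compile(r"removing (\S+) from the list of collectors")
DEAD = re.compile(
    r"collector (\S+) terminated after \d+ seconds with status code (\d+), marking dead"
)
HEARTBEAT = re.compile(r"Heartbeat \((\d+) collectors running\)")


def summarize(lines):
    """Return (removed collectors, {dead collector: exit status}, last heartbeat count)."""
    removed = []
    dead = {}
    running = None
    for line in lines:
        match = REMOVED.search(line)
        if match:
            removed.append(match.group(1))
            continue
        match = DEAD.search(line)
        if match:
            dead[match.group(1)] = int(match.group(2))
            continue
        match = HEARTBEAT.search(line)
        if match:
            # Keep only the most recent heartbeat seen so far.
            running = int(match.group(1))
    return removed, dead, running


# Three lines copied from the question's log, used here as sample input.
SAMPLE = [
    "2014-04-15 08:59:55,095 tcollector[16796] INFO: removing haproxy.py from the list of collectors (by request)",
    "2014-04-15 08:59:55,094 tcollector[16796] WARNING: collector mysql.py terminated after 16 seconds with status code 1, marking dead",
    "2014-04-15 09:39:43,586 tcollector[16796] INFO: Heartbeat (6 collectors running)",
]

removed, dead, running = summarize(SAMPLE)
print("removed:", removed)
print("dead:", dead)
print("last heartbeat count:", running)
```

To run it on the real log, pass the file's lines instead of SAMPLE, for example summarize(open(path)) with path pointing at your tcollector log file.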

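Answer 1: (score: 0)

Well, a few collectors such as procstats.py (which collects basic metrics like CPU, memory, etc.) are probably running; I notice it does not appear in the error log.

You may not be getting data into HBase because your OpenTSDB configuration is still at the default, which requires metrics to be created manually. If so, you have to define the metrics yourself.

To have metrics created automatically instead, go to your OpenTSDB server, check the configuration, and set metric creation to automatic.

Specifically, in /etc/opentsdb/opentsdb.conf, set the parameter "tsd.core.auto_create_metrics" to true, then restart the opentsdb service.

Then check HBase again and see whether data shows up in 'tsdb-uid', for example.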
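As a further check, independent of tcollector, you could push a single data point to the TSD by hand using its telnet-style "put" line protocol (put <metric> <timestamp> <value> <tag=value> ...). The following is a minimal sketch, assuming Python 3 and that your TSD accepts put lines on its main port; the metric name test.metric and the function names are made up for illustration. As far as I know the TSD stays silent when a put succeeds and writes an error line back otherwise, but treat that as an assumption and look at whatever reply you get.

```python
import socket
import time


def format_put(metric, value, tags, timestamp=None):
    """Build one line of OpenTSDB's telnet-style put protocol."""
    if not tags:
        # OpenTSDB requires at least one tag on every data point.
        raise ValueError("at least one tag is required")
    ts = int(time.time()) if timestamp is None else int(timestamp)
    tag_str = " ".join("%s=%s" % (k, v) for k, v in sorted(tags.items()))
    return "put %s %d %s %s\n" % (metric, ts, value, tag_str)


def send_put(host, port, line, timeout=5.0):
    """Send one put line and return any reply received before the timeout.

    An empty string means nothing came back within the timeout.
    """
    sock = socket.create_connection((host, port), timeout=timeout)
    try:
        sock.sendall(line.encode("ascii"))
        try:
            return sock.recv(4096).decode("ascii", "replace")
        except socket.timeout:
            return ""
    finally:
        sock.close()


# Only builds and prints the line; nothing is sent until send_put is called.
line = format_put("test.metric", 1, {"host": "ip-172-31-12-203"})
print(line, end="")
```

To actually send it, call send_put("54.72.4.157", 4242, line) with the host and port from the question. If an error about an unknown metric comes back, that would point at the auto-create setting described above.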

Answer 2: (score: 0)

Try setting the conf file to create metrics automatically:

# --------- CORE ----------
# Whether or not to automatically create UIDs for new metric types, default
# is False
tsd.core.auto_create_metrics = true
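
To confirm that the file the TSD reads really has this setting, you could parse it with a few lines of Python. This is a minimal sketch, assuming the plain key = value layout shown above with # comments; the function names are hypothetical, and the parser is deliberately simple rather than a full properties-file reader.

```python
def read_tsd_option(conf_text, key):
    """Return the value assigned to key in opentsdb.conf-style text, or None."""
    value = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, val = line.partition("=")
        if name.strip() == key:
            # In this sketch, a later assignment overrides an earlier one.
            value = val.strip()
    return value


def auto_create_enabled(conf_text):
    """True only if tsd.core.auto_create_metrics is explicitly set to true."""
    value = read_tsd_option(conf_text, "tsd.core.auto_create_metrics")
    return value is not None and value.lower() == "true"


# Sample input copied from the config fragment above.
SAMPLE = """\
# --------- CORE ----------
# Whether or not to automatically create UIDs for new metric types, default
# is False
tsd.core.auto_create_metrics = true
"""

print(auto_create_enabled(SAMPLE))
```

For the real file, read it first, for example auto_create_enabled(open("/etc/opentsdb/opentsdb.conf").read()), and remember that the TSD has to be restarted before a change takes effect.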