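We have an Ambari cluster running HDP 2.6.4 with Ambari 2.6.1.
The Ambari agent has a serious problem on every machine in the cluster.
Take one machine as an example, a DataNode machine.
On this machine we lose the heartbeat many times (meaning the agent stops communicating with the Ambari server). As a workaround we restart the Ambari agent, which fixes the problem, but it keeps coming back.
Sometimes the Ambari agent log reports that a port is already in use, so we kill the PID holding it and restart the agent (we find the PID with netstat and the port number). This also keeps recurring.
To summarize: it looks like this may be a network issue, or some interruption that breaks the connection between the Ambari agent and the Ambari server, but we are not sure and we have no solution.
Please advise how to find the root cause.
From the log we can see the following: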
INFO 2019-11-11 20:07:08,518 DataCleaner.py:122 - Data cleanup finished
ERROR 2019-11-11 20:07:08,678 main.py:407 - Failed to start ping port listener of: Could not open port 8670 because port already used by another process:
UID PID PPID C STIME TTY TIME CMD
root 1229 1221 5 Nov04 ? 08:34:51 /usr/bin/python /usr/lib/python2
INFO 2019-11-11 20:07:08,679 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 20:07:08,679 ExitHelper.py:56 - Performing cleanup before exiting...
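
Before restarting the agent, it may help to confirm whether the ping port from the error above (8670) is still held by another process. The following is only a minimal sketch, assuming Python 3 is available on the node; `port_in_use` is a hypothetical helper, not part of Ambari, and the default bind address `0.0.0.0` is an assumption about how the agent listens.

```python
import errno
import socket


def port_in_use(port, host="0.0.0.0"):
    """Return True if a TCP bind to (host, port) fails with EADDRINUSE.

    Hypothetical helper for illustration; host defaults to 0.0.0.0 as an
    assumption about the address the agent's ping listener binds to.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # No SO_REUSEADDR on purpose: we want the bind to fail if any
        # other socket already owns this address and port.
        sock.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True
        raise  # some other problem (permissions, bad address, ...)
    else:
        return False
    finally:
        sock.close()
```

Calling `port_in_use(8670)` while the agent is stopped would indicate whether a leftover process, such as the long-running Python process shown in the `ps` output above, still owns the port.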
ambari-agent]# grep "Ping port listener killed" ambari-agent.log
INFO 2019-11-10 09:54:31,717 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-10 09:54:33,177 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-10 17:38:30,591 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-10 17:38:31,846 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-10 19:03:35,082 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-10 19:03:36,691 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-10 19:04:21,926 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-10 19:04:23,449 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-10 19:33:31,859 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-10 19:33:33,234 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 03:34:43,391 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 03:34:44,927 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 03:40:30,522 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 03:40:31,893 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 09:10:31,346 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 09:10:33,092 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 09:40:31,171 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 09:40:32,410 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 16:09:09,491 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 16:09:11,127 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 19:06:34,818 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 19:06:36,500 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 20:07:07,720 PingPortListener.py:61 - Ping port listener killed
INFO 2019-11-11 20:07:08,679 PingPortListener.py:61 - Ping port listener killed
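
To look for a pattern in these entries, the grep output can be grouped into events and the gap between events computed, so the times can be compared against cron jobs, monitoring restarts, or network incidents. This is a rough sketch, assuming Python 3 and the log line format shown above; the function names and the 5-second grouping window are illustrative choices, not anything defined by Ambari.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as:
# INFO 2019-11-10 09:54:31,717 PingPortListener.py:61 - Ping port listener killed
LINE_RE = re.compile(
    r"^\w+\s+(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s.*Ping port listener killed"
)


def parse_timestamps(lines):
    """Extract the timestamp of every 'Ping port listener killed' line."""
    stamps = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            stamps.append(
                datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
            )
    return stamps


def group_events(stamps, window=timedelta(seconds=5)):
    """Group timestamps that lie within `window` of the previous one.

    The 5-second window is an assumption chosen because the entries in
    the log above arrive in pairs one to two seconds apart.
    """
    events = []
    for ts in sorted(stamps):
        if events and ts - events[-1][-1] <= window:
            events[-1].append(ts)
        else:
            events.append([ts])
    return events


def summarize(events):
    """Return (event start, entries in event, gap since previous event)."""
    rows = []
    previous_start = None
    for event in events:
        start = event[0]
        gap = None if previous_start is None else start - previous_start
        rows.append((start, len(event), gap))
        previous_start = start
    return rows
```

Feeding the lines of `ambari-agent.log` through `parse_timestamps`, `group_events`, and `summarize` gives one row per event, which makes it easier to see whether the failures cluster around particular times of day.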