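I have two nameko services that communicate via RPC over RabbitMQ. Locally, with docker-compose, everything works fine. I then deployed everything to a Kubernetes/Istio cluster on DigitalOcean and started getting the error below. It keeps recurring, roughly once every 10/20/60 minutes. Communication between the services works fine (both before and after the reconnect, I think), but the logs are cluttered with these unexpected reconnections, which should not be happening.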
Helm RabbitMQ configuration file
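I tried increasing the RAM and CPU allocation (up to the values in the configuration file above: 512Mb and 400m), but the behavior is the same.
Note: after deploying I did not touch any of the services, send any messages, or make any requests, and the error first appeared after about 60 minutes. It then keeps showing up in the logs.
Nameko service logs: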
"Connection to broker lost, trying to re-establish connection...",
"exc_info": "Traceback (most recent call last):
File \"/usr/local/lib/python3.6/site-packages/kombu/mixins.py\", line 175, in run for _ in self.consume(limit=None, **kwargs):
File \"/usr/local/lib/python3.6/site-packages/kombu/mixins.py\", line 197, in consume conn.drain_events(timeout=safety_interval)
File \"/usr/local/lib/python3.6/site-packages/kombu/connection.py\", line 323, in drain_events
return self.transport.drain_events(self.connection, **kwargs)
File \"/usr/local/lib/python3.6/site-packages/kombu/transport/pyamqp.py\", line 103, in drain_events
return connection.drain_events(**kwargs)
File \"/usr/local/lib/python3.6/site-packages/amqp/connection.py\", line 505, in drain_events
while not self.blocking_read(timeout):
File \"/usr/local/lib/python3.6/site-packages/amqp/connection.py\", line 510, in blocking_read\n frame = self.transport.read_frame()
File \"/usr/local/lib/python3.6/site-packages/amqp/transport.py\", line 252, in read_frame
frame_header = read(7, True)
File \"/usr/local/lib/python3.6/site-packages/amqp/transport.py\", line 446, in _read
raise IOError('Server unexpectedly closed connection')
OSError: Server unexpectedly closed connection"}
{"name": "kombu.mixins", "asctime": "29/12/2019 20:22:54", "levelname": "INFO", "message": "Connected to amqp://user:**@rabbit-rabbitmq:5672//"}
RabbitMQ logs:
2019-12-29 20:22:54.563 [warning] <0.718.0> closing AMQP connection <0.718.0> (127.0.0.1:46504 -> 127.0.0.1:5672, vhost: '/', user: 'user'):
client unexpectedly closed TCP connection
2019-12-29 20:22:54.563 [warning] <0.705.0> closing AMQP connection <0.705.0> (127.0.0.1:46502 -> 127.0.0.1:5672, vhost: '/', user: 'user'):
client unexpectedly closed TCP connection
2019-12-29 20:22:54.681 [info] <0.3424.0> accepting AMQP connection <0.3424.0> (127.0.0.1:43466 -> 127.0.0.1:5672)
2019-12-29 20:22:54.689 [info] <0.3424.0> connection <0.3424.0> (127.0.0.1:43466 -> 127.0.0.1:5672): user 'user' authenticated and granted access to vhost '/'
2019-12-29 20:22:54.690 [info] <0.3431.0> accepting AMQP connection <0.3431.0> (127.0.0.1:43468 -> 127.0.0.1:5672)
2019-12-29 20:22:54.696 [info] <0.3431.0> connection <0.3431.0> (127.0.0.1:43468 -> 127.0.0.1:5672): user 'user' authenticated and granted access to vhost '/'
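Answer 0 (score: 3)
The problem was the Istio proxy, which had been injected as a sidecar container into the RabbitMQ pod. You need to exclude the RabbitMQ pod from Istio proxy injection; after that it works.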
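One way to opt a pod out of injection is the `sidecar.istio.io/inject: "false"` annotation on the pod template. The RabbitMQ log in the question shows every client connecting from 127.0.0.1, which is consistent with traffic arriving through a local proxy. The sketch below only builds a JSON merge patch carrying that annotation; the function name is hypothetical, and it assumes RabbitMQ runs from a workload with a standard `spec.template` pod template.

```python
import json

# Istio annotation that opts a pod out of automatic sidecar injection.
ISTIO_INJECT_ANNOTATION = "sidecar.istio.io/inject"


def no_sidecar_patch():
    """Return a merge patch that disables sidecar injection on a pod template."""
    return {
        "spec": {
            "template": {
                "metadata": {
                    # The value must be the string "false", not a boolean.
                    "annotations": {ISTIO_INJECT_ANNOTATION: "false"}
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the patch so it can be handed to `kubectl patch --type merge -p`.
    print(json.dumps(no_sidecar_patch()))
```

Assuming the script is saved as `no_sidecar_patch.py` and the workload is a StatefulSet named `rabbit-rabbitmq` (a guess based on the hostname in the logs), it could be applied with `kubectl patch statefulset rabbit-rabbitmq --type merge -p "$(python no_sidecar_patch.py)"`. If the Helm chart exposes a pod annotations value, setting the annotation there is probably the cleaner route.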
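Answer 1 (score: 0)
I think this is related to this. Try installing the netstat utility and running it to see whether there are too many connections in states other than ESTABLISHED.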
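To make that check easier to read, the output of `netstat -ant` can be tallied per connection state. This is a minimal sketch that assumes the usual Linux column layout, with the state in the last column; the sample text is made up for illustration and is not taken from the cluster in the question.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP connections per state in the text printed by `netstat -ant`."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows start with the protocol (tcp, tcp6); the state is the last column.
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[-1]] += 1
    return states


# Hypothetical sample output, for illustration only.
SAMPLE = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:5672            0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:5672          127.0.0.1:46504         ESTABLISHED
tcp        0      0 127.0.0.1:46502         127.0.0.1:5672          TIME_WAIT
tcp        0      0 127.0.0.1:46506         127.0.0.1:5672          TIME_WAIT
"""

if __name__ == "__main__":
    for state, count in count_tcp_states(SAMPLE).most_common():
        print(state, count)
```

A large and growing number of TIME_WAIT or CLOSE_WAIT entries compared with ESTABLISHED would suggest that connections are being torn down and reopened repeatedly.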
Also try adding these to your settings:
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 30
net.ipv4.tcp_keepalive_intvl = 10
net.ipv4.tcp_keepalive_probes = 4
net.ipv4.tcp_tw_reuse = 1
See this.
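Note that the three keepalive sysctls are kernel-wide defaults and only take effect on sockets that have SO_KEEPALIVE enabled. For comparison, here is a rough sketch of setting the same three values on a single socket from Python. The helper name is hypothetical, the per-socket options are platform-dependent, and this is not a description of how kombu or py-amqp configure their own sockets.

```python
import socket


def enable_keepalive(sock, idle=30, interval=10, probes=4):
    """Turn on TCP keepalive for one socket, mirroring the sysctl values above."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These per-socket knobs are not available on every platform, so guard each one.
    per_socket_options = (
        ("TCP_KEEPIDLE", idle),       # counterpart of tcp_keepalive_time
        ("TCP_KEEPINTVL", interval),  # counterpart of tcp_keepalive_intvl
        ("TCP_KEEPCNT", probes),      # counterpart of tcp_keepalive_probes
    )
    for name, value in per_socket_options:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        enable_keepalive(s)
        # A non-zero value means keepalive is switched on for this socket.
        print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
```

With these values an idle connection is probed after 30 seconds, then every 10 seconds, and is dropped after 4 unanswered probes.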
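Answer 2 (score: 0)
Have you tried increasing the heartbeat on the connection? Most likely your connection is being terminated at a lower level because of inactivity.
Also make sure you have enough resources to run all the containers on the host.
I had a similar problem, and I am not sure which of the following solved it for me:
Hope this gets you somewhere.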