How to resolve a WebSocket ping timeout

Asked: 2021-05-27 23:31:16

Tags: nginx kubernetes websocket django-channels daphne

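I have Django Channels (with Redis) served by Daphne, running behind an Nginx ingress controller that is proxied behind a load balancer, all set up in Kubernetes. The WebSocket is upgraded and everything works fine... for a few minutes. After 5-15 minutes (it varies), my Daphne logs (set to -v 2 for debugging) show: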

WARNING dropping connection to peer tcp4:10.2.0.163:43320 with abort=True: WebSocket ping timeout (peer did not respond with pong in time)

10.2.0.163 is the cluster IP address of my Nginx pod. Immediately afterwards, Nginx logs the following:

[error] 39#39: *18644 recv() failed (104: Connection reset by peer) while proxying upgraded connection [... + client real IP]

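After this, the WebSocket connection behaves strangely: the client can still send messages to the backend, but the same WebSocket connection in Django Channels no longer receives group messages, as if the channel had been unsubscribed from the group. I know my code works, because everything runs smoothly until the error is logged, so I suspect a misconfiguration somewhere is causing the problem. Sadly, I'm completely out of ideas. Here is my nginx ingress: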

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    acme.cert-manager.io/http01-edit-in-place: "true"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
    nginx.org/websocket-services: "daphne-svc"
  name: ingress
  namespace: default
spec:
  tls:
  - hosts:
    - mydomain
    secretName: letsencrypt-secret
  rules:
    - host: mydomain
      http:
        paths:
          - path: /
            backend:
              service:
                name: uwsgi-svc
                port:
                  number: 80            
            pathType: Prefix
          - path: /ws
            backend:
              service:
                name: daphne-svc
                port:
                  number: 80            
            pathType: Prefix 

Configured according to this and this. Installed with helm:

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install ngingress ingress-nginx/ingress-nginx

Here is my Django Channels consumer:

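import json

from channels.generic.websocket import AsyncWebsocketConsumer

# register_active_device, unregister_active_device, get_other_chat_members and
# ChatNotificationHandler are project helpers whose definitions are not shown.


class ChatConsumer(AsyncWebsocketConsumer):
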
    async def connect(self):
        user = self.scope['user']
        if user.is_authenticated: 
            self.inbox_group_name = "inbox-%s" % user.id


            device = self.scope.get('device', None)
            added = False
            if device:
                added = await register_active_device(user, device)
                
            if added:
                # Join inbox group
                await self.channel_layer.group_add(
                    self.inbox_group_name,
                    self.channel_name
                )

                
                await self.accept()
            else:
                await self.close()
        else:
            await self.close()

    
    async def disconnect(self, close_code):
        user = self.scope['user']
        device = self.scope.get('device', None)
        if device:
            await unregister_active_device(user, device)
        # Leave room group
        if hasattr(self, 'inbox_group_name'):
            await self.channel_layer.group_discard(
                self.inbox_group_name,
                self.channel_name
            )
            
        

    async def group_message(self, event):
        """Receive a message from the group and forward it to the client."""
        message = event['message']
        
        # Send message to WebSocket 
        await self.send(text_data=json.dumps(message))
        

    async def forward_message_to_other_members(self, chat, message, notification_fallback=False):    

        user = self.scope['user']
        other_members = await get_other_chat_members(chat, user)                
        for member in other_members:
            if member.active_devices_count > 0:
                #this will send the message to the user inbox; each consumer will handle it with the group_message method
                await self.channel_layer.group_send(
                    member.inbox.group_name,
                    {
                        'type': 'group_message',
                        'message': message
                    }
                )
            else:
                #no connection for this user, send a notification instead
                if notification_fallback:
                    await ChatNotificationHandler().send_chat_notification(chat, message, recipient=member, author=user)  

1 Answer:

Answer 0 (score: 1)

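I ended up adding an internal ping on the client and increasing the nginx timeout to 1 day, which changed the problem, but it also suggests that this may not be an nginx/daphne configuration issue.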
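
For illustration, a client-side ping of the kind described above could look roughly like the sketch below. It is not the code I actually used: the `heartbeat` function, the `{"type": "ping"}` message shape, and the `send` callable (standing in for whatever coroutine your WebSocket client exposes for sending text) are all assumptions made for the example.

```python
import asyncio
import json


async def heartbeat(send, interval_seconds, stop_event):
    """Send a JSON ping through `send` every `interval_seconds` until stopped.

    `send` is a hypothetical coroutine function that writes one text frame
    to the WebSocket; the {"type": "ping"} payload is an assumed format.
    """
    while not stop_event.is_set():
        await send(json.dumps({"type": "ping"}))
        try:
            # Sleep for one interval, but wake up early if asked to stop.
            await asyncio.wait_for(stop_event.wait(), timeout=interval_seconds)
        except asyncio.TimeoutError:
            pass  # Interval elapsed without a stop request: send the next ping.


async def demo():
    """Run the heartbeat briefly against a fake send function."""
    sent = []

    async def fake_send(text):
        sent.append(text)

    stop_event = asyncio.Event()
    task = asyncio.create_task(heartbeat(fake_send, 0.01, stop_event))
    await asyncio.sleep(0.05)
    stop_event.set()
    await task
    return sent
```

In a real client, `send` would be the WebSocket's own send method, and the consumer's receive handler would need to recognise and ignore (or answer) messages of this assumed ping type so they are not treated as chat messages.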