分片的Mongodb随机停滞

时间:2019-08-05 05:48:11

标签: mongodb azure kubernetes sharding

我在kuberenetes中使用哈希分片设置了Sharded MongoDB集群。我首先创建了配置服务器Replicaset,然后创建了2个分片副本集。最终创建了mongos以连接到分片集群。

我点击了以下链接来设置分片的MongoDB 点击https://docs.mongodb.com/manual/tutorial/deploy-sharded-cluster-hashed-sharding/

创建mongos之后,我为数据库启用了分片,并使用哈希分片策略对集合进行了分片。

完成所有这些设置之后,我就可以连接到mongos并将一些数据添加到数据库中的某些集合中,并能够检查数据在不同分片中的分布。

我面临的问题是,当尝试从java spring boot项目访问mongodb时,连接随机停止。但是,一旦为特定查询建立了连接,该特定查询就不会在接下来的几次尝试中停止闲置一段时间后,如果我尝试再次向mongodb发出请求,它将再次停止。

注意:MongoDB托管在“ DS2 v2” VM中,该群集具有4个节点。1个用于配置服务器,2个用于分片,1个用于mongos

  • 在其中一个链接中,他们要求为所有集合设置适当的分片密钥,这将对mongodb的性能产生影响。选择正确的分片密钥之前,需要考虑以下几点。 ,我在选择分片键之前已经考虑了所有这些因素。我通读了此链接以选择分片键-单击https://www.mongodb.com/blog/post/on-selecting-a-shard-key-for-mongodb

  • 我遇到的另一个解决方案是设置ShardingTaskExecutorPoolMaxConnecting并限制mongos节点将连接子添加到连接池的速率。我尝试将其设置为20、5、100、150,但没有一个解决了我面临的拖延问题。 这是链接-单击https://jira.mongodb.org/browse/SERVER-29237

  • 我尝试调整其他参数,例如ShardingTaskExecutorPoolMinSize和taskExecutorPoolSize。即使这样也无法解决停顿问题。

  • 我还将--serviceExecutor设置为自适应。

  • wiredTigerCacheSizeGB从0.25增加到2,这也对停滞问题没有任何影响

1)mongodb配置服务器的服务和部署的YAML文件是-

apiVersion: v1
items:
- apiVersion: v1
  kind: Service
  metadata:
    annotations:
      kompose.cmd: kompose convert -d -f docker-compose.yml -o azure-deployment.yaml
      kompose.version: 1.12.0 (0ab07be)
    creationTimestamp: null
    labels:
      io.kompose.service: mongo-conf-service
    name: mongo-conf-service
  spec:
    type: LoadBalancer
    ports:
    - name: "27017"
      port: 27017
      targetPort: 27017
    selector:
      io.kompose.service: mongo-conf-service
  status:
    loadBalancer: {}
- apiVersion: extensions/v1beta1
  kind: Deployment
  metadata:
    annotations:
      kompose.cmd: kompose convert -d -f docker-compose.yml -o azure-deployment.yaml
      kompose.version: 1.12.0 (0ab07be)
    creationTimestamp: null
    labels:
      io.kompose.service: mongo-conf-service
    name: mongo-conf-service
  spec:
    replicas: 1
    strategy: {}
    template:
      metadata:
        creationTimestamp: null
        labels:
          io.kompose.service: mongo-conf-service
      spec:
        containers:
        - env:
          - name: MONGO_INITDB_ROOT_USERNAME
            value: #Username
          - name: MONGO_INITDB_ROOT_PASSWORD
            value: #Password
          command:
          - "mongod"
          - "--storageEngine"
          - "wiredTiger"
          - "--port"
          - "27017"
          - "--bind_ip"
          - "0.0.0.0"
          - "--wiredTigerCacheSizeGB"
          - "2"
          - "--configsvr"
          - "--replSet"
          - "ConfigDBRepSet"
          image: #MongoImageName
          name: mongo-conf-service
          ports:
          - containerPort: 27017
          resources: {}
          volumeMounts:
          - name: mongo-conf
            mountPath: /data/db
        restartPolicy: Always
        volumes:
          - name: mongo-conf
            persistentVolumeClaim:
              claimName: mongo-conf

2)Shard mongodb的服务和部署的YAML文件是-

apiVersion: v1
items:
- apiVersion: v1
  kind: Service
  metadata:
    annotations:
      kompose.cmd: kompose convert -d -f docker-compose.yml -o azure-deployment.yaml
      kompose.version: 1.12.0 (0ab07be)
    creationTimestamp: null
    labels:
      io.kompose.service: mongo-shard
    name: mongo-shard
  spec:
    type: LoadBalancer
    ports:
    - name: "27017"
      port: 27017
      targetPort: 27017
    selector:
      io.kompose.service: mongo-shard
  status:
    loadBalancer: {}
- apiVersion: extensions/v1beta1
  kind: Deployment
  metadata:
    annotations:
      kompose.cmd: kompose convert -d -f docker-compose.yml -o azure-deployment.yaml
      kompose.version: 1.12.0 (0ab07be)
    creationTimestamp: null
    labels:
      io.kompose.service: mongo-shard
    name: mongo-shard
  spec:
    replicas: 1
    strategy: {}
    template:
      metadata:
        creationTimestamp: null
        labels:
          io.kompose.service: mongo-shard
      spec:
        containers:
        - env:
          - name: MONGO_INITDB_ROOT_USERNAME
            value: #Username
          - name: MONGO_INITDB_ROOT_PASSWORD
            value: #Password
          command:
          - "mongod"
          - "--storageEngine"
          - "wiredTiger"
          - "--port"
          - "27017"
          - "--bind_ip"
          - "0.0.0.0"
          - "--wiredTigerCacheSizeGB"
          - "2"
          - "--shardsvr"
          - "--replSet"
          - "Shard1RepSet"
          image: #MongoImage
          name: mongo-shard
          ports:
          - containerPort: 27017
          resources: {}

3)mongos服务器的YAML文件:

apiVersion: v1
items:
- apiVersion: v1
  kind: Service
  metadata:
    annotations:
      kompose.cmd: kompose convert -d -f docker-compose.yml -o azure-deployment.yaml
      kompose.version: 1.12.0 (0ab07be)
    creationTimestamp: null
    labels:
      io.kompose.service: mongos-service
    name: mongos-service
  spec:
    type: LoadBalancer
    ports:
    - name: "27017"
      port: 27017
      targetPort: 27017
    selector:
      io.kompose.service: mongos-service
  status:
    loadBalancer: {}
- apiVersion: extensions/v1beta1
  kind: Deployment
  metadata:
    annotations:
      kompose.cmd: kompose convert -d -f docker-compose.yml -o azure-deployment.yaml
      kompose.version: 1.12.0 (0ab07be)
    creationTimestamp: null
    labels:
      io.kompose.service: mongos-service
    name: mongos-service
  spec:
    replicas: 1
    strategy: {}
    template:
      metadata:
        creationTimestamp: null
        labels:
          io.kompose.service: mongos-service
      spec:
        containers:
        - env:
          - name: MONGO_INITDB_ROOT_USERNAME
            value: #USername
          - name: MONGO_INITDB_ROOT_PASSWORD
            value: #Password
          command:
            - "numactl"
            - "--interleave=all"
            - "mongos"
            - "--port"
            - "27017"
            - "--bind_ip"
            - "0.0.0.0"
            - "--configdb"
            - "ConfigDBRepSet/mongo-conf-service:27017"
          image: #MongoImageName
          name: mongos-service
          ports:
          - containerPort: 27017
          resources: {}
  • mongos服务器的日志为:
2019-08-05T05:27:52.942+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:5058 #308807 (79 connections now open)
2019-08-05T05:27:52.964+0000 I ACCESS   [conn308807] Successfully authenticated as principal Assist_Random_Workspace on Random_Workspace from client 10.0.0.0:5058
2019-08-05T05:27:54.267+0000 I NETWORK  [worker-3] end connection 10.0.0.0:52954 (78 connections now open)
2019-08-05T05:27:54.269+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:52988 #308808 (79 connections now open)
2019-08-05T05:27:54.275+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:7174 #308809 (80 connections now open)
2019-08-05T05:27:54.279+0000 I ACCESS   [conn308809] SASL SCRAM-SHA-1 authentication failed for Assist_Refactored_Code_DB on Refactored_Code_DB from client 10.0.0.:7174 ; UserNotFound: User "Assist_Refactored_Code_DB@Refactored_Code_DB" not found
2019-08-05T05:27:54.281+0000 I NETWORK  [worker-1] end connection 10.0.0.5:7174 (79 connections now open)
2019-08-05T05:27:54.342+0000 I NETWORK  [worker-1] end connection 10.0.0.6:57391 (78 connections now open)
2019-08-05T05:27:54.343+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:57527 #308810 (79 connections now open)
2019-08-05T05:27:55.080+0000 I NETWORK  [worker-3] end connection 10.0.0.0:56021 (78 connections now open)
2019-08-05T05:27:55.081+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:56057 #308811 (79 connections now open)
2019-08-05T05:27:56.054+0000 I NETWORK  [worker-1] end connection 10.0.0.0:59137 (78 connections now open)
2019-08-05T05:27:56.055+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:59184 #308812 (79 connections now open)
2019-08-05T05:27:59.268+0000 I NETWORK  [worker-1] end connection 10.0.0.5:52988 (78 connections now open)
2019-08-05T05:27:59.270+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:53047 #308813 (79 connections now open)
2019-08-05T05:27:59.343+0000 I NETWORK  [worker-3] end connection 10.0.0.6:57527 (78 connections now open)
2019-08-05T05:27:59.344+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:57672 #308814 (79 connections now open)
2019-08-05T05:28:00.080+0000 I NETWORK  [worker-3] end connection 10.0.1.1:56057 (78 connections now open)
2019-08-05T05:28:00.081+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:56116 #308815 (79 connections now open)
2019-08-05T05:28:01.054+0000 I NETWORK  [worker-3] end connection 10.0.0.0:59184 (78 connections now open)
2019-08-05T05:28:01.058+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:59225 #308816 (79 connections now open)
2019-08-05T05:28:01.763+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:7173 #308817 (80 connections now open)
2019-08-05T05:28:01.768+0000 I ACCESS   [conn308817] SASL SCRAM-SHA-1 authentication failed for Assist_Sharded_Database on Sharded_Database from client 10.0.0.0:7173 ; UserNotFound: User "Assist_Sharded_Database@Sharded_Database" not found
2019-08-05T05:28:01.770+0000 I NETWORK  [worker-3] end connection 10.0.0.0:7173 (79 connections now open)
2019-08-05T05:28:04.271+0000 I NETWORK  [worker-3] end connection 10.0.0.0:53047 (78 connections now open)
2019-08-05T05:28:04.272+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:53083 #308818 (79 connections now open)
2019-08-05T05:28:04.283+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:7105 #308819 (80 connections now open)
2019-08-05T05:28:04.287+0000 I ACCESS   [conn308819] SASL SCRAM-SHA-1 authentication failed for Assist_Refactored_Code_DB on Refactored_Code_DB from client 10.0.0.0:7105 ; UserNotFound: User "Assist_Refactored_Code_DB@Refactored_Code_DB" not found

在上面的日志中,对Assist_Refactored_Code_DB的身份验证存在错误(该数据库不是由我创建的)。我不确定此身份验证失败的原因以及应该在哪个mongo URI中提及用户名和密码。不知道这是否是停顿的原因之一。 这是我在mongos中可以找到的唯一错误日志。配置服务器和shard mongo中的所有其他日志都没有任何错误。

Shard1Repset的日志是:

019-08-06T10:48:08.926+0000 I NETWORK  [listener] connection accepted from 10.0.0.4:58010 #782186 (10 connections now open)
2019-08-06T10:48:11.585+0000 I NETWORK  [conn782183] end connection 10.0.0.0:64938 (9 connections now open)
2019-08-06T10:48:11.586+0000 I NETWORK  [listener] connection accepted from 10.0.0.7:64989 #782187 (10 connections now open)
2019-08-06T10:48:11.765+0000 I NETWORK  [conn782184] end connection 10.0.0.0:62126 (9 connections now open)
2019-08-06T10:48:11.766+0000 I NETWORK  [listener] connection accepted from 10.0.0.6:62302 #782188 (10 connections now open)
2019-08-06T10:48:13.763+0000 I NETWORK  [conn782185] end connection 10.0.0.0:52907 (9 connections now open)
2019-08-06T10:48:13.763+0000 I NETWORK  [listener] connection accepted from 10.0.0.1:52947 #782189 (10 connections now open)
2019-08-06T10:48:13.926+0000 I NETWORK  [conn782186] end connection 10.0.0.0:58010 (9 connections now open)
2019-08-06T10:48:13.927+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:58051 #782190 (10 connections now open)
2019-08-06T10:48:16.586+0000 I NETWORK  [conn782187] end connection 10.0.0.0:64989 (9 connections now open)
2019-08-06T10:48:16.587+0000 I NETWORK  [listener] connection accepted from 10.0.0.0:65054 #782191 (10 connections now open)
2019-08-06T10:48:16.766+0000 I NETWORK  [conn782188] end connection 10.0.0.6:62302 (9 connections now open)
2019-08-06T10:48:16.767+0000 I NETWORK  [listener] connection accepted from 10.0.0.6:62445 #782192 (10 connections now open)
2019-08-06T10:48:18.765+0000 I NETWORK  [conn782189] end connection 10.0.2.1:52947 (9 connections now open)
2019-08-06T10:48:18.765+0000 I NETWORK  [listener] connection accepted from 10.0.2.1:52989 #782193 (10 connections now open)
2019-08-06T10:48:18.927+0000 I NETWORK  [conn782190] end connection 10.0.0.4:58051 (9 connections now open)
2019-08-06T10:48:18.929+0000 I NETWORK  [listener] connection accepted from 10.0.0.4:58100 #782194 (10 connections now open)
2019-08-06T10:48:21.588+0000 I NETWORK  [conn782191] end connection 10.0.0.7:65054 (9 connections now open)
2019-08-06T10:48:21.589+0000 I NETWORK  [listener] connection accepted from 10.0.0.7:65105 #782195 (10 connections now open)
2019-08-06T10:48:21.767+0000 I NETWORK  [conn782192] end connection 10.0.0.6:62445 (9 connections now open)
2019-08-06T10:48:21.768+0000 I NETWORK  [listener] connection accepted from 10.0.0.6:62581 #782196 (10 connections now open)
2019-08-06T10:48:23.766+0000 I NETWORK  [conn782193] end connection 10.0.2.1:52989 (9 connections now open)
2019-08-06T10:48:23.766+0000 I NETWORK  [listener] connection accepted from 10.0.2.1:53030 #782197 (10 connections now open)
2019-08-06T10:48:23.928+0000 I NETWORK  [conn782194] end connection 10.0.0.4:58100 (9 connections now open)
2019-08-06T10:48:23.930+0000 I NETWORK  [listener] connection accepted from 10.0.0.4:58145 #782198 (10 connections now open)
2019-08-06T10:48:26.589+0000 I NETWORK  [conn782195] end connection 10.0.0.7:65105 (9 connections now open)
2019-08-06T10:48:26.590+0000 I NETWORK  [listener] connection accepted from 10.0.0.7:65148 #782199 (10 connections now open)
2019-08-06T10:48:26.768+0000 I NETWORK  [conn782196] end connection 10.0.0.6:62581 (9 connections now open)
2019-08-06T10:48:26.770+0000 I NETWORK  [listener] connection accepted from 10.0.0.6:62746 #782200 (10 connections now open)
2019-08-06T10:48:28.766+0000 I NETWORK  [conn782197] end connection 10.0.2.1:53030 (9 connections now open)
2019-08-06T10:48:28.767+0000 I NETWORK  [listener] connection accepted from 10.0.2.1:53081 #782201 (10 connections now open)
2019-08-06T10:48:28.930+0000 I NETWORK  [conn782198] end connection 10.0.0.4:58145 (9 connections now open)
2019-08-06T10:48:28.931+0000 I NETWORK  [listener] connection accepted from 10.0.0.4:58217 #782202 (10 connections now open)
2019-08-06T10:48:31.590+0000 I NETWORK  [conn782199] end connection 10.0.0.7:65148 (9 connections now open)

ConfigDBRepSet的日志为:

2019-08-06T10:52:18.962+0000 I NETWORK  [conn781553] end connection 10.0.0.4:60257 (10 connections now open)
2019-08-06T10:52:18.963+0000 I NETWORK  [listener] connection accepted from 10.0.0.4:60306 #781557 (11 connections now open)
2019-08-06T10:52:21.296+0000 I NETWORK  [conn781554] end connection 10.0.0.7:50910 (10 connections now open)
2019-08-06T10:52:21.297+0000 I NETWORK  [listener] connection accepted from 10.0.0.7:50956 #781558 (11 connections now open)
2019-08-06T10:52:22.380+0000 I NETWORK  [conn781555] end connection 10.0.0.5:54999 (10 connections now open)
2019-08-06T10:52:22.381+0000 I NETWORK  [listener] connection accepted from 10.0.0.5:55043 #781559 (11 connections now open)
2019-08-06T10:52:22.554+0000 I NETWORK  [conn781556] end connection 10.0.3.1:57125 (10 connections now open)
2019-08-06T10:52:22.555+0000 I NETWORK  [listener] connection accepted from 10.0.3.1:57258 #781560 (11 connections now open)
2019-08-06T10:52:23.963+0000 I NETWORK  [conn781557] end connection 10.0.0.4:60306 (10 connections now open)
2019-08-06T10:52:23.964+0000 I NETWORK  [listener] connection accepted from 10.0.0.4:60341 #781561 (11 connections now open)
2019-08-06T10:52:26.298+0000 I NETWORK  [conn781558] end connection 10.0.0.7:50956 (10 connections now open)
2019-08-06T10:52:26.299+0000 I NETWORK  [listener] connection accepted from 10.0.0.7:50998 #781562 (11 connections now open)
2019-08-06T10:52:27.382+0000 I NETWORK  [conn781559] end connection 10.0.0.5:55043 (10 connections now open)
2019-08-06T10:52:27.383+0000 I NETWORK  [listener] connection accepted from 10.0.0.5:55086 #781563 (11 connections now open)
2019-08-06T10:52:27.555+0000 I NETWORK  [conn781560] end connection 10.0.3.1:57258 (10 connections now open)
2019-08-06T10:52:27.556+0000 I NETWORK  [listener] connection accepted from 10.0.3.1:57415 #781564 (11 connections now open)
2019-08-06T10:52:28.964+0000 I NETWORK  [conn781561] end connection 10.0.0.4:60341 (10 connections now open)
2019-08-06T10:52:28.965+0000 I NETWORK  [listener] connection accepted from 10.0.0.4:60406 #781565 (11 connections now open)
2019-08-06T10:52:31.299+0000 I NETWORK  [conn781562] end connection 10.0.0.7:50998 (10 connections now open)
2019-08-06T10:52:31.300+0000 I NETWORK  [listener] connection accepted from 10.0.0.7:51043 #781566 (11 connections now open)
2019-08-06T10:52:32.383+0000 I NETWORK  [conn781563] end connection 10.0.0.5:55086 (10 connections now open)
2019-08-06T10:52:32.384+0000 I NETWORK  [listener] connection accepted from 10.0.0.5:55136 #781567 (11 connections now open)
2019-08-06T10:52:32.556+0000 I NETWORK  [conn781564] end connection 10.0.3.1:57415 (10 connections now open)
2019-08-06T10:52:32.556+0000 I NETWORK  [listener] connection accepted from 10.0.3.1:57535 #781568 (11 connections now open)
2019-08-06T10:52:33.966+0000 I NETWORK  [conn781565] end connection 10.0.0.4:60406 (10 connections now open)
2019-08-06T10:52:33.967+0000 I NETWORK  [listener] connection accepted from 10.0.0.4:60461 #781569 (11 connections now open)

sh.status()的输出:

--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("5d3a7c7d035b4525a7de5eaa")
  }
  shards:
        {  "_id" : "Shard1RepSet",  "host" : "Shard1RepSet/94.245.111.162:27017",  "state" : 1 }
        {  "_id" : "Shard2RepSet",  "host" : "Shard2RepSet/13.74.42.35:27017",  "state" : 1 }
  active mongoses:
        "4.0.10" : 1
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Failed balancer rounds in last 5 attempts:  0
        Migration Results for the last 24 hours: 
                2 : Success
  databases:
#Databases sharding Information

我希望分片的mongodb不会在任何时间停止运行,并且工作方式类似于独立的mongodb。

有人可以指导我解决mongodb分片问题的拖延吗?

1 个答案:

答案 0 :(得分:0)

首先,如果使用dockerhub提供的mongo图像,则应同时指定envcommand,因为command会覆盖Enterypoint,在这种情况下,负责处理用户名和密码creation,因此它将不起作用。选中The command field corresponds to entrypoint in some container runtimes