解决了Elasticsearch 7.2集群遇到未分配的分片的问题

时间:2019-07-19 06:45:14

标签: elasticsearch cluster-computing sharding

我想使用7.2版本构建三个节点的Elasticsearch集群,但是出乎意料。

我有三个虚拟机:192.168.7.2、192.168.7.3、192.168.7.4,它们的主要配置位于select to_char(to_date(t2.M_DATE, 'DD-MON-YY'),'IW') week1, round(avg(t1.my_data),0) md from m_table t1 JOIN M_TABLE_MASTER t2 ON t1.b_no = t2.b_no where t2.M_DATE >'01-JAN-19' group by to_char(to_date(t2.M_DATE, 'DD-MON-YY'),'IW') order by week1; 中:

  • 192.168.7.2:
config/elasticsearch.yml
  • 192.168.7.3:
cluster.name: ucas
node.name: node-2
network.host: 192.168.7.2
http.port: 9200
discovery.seed_hosts: ["192.168.7.2", "192.168.7.3", "192.168.7.4"]
cluster.initial_master_nodes: ["node-2", "node-3", "node-4"]
http.cors.enabled: true
http.cors.allow-origin: "*"
  • 192.168.7.4:
cluster.name: ucas
node.name: node-3
network.host: 192.168.7.3
http.port: 9200
discovery.seed_hosts: ["192.168.7.2", "192.168.7.3", "192.168.7.4"]
cluster.initial_master_nodes: ["node-2", "node-3", "node-4"]

当我启动每个节点时,创建一个名为movie的索引,其中包含3个分片和0个副本,然后将一些文档写入索引,群集看起来很正常:

cluster.name: ucas
node.name: node-4
network.host: 192.168.7.4
http.port: 9200
discovery.seed_hosts: ["192.168.7.2", "192.168.7.3", "192.168.7.4"]
cluster.initial_master_nodes: ["node-2", "node-3", "node-4"]

enter image description here

然后,将PUT moive { "settings": { "number_of_shards": 3, "number_of_replicas": 0 } } PUT moive/_doc/3 { "title":"title 3" } 副本设置为1:

movie

enter image description here

一切正常,但是当我将PUT moive/_settings { "number_of_replicas": 1 } 副本设置为2时:

movie

enter image description here

不能将新副本分配给node2。

我不知道哪一步不正确,请帮忙谈谈。

1 个答案:

答案 0 :(得分:0)

首先使用说明命令找出无法分配分片的原因:


GET _cluster/allocation/explain?pretty



{
  "index" : "moive",
  "shard" : 2,
  "primary" : false,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "NODE_LEFT",
    "at" : "2019-07-19T06:47:29.704Z",
    "details" : "node_left [tIm8GrisRya8jl_n9lc3MQ]",
    "last_allocation_status" : "no_attempt"
  },
  "can_allocate" : "no",
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions" : [
    {
      "node_id" : "kQ0Noq8LSpyEcVDF1POfJw",
      "node_name" : "node-3",
      "transport_address" : "192.168.7.3:9300",
      "node_attributes" : {
        "ml.machine_memory" : "5033172992",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true"
      },
      "node_decision" : "no",
      "store" : {
        "matching_sync_id" : true
      },
      "deciders" : [
        {
          "decider" : "same_shard",
          "decision" : "NO",
          "explanation" : "the shard cannot be allocated to the same node on which a copy of the shard already exists [[moive][2], node[kQ0Noq8LSpyEcVDF1POfJw], [R], s[STARTED], a[id=Ul73SPyaTSyGah7Yl3k2zA]]"
        }
      ]
    },
    {
      "node_id" : "mNpqD9WPRrKsyntk2GKHMQ",
      "node_name" : "node-4",
      "transport_address" : "192.168.7.4:9300",
      "node_attributes" : {
        "ml.machine_memory" : "5033172992",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true"
      },
      "node_decision" : "no",
      "store" : {
        "matching_sync_id" : true
      },
      "deciders" : [
        {
          "decider" : "same_shard",
          "decision" : "NO",
          "explanation" : "the shard cannot be allocated to the same node on which a copy of the shard already exists [[moive][2], node[mNpqD9WPRrKsyntk2GKHMQ], [P], s[STARTED], a[id=yQo1HUqoSdecD-SZyYMYfg]]"
        }
      ]
    },
    {
      "node_id" : "tIm8GrisRya8jl_n9lc3MQ",
      "node_name" : "node-2",
      "transport_address" : "192.168.7.2:9300",
      "node_attributes" : {
        "ml.machine_memory" : "5033172992",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true"
      },
      "node_decision" : "no",
      "deciders" : [
        {
          "decider" : "disk_threshold",
          "decision" : "NO",
          "explanation" : "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], using more disk space than the maximum allowed [85.0%], actual free: [2.2790256709451573E-4%]"
        }
      ]
    }
  ]
}

我们可以看到节点2的磁盘空间已满:

[vagrant@node2 ~]$ df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root  8.4G  8.0G  480M  95% /
devtmpfs                 2.4G     0  2.4G   0% /dev
tmpfs                    2.4G     0  2.4G   0% /dev/shm
tmpfs                    2.4G  8.4M  2.4G   1% /run
tmpfs                    2.4G     0  2.4G   0% /sys/fs/cgroup
/dev/sda1                497M  118M  379M  24% /boot
none                     234G  149G   86G  64% /vagrant

然后我清理磁盘空间,一切恢复正常: enter image description here