Question

This concerns Elasticsearch 1.7.2 on CentOS.

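We had a 3-node cluster that was running fine. A network problem caused the "B" node to lose network access. (It then turned out that the "C" node had "minimum_master_nodes" set to 1, not 2.)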

So we are now limping along on the A node alone.

We fixed the problems on the B and C nodes, but they refuse to come up and join the cluster. On B and C:

# curl -XGET http://localhost:9200/_cluster/health?pretty=true
{
  "error" : "MasterNotDiscoveredException[waited for [30s]]",
  "status" : 503
}
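
The same check can be scripted so that every node is polled in one pass. What follows is a minimal sketch in Python, assuming each node serves HTTP on port 9200; the helper names `summarize_health` and `fetch_health` are hypothetical and not part of the original post. It relies on the fact that the 503 response still carries the JSON body shown above.

```python
import json
import urllib.error
import urllib.request


def summarize_health(body):
    """Turn a _cluster/health response body into a one-line summary."""
    data = json.loads(body)
    if "error" in data:
        # Error shape as shown above: {"error": "...", "status": 503}
        return "no master: {} (HTTP {})".format(data["error"], data.get("status"))
    return "{}: {} node(s), {} unassigned shard(s)".format(
        data.get("status"),
        data.get("number_of_nodes"),
        data.get("unassigned_shards"),
    )


def fetch_health(host, port=9200, timeout=35):
    """Fetch cluster health from one node and summarize it.

    The timeout is slightly longer than the 30s the node waits for a master.
    """
    url = "http://{}:{}/_cluster/health".format(host, port)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return summarize_health(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # A 503 raises HTTPError, but the JSON body is still readable.
        return summarize_health(err.read().decode("utf-8"))
    except OSError as err:
        # Covers URLError, refused connections and socket timeouts.
        return "unreachable: {}".format(err)
```

Calling `fetch_health("192.168.3.100")` for each address in the unicast host list gives a quick side-by-side view of which nodes see a master and which do not.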

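elasticsearch.yml is as follows (on the "b" and "c" nodes the name reflects the node name on those systems; also, the IP addresses on each node point at the other 2 nodes. However, on the "c" node, index.number_of_replicas was mistakenly set to 1.)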

cluster.name: elasticsearch-prod

node.name: "PROD-node-3a"

node.master: true

index.number_of_replicas: 2

discovery.zen.minimum_master_nodes: 2

discovery.zen.ping.multicast.enabled: false

discovery.zen.ping.unicast.hosts: ["192.168.3.100", "192.168.3.101"]
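
For context, the value 2 for `discovery.zen.minimum_master_nodes` follows the usual quorum rule of (master-eligible nodes / 2) + 1. A small sketch of that arithmetic in Python; both function names are hypothetical, and the split-brain check assumes the same setting on every node, which was not the case here since one node carried a different value.

```python
def minimum_master_nodes(master_eligible):
    """Quorum of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def can_split_brain(master_eligible, configured):
    """True if two disjoint groups of nodes could each elect a master.

    Two groups can each reach the configured minimum only if
    2 * configured <= master_eligible.
    """
    return 2 * configured <= master_eligible
```

With three master-eligible nodes the quorum is 2, and a setting of 1 allows a single isolated node to elect itself master.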

We have no idea why they won't join. They have network visibility to A, and A can see them. Each node correctly has the other two nodes defined in "discovery.zen.ping.unicast.hosts:".

On B and C, the log is very sparse and tells us nothing:

# cat elasticsearch.log
[2015-09-24 20:07:46,686][INFO ][node                     ] [The Profile] version[1.7.2], pid[866], build[e43676b/2015-09-14T09:49:53Z]
[2015-09-24 20:07:46,688][INFO ][node                     ] [The Profile] initializing ...
[2015-09-24 20:07:46,931][INFO ][plugins                  ] [The Profile] loaded [], sites []
[2015-09-24 20:07:47,054][INFO ][env                      ] [The Profile] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [148.7gb], net total_space [157.3gb], types [rootfs]
[2015-09-24 20:07:50,696][INFO ][node                     ] [The Profile] initialized
[2015-09-24 20:07:50,697][INFO ][node                     ] [The Profile] starting ...
[2015-09-24 20:07:50,942][INFO ][transport                ] [The Profile] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.181.3.138:9300]}
[2015-09-24 20:07:50,983][INFO ][discovery                ] [The Profile] elasticsearch/PojoIp-ZTXufX_Lxlwvdew
[2015-09-24 20:07:54,772][INFO ][cluster.service          ] [The Profile] new_master [The Profile][PojoIp-ZTXufX_Lxlwvdew][elastic-search-3c-prod-centos-case-48307][inet[/10.181.3.138:9300]], reason: zen-disco-join (elected_as_master)
[2015-09-24 20:07:54,801][INFO ][http                     ] [The Profile] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.181.3.138:9200]}
[2015-09-24 20:07:54,802][INFO ][node                     ] [The Profile] started
[2015-09-24 20:07:54,880][INFO ][gateway                  ] [The Profile] recovered [0] indices into cluster_state
[2015-09-24 20:42:45,691][INFO ][node                     ] [The Profile] stopping ...
[2015-09-24 20:42:45,727][INFO ][node                     ] [The Profile] stopped
[2015-09-24 20:42:45,727][INFO ][node                     ] [The Profile] closing ...
[2015-09-24 20:42:45,735][INFO ][node                     ] [The Profile] closed

How do we bring the whole beast back to life?

Restarting B and C makes no difference at all.
I am hesitant to cycle A, since that is the node our apps are hitting...

Answer 1

Well, we don't know exactly what brought it back, but it sort of magically came back up.

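I believe the shard rerouting (shown here: elasticsearch: Did I lose data when two of my three nodes went down?) caused the nodes to rejoin the cluster. Our theory is that node A, the only surviving node, was not a "healthy" master, because it knew that one shard (the "p" cut of shard 1, as spelled out here: elasticsearch: Did I lose data when two of my three nodes went down?) was not allocated.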

Since the master knew it was not whole, the other nodes declined to join the cluster, throwing "MasterNotDiscoveredException".

Once we got all the "p" shards assigned to the surviving A node, the other nodes joined up and did the whole replication dance.
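
For readers who have not used it, the reroute step is a POST to `/_cluster/reroute` with an `allocate` command per shard. The following Python sketch only builds such a request and does not send it; the index name `my-index` in the usage note and both helper names are hypothetical, and the exact commands that were run on this cluster are not recorded here.

```python
import json
import urllib.request


def build_allocate_command(index, shard, node, allow_primary=False):
    """Build one 'allocate' command for the Elasticsearch 1.x reroute API.

    allow_primary=True lets an unassigned primary be allocated even when
    no copy of its data survives, which can mean data loss.
    """
    return {
        "allocate": {
            "index": index,
            "shard": shard,
            "node": node,
            "allow_primary": allow_primary,
        }
    }


def build_reroute_request(host, commands, port=9200):
    """Wrap the commands in a POST request to /_cluster/reroute (not sent)."""
    body = json.dumps({"commands": commands}).encode("utf-8")
    return urllib.request.Request(
        "http://{}:{}/_cluster/reroute".format(host, port),
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

For example, `build_reroute_request("localhost", [build_allocate_command("my-index", 1, "PROD-node-3a", allow_primary=True)])` yields a request that could be passed to `urllib.request.urlopen` once the shard number and node name have been checked against the cluster state.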

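However, allocating the shards this way loses data. We ended up setting up a new cluster and are rebuilding the indices (which will take several days).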
