Elasticsearch master node keeps connecting and disconnecting

Time: 2015-11-10 14:05:23

Tags: elasticsearch rackspace

I keep getting these error messages in my logs:

[2015-11-10 13:52:03,037][WARN ][discovery.zen.ping.unicast] [ClusterUK Node 1] [11] failed send ping to [ClusterUK Node 1][x-eBYFoiRemOBK7egMHTRg][elasticuk1][inet[/172.24.32.10:9300]]{master=true}
org.elasticsearch.ElasticsearchIllegalStateException: can't add nodes to a stopped transport
    at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:746)
    at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:731)
    at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:216)
    at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing$3.run(UnicastZenPing.java:376)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
[2015-11-10 13:52:03,038][WARN ][discovery.zen.ping.unicast] [ClusterUK Node 1] [12] failed send ping to [ClusterUK Node 1][x-eBYFoiRemOBK7egMHTRg][elasticuk1][inet[/172.24.32.10:9300]]{master=true}
org.elasticsearch.ElasticsearchIllegalStateException: can't add nodes to a stopped transport
    at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:746)
    at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:731)
    at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:216)
    at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing$3.run(UnicastZenPing.java:376)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
[2015-11-10 13:52:03,038][WARN ][discovery.zen.ping.unicast] [ClusterUK Node 1] [12] failed send ping to [ClusterUK Node 1][x-eBYFoiRemOBK7egMHTRg][elasticuk1][inet[/172.24.32.10:9300]]{master=true}
org.elasticsearch.ElasticsearchIllegalStateException: can't add nodes to a stopped transport
    at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:746)
    at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:731)
    at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:216)
    at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing$3.run(UnicastZenPing.java:376)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
[2015-11-10 13:52:11,378][INFO ][transport                ] [ClusterUK Node 1] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/172.24.32.10:9300]}
[2015-11-10 13:52:11,394][INFO ][discovery                ] [ClusterUK Node 1] ClusterUK/FTiLxRmZQLyFtyap8JTj2w
[2015-11-10 13:52:14,498][INFO ][cluster.service          ] [ClusterUK Node 1] detected_master [ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true}, added {[ClusterUK Client Node STG1][_JfbrXjFTzGD7BL7OTqbVA][Staging1][inet[/192.168.100.248:9300]]{data=false, master=false},[ClusterUK Node 3][rHJ486YyQHqKytG44fmC7g][elasticuk3][inet[/172.24.32.8:9300]]{master=true},[ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true},}, reason: zen-disco-receive(from master [[ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true}])
[2015-11-10 13:52:14,749][INFO ][http                     ] [ClusterUK Node 1] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/172.24.32.10:9200]}
[2015-11-10 13:52:14,750][INFO ][node                     ] [ClusterUK Node 1] started
[2015-11-10 13:52:44,994][INFO ][discovery.zen            ] [ClusterUK Node 1] master_left [[ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true}], reason [do not exists on master, act as master failure]
[2015-11-10 13:52:44,996][WARN ][discovery.zen            ] [ClusterUK Node 1] master left (reason = do not exists on master, act as master failure), current nodes: {[ClusterUK Client Node STG1][_JfbrXjFTzGD7BL7OTqbVA][Staging1][inet[/192.168.100.248:9300]]{data=false, master=false},[ClusterUK Node 1][FTiLxRmZQLyFtyap8JTj2w][elasticuk1][inet[elasticuk1/172.24.32.10:9300]]{master=true},[ClusterUK Node 3][rHJ486YyQHqKytG44fmC7g][elasticuk3][inet[/172.24.32.8:9300]]{master=true},}
[2015-11-10 13:52:44,996][INFO ][cluster.service          ] [ClusterUK Node 1] removed {[ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true},}, reason: zen-disco-master_failed ([ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true})
[2015-11-10 13:52:48,047][INFO ][cluster.service          ] [ClusterUK Node 1] detected_master [ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true}, added {[ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true},}, reason: zen-disco-receive(from master [[ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true}])
[2015-11-10 13:53:10,689][INFO ][cluster.service          ] [ClusterUK Node 1] removed {[ClusterUK Node 3][rHJ486YyQHqKytG44fmC7g][elasticuk3][inet[/172.24.32.8:9300]]{master=true},}, reason: zen-disco-receive(from master [[ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true}])
[2015-11-10 13:53:13,199][INFO ][cluster.service          ] [ClusterUK Node 1] added {[ClusterUK Node 3][rHJ486YyQHqKytG44fmC7g][elasticuk3][inet[/172.24.32.8:9300]]{master=true},}, reason: zen-disco-receive(from master [[ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true}])
[2015-11-10 13:53:35,963][INFO ][discovery.zen            ] [ClusterUK Node 1] master_left [[ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true}], reason [transport disconnected]
[2015-11-10 13:53:35,964][WARN ][discovery.zen            ] [ClusterUK Node 1] master left (reason = transport disconnected), current nodes: {[ClusterUK Client Node STG1][_JfbrXjFTzGD7BL7OTqbVA][Staging1][inet[/192.168.100.248:9300]]{data=false, master=false},[ClusterUK Node 1][FTiLxRmZQLyFtyap8JTj2w][elasticuk1][inet[elasticuk1/172.24.32.10:9300]]{master=true},[ClusterUK Node 3][rHJ486YyQHqKytG44fmC7g][elasticuk3][inet[/172.24.32.8:9300]]{master=true},}
[2015-11-10 13:53:35,965][INFO ][cluster.service          ] [ClusterUK Node 1] removed {[ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true},}, reason: zen-disco-master_failed ([ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true})
[2015-11-10 13:53:39,018][INFO ][cluster.service          ] [ClusterUK Node 1] detected_master [ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true}, added {[ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true},}, reason: zen-disco-receive(from master [[ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true}])
[2015-11-10 13:54:03,581][INFO ][discovery.zen            ] [ClusterUK Node 1] master_left [[ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true}], reason [transport disconnected]
[2015-11-10 13:54:03,581][WARN ][discovery.zen            ] [ClusterUK Node 1] master left (reason = transport disconnected), current nodes: {[ClusterUK Client Node STG1][_JfbrXjFTzGD7BL7OTqbVA][Staging1][inet[/192.168.100.248:9300]]{data=false, master=false},[ClusterUK Node 1][FTiLxRmZQLyFtyap8JTj2w][elasticuk1][inet[elasticuk1/172.24.32.10:9300]]{master=true},[ClusterUK Node 3][rHJ486YyQHqKytG44fmC7g][elasticuk3][inet[/172.24.32.8:9300]]{master=true},}
[2015-11-10 13:54:03,581][INFO ][cluster.service          ] [ClusterUK Node 1] removed {[ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true},}, reason: zen-disco-master_failed ([ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true})
[2015-11-10 13:54:06,603][INFO ][cluster.service          ] [ClusterUK Node 1] detected_master [ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true}, added {[ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true},}, reason: zen-disco-receive(from master [[ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true}])
[2015-11-10 13:54:39,790][INFO ][discovery.zen            ] [ClusterUK Node 1] master_left [[ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true}], reason [transport disconnected]
[2015-11-10 13:54:39,792][WARN ][discovery.zen            ] [ClusterUK Node 1] master left (reason = transport disconnected), current nodes: {[ClusterUK Client Node STG1][_JfbrXjFTzGD7BL7OTqbVA][Staging1][inet[/192.168.100.248:9300]]{data=false, master=false},[ClusterUK Node 1][FTiLxRmZQLyFtyap8JTj2w][elasticuk1][inet[elasticuk1/172.24.32.10:9300]]{master=true},[ClusterUK Node 3][rHJ486YyQHqKytG44fmC7g][elasticuk3][inet[/172.24.32.8:9300]]{master=true},}
[2015-11-10 13:54:39,792][INFO ][cluster.service          ] [ClusterUK Node 1] removed {[ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true},}, reason: zen-disco-master_failed ([ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true})
[2015-11-10 13:54:42,366][ERROR][marvel.agent.exporter    ] [ClusterUK Node 1] remote target didn't respond with 200 OK response code [503 Service Unavailable]. content: [:)
��error�ClusterBlockException[blocked by: [SERVICE_UNAVAILABLE/2/no master];]��status$��]
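To get a rough picture of how often the master drops and for what reason, the `master_left` lines can be tallied with a short script. This is only a sketch: the regular expression assumes the 1.x log layout shown above, and `count_master_left_reasons` and `SAMPLE_LINES` are names made up for this illustration.

```python
import re
from collections import Counter

# Matches "[timestamp][LEVEL][logger] [node name] master_left ..., reason [...]"
# as it appears in the log excerpt above (assumed 1.x log layout).
MASTER_LEFT = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\[(?P<level>\w+)\s*\]\[(?P<logger>[\w.]+)\s*\] "
    r"\[(?P<node>[^\]]+)\] master_left .*, reason \[(?P<reason>[^\]]+)\]\s*$"
)


def count_master_left_reasons(lines):
    """Count master_left events per reason string."""
    reasons = Counter()
    for line in lines:
        match = MASTER_LEFT.match(line)
        if match:
            reasons[match.group("reason")] += 1
    return reasons


# A few lines copied from the log above; in practice, read them from the log file.
SAMPLE_LINES = [
    "[2015-11-10 13:52:14,750][INFO ][node                     ] [ClusterUK Node 1] started",
    "[2015-11-10 13:52:44,994][INFO ][discovery.zen            ] [ClusterUK Node 1] master_left [[ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true}], reason [do not exists on master, act as master failure]",
    "[2015-11-10 13:53:35,963][INFO ][discovery.zen            ] [ClusterUK Node 1] master_left [[ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true}], reason [transport disconnected]",
    "[2015-11-10 13:54:03,581][INFO ][discovery.zen            ] [ClusterUK Node 1] master_left [[ClusterUK Node 2][T5R_1SUwRu6Q4zZLMTbNlA][elasticuk2][inet[/172.24.32.5:9300]]{master=true}], reason [transport disconnected]",
]

for reason, count in count_master_left_reasons(SAMPLE_LINES).most_common():
    print(count, reason)
```

A tally dominated by `transport disconnected` points at the connection itself rather than at the election settings.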

This is my elasticsearch.yml file:

action.disable_delete_all_indices: true

cluster.name: ClusterUK

network.publish_host: "172.24.32.10"

discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["172.24.32.10", "172.24.32.5", "172.24.32.8"]

indices.fielddata.cache.size: 25%
indices.cluster.send_refresh_mapping: false

node.name: "ClusterUK Node 1" 
node.master: true
node.data: true

bootstrap.mlockall: true

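In some cases this makes Elasticsearch unavailable as a service (for a few seconds).

The cluster currently runs on Rackspace, and I think a network issue may be involved (however, I bind to specific IP addresses and use unicast).

There are 4 nodes running there (3 with master = true, data = true, and one client node).

Can anyone help me understand what is happening here? This is version 1.7.3 on Windows Server (the client node runs 1.7.1).

I suspect the problem comes from master left (reason = transport disconnected) and that it is a split brain, but how do I fix it?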
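On the split-brain suspicion, the usual guidance for zen discovery is to set `discovery.zen.minimum_master_nodes` to a quorum of the master-eligible nodes, that is (master-eligible nodes / 2) + 1 with integer division. A minimal sketch of that check for this cluster follows; the function name `minimum_master_nodes` is only illustrative.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Quorum of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


# Values taken from the question: 3 nodes with master = true,
# and discovery.zen.minimum_master_nodes: 2 in elasticsearch.yml.
master_eligible = 3
configured = 2

required = minimum_master_nodes(master_eligible)
print(f"required quorum: {required}, configured: {configured}")
print("quorum setting OK" if configured >= required else "split brain possible")
```

With 3 master-eligible nodes the quorum is 2, which matches the configured value, so the setting itself should already guard against a split brain.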

1 Answer:

Answer 0 (score: 1)

I was able to find the problem: Elasticsearch does not tolerate TCP offloading.

  

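The TCP offload engine is a function used in network interface cards (NICs) to offload processing of the entire TCP/IP stack to the network controller. By moving some or all of the processing to dedicated hardware, a TCP offload engine frees the system's main CPU for other tasks. However, TCP offloading has been known to cause some issues, and disabling it can help avoid these issues.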

Disabling TCP offloading

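  1. On the Windows server, open Control Panel and select Network Settings > Change adapter settings.
  2. Right-click each adapter (private and public), select Configure from the Networking menu, and then click the Advanced tab. The TCP offload settings are listed for the Citrix adapter.
  3. Disable each of the following TCP offload options, and then click OK:
    • IPv4 Checksum Offload
    • Large Receive Offload
    • Large Send Offload
    • TCP Checksum Offload

This solved my problem.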
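With several nodes to fix, the same change could probably be scripted instead of clicked through. The sketch below only builds the PowerShell commands and does not run them. It assumes Windows Server 2012 or later, where the NetAdapter module provides `Disable-NetAdapterChecksumOffload`, `Disable-NetAdapterLso` and `Disable-NetAdapterRsc`, and it assumes the adapter driver exposes these settings to those cmdlets (not verified for the Citrix adapter mentioned above). The adapter names `private` and `public` and the helper `build_disable_offload_commands` are hypothetical.

```python
# PowerShell cmdlets from the NetAdapter module (Windows Server 2012+).
OFFLOAD_CMDLETS = (
    "Disable-NetAdapterChecksumOffload",  # IPv4 and TCP checksum offload
    "Disable-NetAdapterLso",              # large send offload
    "Disable-NetAdapterRsc",              # receive segment coalescing, the closest
                                          # match to "large receive offload"
)


def build_disable_offload_commands(adapter_names):
    """Return one PowerShell command per adapter and offload feature."""
    commands = []
    for name in adapter_names:
        # Single-quote the name for PowerShell, doubling embedded quotes.
        quoted = "'" + name.replace("'", "''") + "'"
        for cmdlet in OFFLOAD_CMDLETS:
            commands.append(f"{cmdlet} -Name {quoted}")
    return commands


# Hypothetical adapter names; list the real ones with Get-NetAdapter.
for command in build_disable_offload_commands(["private", "public"]):
    print(command)
```

The printed commands would need to be run from an elevated PowerShell session on each node. Changing these settings may briefly interrupt the network connection, so it is safer to do one node at a time.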