Elasticsearch 2.4节点不会与ConnectTransportException形成集群

时间:2016-10-07 10:25:12

标签: nginx elasticsearch docker dockerfile elastic-stack

我已经在Elasticsearch(ES)1.7中使用带有3个节点的docker容器运行ELK堆栈,每个节点运行一个ES容器,在nginx服务器后面运行。现在我正在尝试将ES升级到2.4.0。 ES 2.4.0中不允许root用户,因此我使用-Des.root.insecure.allow=true选项。

#Pulling SLES12 thin base image
FROM private-registry-1

#Author
MAINTAINER xyz

# Pre-requisite - Adding repositories
RUN zypper ar private-registry-2

RUN zypper --no-gpg-checks -n refresh

#Install required packages and dependencies
RUN zypper -n in  net-tools-1.60-764.185 wget-1.14-7.1 python-2.7.9-14.1 python-base-2.7.9-14.1 tar-1.27.1-7.1 

#Downloading elasticsearch executable
ENV ES_VERSION=2.4.0
ENV ES_CLUSTER_NAME=ccs-elasticsearch
ENV ES_DIR="//opt//log-management//elasticsearch"
ENV ES_DATA_PATH="//data"
ENV ES_LOGS_PATH="//var//log"
ENV ES_CONFIG_PATH="${ES_DIR}//config"
ENV ES_REST_PORT=9200
ENV ES_INTERNAL_COM_PORT=9300

WORKDIR /opt/log-management
RUN wget private-registry-3/elasticsearch/elasticsearch/${ES_VERSION}.tar/elasticsearch-${ES_VERSION}.tar.gz --no-check-certificate
RUN tar -xzvf ${ES_DIR}-${ES_VERSION}.tar.gz \
&& rm ${ES_DIR}-${ES_VERSION}.tar.gz \
&& mv ${ES_DIR}-${ES_VERSION} ${ES_DIR} \
&& cp ${ES_DIR}/config/elasticsearch.yml ${ES_CONFIG_PATH}/elasticsearch-default.yml

#Exposing elasticsearch server container port to the HOST
EXPOSE ${ES_REST_PORT} ${ES_INTERNAL_COM_PORT}

#Removing binary files which are not needed
RUN zypper -n rm wget

# Removing zypper repos
RUN zypper rr caspiancs_common

COPY query-crs-es.sh ${ES_DIR}/bin/query-crs-es.sh
RUN chmod +x ${ES_DIR}/bin/query-crs-es.sh

COPY query-crs-wrapper.py ${ES_DIR}/bin/query-crs-wrapper.py
RUN chmod +x ${ES_DIR}/bin/query-crs-wrapper.py
ENV CRS_PARSER_PYTHON_SCRIPT="${ES_DIR}//bin//query-crs-wrapper.py"

#Copy elastic search bootstrap script
COPY elasticsearch-bootstrap-and-run.sh ${ES_DIR}/
RUN chmod +x ${ES_DIR}/elasticsearch-bootstrap-and-run.sh

COPY config-es-cluster ${ES_DIR}/bin/config-es-cluster
RUN chmod +x ${ES_DIR}/bin/config-es-cluster

COPY elasticsearch-config-script ${ES_DIR}/bin/elasticsearch-config-script
RUN chmod +x ${ES_DIR}/bin/elasticsearch-config-script

#Running elasticsearch executable
WORKDIR ${ES_DIR}
ENTRYPOINT ${ES_DIR}/elasticsearch-bootstrap-and-run.sh

配置文件将由Dockerfile中提到的elasticsearch-configconfig-es-cluster修改,如下所示:

#Bootstrap script to configure elasticsearch.yml file

echo "cluster.name: ${ES_CLUSTER_NAME}" > ${ES_CONFIG_PATH}/elasticsearch.yml
echo "path.data: ${ES_DATA_PATH}" >>   ${ES_CONFIG_PATH}/elasticsearch.yml
echo "path.logs: ${ES_LOGS_PATH}" >>   ${ES_CONFIG_PATH}/elasticsearch.yml

#Performance optimization settings
echo "index.number_of_replicas: 1" >> ${ES_CONFIG_PATH}/elasticsearch.yml
echo "index.number_of_shards: 3" >> ${ES_CONFIG_PATH}/elasticsearch.yml
#echo "discovery.zen.ping.multicast.enabled: false" >> ${ES_CONFIG_PATH}/elasticsearch.yml
#echo "bootstrap.mlockall: true" >> ${ES_CONFIG_PATH}/elasticsearch.yml
#echo "indices.memory.index_buffer_size: 50%" >> ${ES_CONFIG_PATH}/elasticsearch.yml


#Search thread pool
echo "threadpool.search.type: fixed" >> ${ES_CONFIG_PATH}/elasticsearch.yml
echo "threadpool.search.size: 20" >> ${ES_CONFIG_PATH}/elasticsearch.yml
echo "threadpool.search.queue_size: 100000" >> ${ES_CONFIG_PATH}/elasticsearch.yml

#Index thread pool
echo "threadpool.index.type: fixed" >> ${ES_CONFIG_PATH}/elasticsearch.yml
echo "threadpool.index.size: 60" >> ${ES_CONFIG_PATH}/elasticsearch.yml
echo "threadpool.index.queue_size: 200000" >> ${ES_CONFIG_PATH}/elasticsearch.yml

#publish host as container host address
#echo "network.publish_host: ${CONTAINER_HOST_ADDRESS}" >> ${ES_CONFIG_PATH}/elasticsearch.yml
#echo "network.bind_host: ${CONTAINER_HOST_ADDRESS}" >> ${ES_CONFIG_PATH}/elasticsearch.yml
#echo "network.publish_host: ${CONTAINER_PRIVATE_IP}" >> ${ES_CONFIG_PATH}/elasticsearch.yml
#echo "network.bind_host: ${CONTAINER_PRIVATE_IP}" >> ${ES_CONFIG_PATH}/elasticsearch.yml
#echo "network.host: ${CONTAINER_HOST_ADDRESS}" >> ${ES_CONFIG_PATH}/elasticsearch.yml
echo "network.host: 0.0.0.0" >> ${ES_CONFIG_PATH}/elasticsearch.yml
#echo "htpp.port: 9200" >> ${ES_CONFIG_PATH}/elasticsearch.yml
#echo "transport.tcp.port: 9300-9400" >> ${ES_CONFIG_PATH}/elasticsearch.yml
#configure elasticsearch.yml for clustering
echo 'discovery.zen.ping.unicast.hosts: [ELASTICSEARCH_IPS] ' >> ${ES_CONFIG_PATH}/elasticsearch.yml
echo "discovery.zen.minimum_master_nodes: 1" >> ${ES_CONFIG_PATH}/elasticsearch.yml

ELASTICSEARCH_IPS是其他节点的IP数组,由运行名为query-crs-es.sh的脚本的所有节点获取。最终,Array将拥有其他两个群集节点的IP。请注意,它们将是节点的IP,而不是容器专用IP。

当我尝试运行容器时,我使用ansible。在启动期间,所有节点都启动但无法形成群集。我一直都会遇到这些错误

节点1:

[2016-10-07 09:45:23,313][WARN ][bootstrap                ] running as ROOT user. this is a bad idea!
[2016-10-07 09:45:23,474][INFO ][node                     ] [Dragon Lord] version[2.4.0], pid[1], build[ce9f0c7/2016-08-29T09:14:17Z]
[2016-10-07 09:45:23,474][INFO ][node                     ] [Dragon Lord] initializing ...
[2016-10-07 09:45:23,970][INFO ][plugins                  ] [Dragon Lord] modules [reindex, lang-expression, lang-groovy], plugins [], sites []
[2016-10-07 09:45:23,994][INFO ][env                      ] [Dragon Lord] using [1] data paths, mounts [[/data (/dev/mapper/platform-data)]], net usable_space [2.5tb], net total_space [2.5tb], spins? [possibly], types [xfs]
[2016-10-07 09:45:23,994][INFO ][env                      ] [Dragon Lord] heap size [989.8mb], compressed ordinary object pointers [true]
[2016-10-07 09:45:24,028][WARN ][threadpool               ] [Dragon Lord] requested thread pool size [60] for [index] is too large; setting to maximum [32] instead
[2016-10-07 09:45:25,540][INFO ][node                     ] [Dragon Lord] initialized
[2016-10-07 09:45:25,540][INFO ][node                     ] [Dragon Lord] starting ...
[2016-10-07 09:45:25,687][INFO ][transport                ] [Dragon Lord] publish_address {172.17.0.15:9300}, bound_addresses {[::]:9300}
[2016-10-07 09:45:25,693][INFO ][discovery                ] [Dragon Lord] ccs-elasticsearch/5wNwWJRFRS-2dRY5AGqqGQ
[2016-10-07 09:45:28,721][INFO ][cluster.service          ] [Dragon Lord] new_master {Dragon Lord}{5wNwWJRFRS-2dRY5AGqqGQ}{172.17.0.15}{172.17.0.15:9300}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2016-10-07 09:45:28,765][INFO ][http                     ] [Dragon Lord] publish_address {172.17.0.15:9200}, bound_addresses {[::]:9200}
[2016-10-07 09:45:28,765][INFO ][node                     ] [Dragon Lord] started
[2016-10-07 09:45:28,856][INFO ][gateway                  ] [Dragon Lord] recovered [20] indices into cluster_state

节点2:

[2016-10-07 09:45:58,561][WARN ][bootstrap                ] running as ROOT user. this is a bad idea!
[2016-10-07 09:45:58,729][INFO ][node                     ] [Defensor] version[2.4.0], pid[1], build[ce9f0c7/2016-08-29T09:14:17Z]
[2016-10-07 09:45:58,729][INFO ][node                     ] [Defensor] initializing ...
[2016-10-07 09:45:59,215][INFO ][plugins                  ] [Defensor] modules [reindex, lang-expression, lang-groovy], plugins [], sites []
[2016-10-07 09:45:59,237][INFO ][env                      ] [Defensor] using [1] data paths, mounts [[/data (/dev/mapper/platform-data)]], net usable_space [2.5tb], net total_space [2.5tb], spins? [possibly], types [xfs]
[2016-10-07 09:45:59,237][INFO ][env                      ] [Defensor] heap size [989.8mb], compressed ordinary object pointers [true]
[2016-10-07 09:45:59,266][WARN ][threadpool               ] [Defensor] requested thread pool size [60] for [index] is too large; setting to maximum [32] instead
[2016-10-07 09:46:00,733][INFO ][node                     ] [Defensor] initialized
[2016-10-07 09:46:00,733][INFO ][node                     ] [Defensor] starting ...
[2016-10-07 09:46:00,833][INFO ][transport                ] [Defensor] publish_address {172.17.0.16:9300}, bound_addresses {[::]:9300}
[2016-10-07 09:46:00,837][INFO ][discovery                ] [Defensor] ccs-elasticsearch/RXALMe9NQVmbCz5gg1CwHA
[2016-10-07 09:46:03,876][WARN ][discovery.zen            ] [Defensor] failed to connect to master [{Dragon Lord}{5wNwWJRFRS-2dRY5AGqqGQ}{172.17.0.15}{172.17.0.15:9300}], retrying...
ConnectTransportException[[Dragon Lord][172.17.0.15:9300] connect_timeout[30s]]; nested: ConnectException[Connection refused: /172.17.0.15:9300];
    at org.elasticsearch.transport.netty.NettyTransport.connectToChannels(NettyTransport.java:1002)
    at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:937)
    at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:911)
    at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:260)
    at org.elasticsearch.discovery.zen.ZenDiscovery.joinElectedMaster(ZenDiscovery.java:444)
    at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:396)
    at org.elasticsearch.discovery.zen.ZenDiscovery.access$4400(ZenDiscovery.java:96)
    at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1296)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: /172.17.0.15:9300
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
    at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    ... 3 more
[2016-10-07 09:46:06,899][WARN ][discovery.zen            ] [Defensor] failed to connect to master [{Dragon Lord}{5wNwWJRFRS-2dRY5AGqqGQ}{172.17.0.15}{172.17.0.15:9300}], retrying...
ConnectTransportException[[Dragon Lord][172.17.0.15:9300] connect_timeout[30s]]; nested: ConnectException[Connection refused: /172.17.0.15:9300];
    at org.elasticsearch.transport.netty.NettyTransport.connectToChannels(NettyTransport.java:1002)
    at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:937)
    at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:911)
    at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:260)
    at org.elasticsearch.discovery.zen.ZenDiscovery.joinElectedMaster(ZenDiscovery.java:444)
    at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:396)
    at org.elasticsearch.discovery.zen.ZenDiscovery.access$4400(ZenDiscovery.java:96)
    at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1296)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: /172.17.0.15:9300
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
    at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    ... 3 more
[2016-10-07 09:46:09,917][WARN ][discovery.zen            ] [Defensor] failed to connect to master [{Dragon Lord}{5wNwWJRFRS-2dRY5AGqqGQ}{172.17.0.15}{172.17.0.15:9300}], retrying...
ConnectTransportException[[Dragon Lord][172.17.0.15:9300] connect_timeout[30s]]; nested: ConnectException[Connection refused: /172.17.0.15:9300];

节点3:

[2016-10-07 09:45:58,624][WARN ][bootstrap                ] running as ROOT user. this is a bad idea!
[2016-10-07 09:45:58,806][INFO ][node                     ] [Scarlet Beetle] version[2.4.0], pid[1], build[ce9f0c7/2016-08-29T09:14:17Z]
[2016-10-07 09:45:58,806][INFO ][node                     ] [Scarlet Beetle] initializing ...
[2016-10-07 09:45:59,341][INFO ][plugins                  ] [Scarlet Beetle] modules [reindex, lang-expression, lang-groovy], plugins [], sites []
[2016-10-07 09:45:59,363][INFO ][env                      ] [Scarlet Beetle] using [1] data paths, mounts [[/data (/dev/mapper/platform-data)]], net usable_space [2.5tb], net total_space [2.5tb], spins? [possibly], types [xfs]
[2016-10-07 09:45:59,363][INFO ][env                      ] [Scarlet Beetle] heap size [989.8mb], compressed ordinary object pointers [true]
[2016-10-07 09:45:59,390][WARN ][threadpool               ] [Scarlet Beetle] requested thread pool size [60] for [index] is too large; setting to maximum [32] instead
[2016-10-07 09:46:00,795][INFO ][node                     ] [Scarlet Beetle] initialized
[2016-10-07 09:46:00,795][INFO ][node                     ] [Scarlet Beetle] starting ...
[2016-10-07 09:46:00,927][INFO ][transport                ] [Scarlet Beetle] publish_address {172.17.0.16:9300}, bound_addresses {[::]:9300}
[2016-10-07 09:46:00,931][INFO ][discovery                ] [Scarlet Beetle] ccs-elasticsearch/SFWrVwKRSUu--4KiZK4Kfg
[2016-10-07 09:46:03,965][WARN ][discovery.zen            ] [Scarlet Beetle] failed to connect to master [{Dragon Lord}{5wNwWJRFRS-2dRY5AGqqGQ}{172.17.0.15}{172.17.0.15:9300}], retrying...
ConnectTransportException[[Dragon Lord][172.17.0.15:9300] connect_timeout[30s]]; nested: ConnectException[Connection refused: /172.17.0.15:9300];
    at org.elasticsearch.transport.netty.NettyTransport.connectToChannels(NettyTransport.java:1002)
    at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:937)
    at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:911)
    at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:260)
    at org.elasticsearch.discovery.zen.ZenDiscovery.joinElectedMaster(ZenDiscovery.java:444)
    at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:396)
    at org.elasticsearch.discovery.zen.ZenDiscovery.access$4400(ZenDiscovery.java:96)
    at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1296)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: /172.17.0.15:9300
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
    at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    ... 3 more
[2016-10-07 09:46:06,990][WARN ][discovery.zen            ] [Scarlet Beetle] failed to connect to master [{Dragon Lord}{5wNwWJRFRS-2dRY5AGqqGQ}{172.17.0.15}{172.17.0.15:9300}], retrying...
ConnectTransportException[[Dragon Lord][172.17.0.15:9300] connect_timeout[30s]]; nested: ConnectException[Connection refused: /172.17.0.15:9300];
    at org.elasticsearch.transport.netty.NettyTransport.connectToChannels(NettyTransport.java:1002)
    at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:937)
    at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:911)
    at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:260)
    at org.elasticsearch.discovery.zen.ZenDiscovery.joinElectedMaster(ZenDiscovery.java:444)
    at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:396)
    at org.elasticsearch.discovery.zen.ZenDiscovery.access$4400(ZenDiscovery.java:96)
    at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1296)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

从日志中可以看出,节点2和3知道master,Node1,但无法连接。我已经尝试了大多数关于network.host的配置,您可以在配置代码中看到这些配置并且它们都不起作用。 任何线索都将受到赞赏。

这是港口的状态:

 netstat -nlp | grep 9200
    tcp        0      0 10.240.135.140:9200     0.0.0.0:*               LISTEN      188116/docker-proxy 
    tcp        0      0 10.240.137.112:9200     0.0.0.0:*               LISTEN      187240/haproxy 

netstat -nlp | grep 9300
tcp        0      0 :::9300                 :::*                    LISTEN      188085/docker-proxy 

1 个答案:

答案 0 :(得分:0)

我能够使用以下设置形成群集

network.publish_host=CONTAINER_HOST_ADDRESS即节点的地址 容器正在运行。

network.bind_host=0.0.0.0
transport.publish_port=9300
transport.publish_host=CONTAINER_HOST_ADDRESS
当您在代理/负载均衡器(例如nginx或haproxy)之后运行ES时,

tranport.publish_host非常重要。