I have a cluster with 2 nodes.
I am trying to understand the best practice for connecting to the nodes, and to check failover when one node is down.
es = Elasticsearch(
    ['esnode1', 'esnode2'],
    # sniff before doing anything
    sniff_on_start=True,
    # refresh nodes after a node fails to respond
    sniff_on_connection_fail=True,
    # and also every 60 seconds
    sniffer_timeout=60
)
So I tried to connect to my nodes like this:
client = Elasticsearch([ip1, ip2], sniff_on_start=True, sniffer_timeout=10, sniff_on_connection_fail=True)
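where ip1 / ip2 are the machine IPs (for example 10.0.0.1, 10.0.0.2).
To test it, I took down ip2 (or pointed it at an address that does not exist). Now, whenever I try to connect, I always get: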
TransportError: TransportError(N/A, 'Unable to sniff hosts - no viable hosts found.')
even though ip1 is up.
If I connect like this instead:
es = Elasticsearch([ip1, ip2])
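then I can see in the log that when the client gets no response from ip2, it moves on to ip1 and returns a valid response.
So what am I missing here? I thought that with sniffing, if one of the nodes is down, the client would not throw any exception and would keep working with the live nodes (until the next sniff).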
Update: when I set sniff_on_start to True, I get this behavior:
----> 1 client = Elasticsearch([ip1, ip2],sniff_on_start=True)
/usr/local/lib/python2.7/dist-packages/elasticsearch/client/__init__.pyc in __init__(self, hosts, transport_class, **kwargs)
148 :class:`~elasticsearch.Connection` instances.
149 """
--> 150 self.transport = transport_class(_normalize_hosts(hosts), **kwargs)
151
152 # namespaced clients for compatibility with API names
/usr/local/lib/python2.7/dist-packages/elasticsearch/transport.pyc in __init__(self, hosts, connection_class, connection_pool_class, host_info_callback, sniff_on_start, sniffer_timeout, sniff_timeout, sniff_on_connection_fail, serializer, serializers, default_mimetype, max_retries, retry_on_status, retry_on_timeout, send_get_body_as, **kwargs)
128
129 if sniff_on_start:
--> 130 self.sniff_hosts(True)
131
132 def add_connection(self, host):
/usr/local/lib/python2.7/dist-packages/elasticsearch/transport.pyc in sniff_hosts(self, initial)
235 # transport_schema or host_info_callback blocked all - raise error.
236 if not hosts:
--> 237 raise TransportError("N/A", "Unable to sniff hosts - no viable hosts found.")
238
239 self.set_connections(hosts)
Answer 0 (score: 0)
You need to set sniff_timeout to a value higher than the default (which is 0.1 seconds, if memory serves).
Try it like this:
es = Elasticsearch(
    ['esnode1', 'esnode2'],
    # sniff before doing anything
    sniff_on_start=True,
    # refresh nodes after a node fails to respond
    sniff_on_connection_fail=True,
    # and also every 60 seconds
    sniffer_timeout=60,
    # set sniffing request timeout to 10 seconds
    sniff_timeout=10
)
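
If you want the client to keep working even when the initial sniff still fails, one possible approach is to catch the error and fall back to the plain connection from the question, which did fail over correctly. The following is only a rough sketch: the helper names sniffing_options and connect are made up for illustration, and it assumes a client version that accepts the keyword names shown in the traceback above.

```python
def sniffing_options(sniff_timeout=10, sniffer_timeout=60):
    """Collect the sniffing keyword arguments used in the answer.

    Hypothetical helper, not part of the elasticsearch package.
    """
    return {
        "sniff_on_start": True,
        "sniff_on_connection_fail": True,
        # re-sniff the cluster every `sniffer_timeout` seconds
        "sniffer_timeout": sniffer_timeout,
        # timeout of each individual sniff request, in seconds
        "sniff_timeout": sniff_timeout,
    }


def connect(hosts, **overrides):
    """Try a sniffing client first; fall back to a plain client.

    Hypothetical helper. The fallback is an assumption about what you
    want, not something the library does on its own.
    """
    # Imported here so sniffing_options() can be used without the library.
    from elasticsearch import Elasticsearch, TransportError

    try:
        return Elasticsearch(hosts, **sniffing_options(**overrides))
    except TransportError:
        # The initial sniff found no viable host: use the host list as given.
        return Elasticsearch(hosts)
```

You would then call it as client = connect([ip1, ip2]), or pass sniff_timeout=30 to wait longer for the sniff request.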