我正在尝试使用CentOS在CentOS 7.1上安装的etcd 2.3.7设置测试集群。在 Loader 1 我执行了:
etcdctl member add loader2 http://10.11.51.231:2380
并收到确认操作成功完成的回复。
类似地:
etcdctl member add loader3 http://10.11.51.231:2380
使用所有默认设置,以及我看到的内容:
systemctl status etcd -ln1
etcd.service - Etcd Server
Loaded: loaded (/usr/lib/systemd/system/etcd.service; disabled)
Active: active (running) since Sun 2017-02-19 14:33:18 IST; 28min ago
Main PID: 19009 (etcd)
CGroup: /system.slice/etcd.service
└─19009 /usr/bin/etcd --name=default --data-dir=/var/lib/etcd/default.etcd --listen-client-urls=http://localhost:2379
Feb 19 15:02:03 loader3 etcd[19009]: cannot get the version of member a4803061db803edc (Get http://10.11.51.166:2380/version: dial tcp 10.11.51.166:2380: getsockopt: connection refused)
试图看到群集健康状况:
etcdctl --debug cluster-health
Cluster-Endpoints: http://127.0.0.1:4001, http://127.0.0.1:2379
cURL Command: curl -X GET http://127.0.0.1:4001/v2/members
cURL Command: curl -X GET http://127.0.0.1:2379/v2/members
member ce2a822cea30bfca is unhealthy: got unhealthy result from http://localhost:2379
member da05b63349d818dc is unreachable: no available published client urls
cluster is unhealthy
请注意,这会忽略先前添加的两个节点,但会将请求发送到localhost上的随机端口...
起初这台机器启动正常,但在我看到 Loader 1 出现问题后,我尝试将 Loader 1 添加为此机器的成员,现在我也在这台机器上看到了相同的图片。即它试图查询这个4001端口,没有人响应。在所有机器上:
netstat -tupln | grep etcd
tcp 0 0 127.0.0.1:7001 0.0.0.0:* LISTEN 4507/etcd
tcp 0 0 127.0.0.1:2379 0.0.0.0:* LISTEN 4507/etcd
tcp 0 0 127.0.0.1:2380 0.0.0.0:* LISTEN 4507/etcd
没有人听4001 ......
在这个装载机上,我没有尝试添加新成员。所以它看起来像这样:
etcdctl --debug cluster-health
Cluster-Endpoints: http://127.0.0.1:4001, http://127.0.0.1:2379
cURL Command: curl -X GET http://127.0.0.1:4001/v2/members
cURL Command: curl -X GET http://127.0.0.1:2379/v2/members
member ce2a822cea30bfca is healthy: got healthy result from http://localhost:2379
cluster is healthy
换句话说,它仍然会向随机端口发送请求,但这次没有人回复的事情让它感到困扰......
以下是配置文件的内容:
cat /usr/lib/systemd/system/etcd.service
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.conf
User=etcd
# set GOMAXPROCS to number of processors
ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/bin/etcd --name=\"${ETCD_NAME}\" --data-dir=\"${ETCD_DATA_DIR}\" --listen-client-urls=\"${ETCD_LISTEN_CLIENT_URLS}\""
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
和
cat /etc/etcd/etcd.conf
# [member]
ETCD_NAME=default
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
#ETCD_WAL_DIR=""
#ETCD_SNAPSHOT_COUNT="10000"
#ETCD_HEARTBEAT_INTERVAL="100"
#ETCD_ELECTION_TIMEOUT="1000"
#ETCD_LISTEN_PEER_URLS="http://localhost:2380"
ETCD_LISTEN_CLIENT_URLS="http://localhost:2379"
#ETCD_MAX_SNAPSHOTS="5"
#ETCD_MAX_WALS="5"
#ETCD_CORS=""
#
#[cluster]
#ETCD_INITIAL_ADVERTISE_PEER_URLS="http://localhost:2380"
# if you use different ETCD_NAME (e.g. test), set ETCD_INITIAL_CLUSTER value for this name, i.e. "test=http://..."
#ETCD_INITIAL_CLUSTER="default=http://localhost:2380"
#ETCD_INITIAL_CLUSTER_STATE="new"
#ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_ADVERTISE_CLIENT_URLS="http://localhost:2379"
#ETCD_DISCOVERY=""
#ETCD_DISCOVERY_SRV=""
#ETCD_DISCOVERY_FALLBACK="proxy"
#ETCD_DISCOVERY_PROXY=""
#ETCD_STRICT_RECONFIG_CHECK="false"
#
#[proxy]
#ETCD_PROXY="off"
#ETCD_PROXY_FAILURE_WAIT="5000"
#ETCD_PROXY_REFRESH_INTERVAL="30000"
#ETCD_PROXY_DIAL_TIMEOUT="1000"
#ETCD_PROXY_WRITE_TIMEOUT="5000"
#ETCD_PROXY_READ_TIMEOUT="0"
#
#[security]
#ETCD_CERT_FILE=""
#ETCD_KEY_FILE=""
#ETCD_CLIENT_CERT_AUTH="false"
#ETCD_TRUSTED_CA_FILE=""
#ETCD_PEER_CERT_FILE=""
#ETCD_PEER_KEY_FILE=""
#ETCD_PEER_CLIENT_CERT_AUTH="false"
#ETCD_PEER_TRUSTED_CA_FILE=""
#
#[logging]
#ETCD_DEBUG="false"
# examples for -log-package-levels etcdserver=WARNING,security=DEBUG
#ETCD_LOG_PACKAGE_LEVELS=""
#
#[profiling]
#ETCD_ENABLE_PPROF="false"
那么......发生了什么? etcd
给出的错误消息是Go内置函数产生的典型无意义废话。 etcd
使用的HTTP服务器再次是Go内置垃圾,它产生非标准且绝对毫无价值的回复。所以我无法理解是什么(如果有的话)配置错误/缺失。