Unable to add a node to a CockroachDB cluster

Time: 2019-05-11 01:55:41

Tags: cockroachdb

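I am trying to join a CockroachDB node to an existing cluster. I created the cluster with the first node and then tried to join a second node to it, but the second node came up as its own separate cluster, as shown below. Does anyone know which of my steps is wrong? Any suggestions?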

  1. I started the first node as follows:
cockroach start --insecure --advertise-host=163.172.156.111
* Check out how to secure your cluster: https://www.cockroachlabs.com/docs/v19.1/secure-a-cluster.html
*
CockroachDB node starting at 2019-05-11 01:11:15.45522036 +0000 UTC (took 2.5s)
build:               CCL v19.1.0 @ 2019/04/29 18:36:40 (go1.11.6)
webui:               http://163.172.156.111:8080
sql:                 postgresql://root@163.172.156.111:26257?sslmode=disable
client flags:        cockroach <client cmd> --host=163.172.156.111:26257 --insecure
logs:                /home/ueda/cockroach-data/logs
temp dir:            /home/ueda/cockroach-data/cockroach-temp449555924
external I/O path:   /home/ueda/cockroach-data/extern
store[0]:            path=/home/ueda/cockroach-data
status:              initialized new cluster
clusterID:           3e797faa-59a1-4b0d-83b5-36143ddbdd69
nodeID:              1
  2. Then I started the second node with --join pointing at 163.172.156.111, but it did not join:
cockroach start --insecure --advertise-addr=128.199.127.164 --join=163.172.156.111:26257
CockroachDB node starting at 2019-05-11 01:21:14.533097432 +0000 UTC (took 0.8s)
build:               CCL v19.1.0 @ 2019/04/29 18:36:40 (go1.11.6)
webui:               http://128.199.127.164:8080
sql:                 postgresql://root@128.199.127.164:26257?sslmode=disable
client flags:        cockroach <client cmd> --host=128.199.127.164:26257 --insecure
logs:                /home/ueda/cockroach-data/logs
temp dir:            /home/ueda/cockroach-data/cockroach-temp067740997
external I/O path:   /home/ueda/cockroach-data/extern
store[0]:            path=/home/ueda/cockroach-data
status:              restarted pre-existing node
clusterID:           a14e89a7-792d-44d3-89af-7037442eacbc
nodeID:              1

The joining node's cockroach.log shows some gossip errors:

cat cockroach-data/logs/cockroach.log 
I190511 01:21:13.762309 1 util/log/clog.go:1199  [config] file created at: 2019/05/11 01:21:13
I190511 01:21:13.762309 1 util/log/clog.go:1199  [config] running on machine: amfortas
I190511 01:21:13.762309 1 util/log/clog.go:1199  [config] binary: CockroachDB CCL v19.1.0 (x86_64-unknown-linux-gnu, built 2019/04/29 18:36:40, go1.11.6)
I190511 01:21:13.762309 1 util/log/clog.go:1199  [config] arguments: [cockroach start --insecure --advertise-addr=128.199.127.164 --join=163.172.156.111:26257]
I190511 01:21:13.762309 1 util/log/clog.go:1199  line format: [IWEF]yymmdd hh:mm:ss.uuuuuu goid file:line msg utf8=✓
I190511 01:21:13.762307 1 cli/start.go:1033  logging to directory /home/ueda/cockroach-data/logs
W190511 01:21:13.763373 1 cli/start.go:1068  RUNNING IN INSECURE MODE!

- Your cluster is open for any client that can access <all your IP addresses>.
- Any user, even root, can log in without providing a password.
- Any user, connecting as root, can read or write any data in your cluster.
- There is no network encryption nor authentication, and thus no confidentiality.

Check out how to secure your cluster: https://www.cockroachlabs.com/docs/v19.1/secure-a-cluster.html
I190511 01:21:13.763675 1 server/status/recorder.go:610  available memory from cgroups (8.0 EiB) exceeds system memory 992 MiB, using system memory
W190511 01:21:13.763752 1 cli/start.go:944  Using the default setting for --cache (128 MiB).
  A significantly larger value is usually needed for good performance.
  If you have a dedicated server a reasonable setting is --cache=.25 (248 MiB).
I190511 01:21:13.764011 1 server/status/recorder.go:610  available memory from cgroups (8.0 EiB) exceeds system memory 992 MiB, using system memory
W190511 01:21:13.764047 1 cli/start.go:957  Using the default setting for --max-sql-memory (128 MiB).
  A significantly larger value is usually needed in production.
  If you have a dedicated server a reasonable setting is --max-sql-memory=.25 (248 MiB).
I190511 01:21:13.764239 1 server/status/recorder.go:610  available memory from cgroups (8.0 EiB) exceeds system memory 992 MiB, using system memory
I190511 01:21:13.764272 1 cli/start.go:1082  CockroachDB CCL v19.1.0 (x86_64-unknown-linux-gnu, built 2019/04/29 18:36:40, go1.11.6)
I190511 01:21:13.866977 1 server/status/recorder.go:610  available memory from cgroups (8.0 EiB) exceeds system memory 992 MiB, using system memory
I190511 01:21:13.867002 1 server/config.go:386  system total memory: 992 MiB
I190511 01:21:13.867063 1 server/config.go:388  server configuration:
max offset             500000000
cache size             128 MiB
SQL memory pool size   128 MiB
scan interval          10m0s
scan min idle time     10ms
scan max idle time     1s
event log enabled      true
I190511 01:21:13.867098 1 cli/start.go:929  process identity: uid 1000 euid 1000 gid 1000 egid 1000
I190511 01:21:13.867115 1 cli/start.go:554  starting cockroach node
I190511 01:21:13.868242 21 storage/engine/rocksdb.go:613  opening rocksdb instance at "/home/ueda/cockroach-data/cockroach-temp067740997"
I190511 01:21:13.894320 21 server/server.go:876  [n?] monitoring forward clock jumps based on server.clock.forward_jump_check_enabled
I190511 01:21:13.894813 21 storage/engine/rocksdb.go:613  opening rocksdb instance at "/home/ueda/cockroach-data"
W190511 01:21:13.896301 21 storage/engine/rocksdb.go:127  [rocksdb] [/go/src/github.com/cockroachdb/cockroach/c-deps/rocksdb/db/version_set.cc:2566] More existing levels in DB than needed. max_bytes_for_level_multiplier may not be guaranteed.
W190511 01:21:13.905666 21 storage/engine/rocksdb.go:127  [rocksdb] [/go/src/github.com/cockroachdb/cockroach/c-deps/rocksdb/db/version_set.cc:2566] More existing levels in DB than needed. max_bytes_for_level_multiplier may not be guaranteed.
I190511 01:21:13.911380 21 server/config.go:494  [n?] 1 storage engine initialized
I190511 01:21:13.911417 21 server/config.go:497  [n?] RocksDB cache size: 128 MiB
I190511 01:21:13.911427 21 server/config.go:497  [n?] store 0: RocksDB, max size 0 B, max open file limit 10000
W190511 01:21:13.912459 21 gossip/gossip.go:1496  [n?] no incoming or outgoing connections
I190511 01:21:13.913206 21 server/server.go:926  [n?] Sleeping till wall time 1557537673913178595 to catches up to 1557537674394265598 to ensure monotonicity. Delta: 481.087003ms
I190511 01:21:14.251655 65 vendor/github.com/cockroachdb/circuitbreaker/circuitbreaker.go:322  [n?] circuitbreaker: gossip [::]:26257->163.172.156.111:26257 tripped: initial connection heartbeat failed: rpc error: code = Unknown desc = client cluster ID "a14e89a7-792d-44d3-89af-7037442eacbc" doesn't match server cluster ID "3e797faa-59a1-4b0d-83b5-36143ddbdd69"
I190511 01:21:14.251695 65 vendor/github.com/cockroachdb/circuitbreaker/circuitbreaker.go:447  [n?] circuitbreaker: gossip [::]:26257->163.172.156.111:26257 event: BreakerTripped
W190511 01:21:14.251763 65 gossip/client.go:122  [n?] failed to start gossip client to 163.172.156.111:26257: initial connection heartbeat failed: rpc error: code = Unknown desc = client cluster ID "a14e89a7-792d-44d3-89af-7037442eacbc" doesn't match server cluster ID "3e797faa-59a1-4b0d-83b5-36143ddbdd69"
I190511 01:21:14.395848 21 gossip/gossip.go:392  [n1] NodeDescriptor set to node_id:1 address:<network_field:"tcp" address_field:"128.199.127.164:26257" > attrs:<> locality:<> ServerVersion:<major_val:19 minor_val:1 patch:0 unstable:0 > build_tag:"v19.1.0" started_at:1557537674395557548 
W190511 01:21:14.458176 21 storage/replica_range_lease.go:506  can't determine lease status due to node liveness error: node not in the liveness table
I190511 01:21:14.458465 21 server/node.go:461  [n1] initialized store [n1,s1]: disk (capacity=24 GiB, available=18 GiB, used=2.2 MiB, logicalBytes=41 MiB), ranges=20, leases=0, queries=0.00, writes=0.00, bytesPerReplica={p10=0.00 p25=0.00 p50=0.00 p75=6467.00 p90=26940.00 pMax=43017435.00}, writesPerReplica={p10=0.00 p25=0.00 p50=0.00 p75=0.00 p90=0.00 pMax=0.00}
I190511 01:21:14.458775 21 storage/stores.go:244  [n1] read 0 node addresses from persistent storage
I190511 01:21:14.459095 21 server/node.go:699  [n1] connecting to gossip network to verify cluster ID...
W190511 01:21:14.469842 96 storage/store.go:1525  [n1,s1,r6/1:/Table/{SystemCon…-11}] could not gossip system config: [NotLeaseHolderError] r6: replica (n1,s1):1 not lease holder; lease holder unknown
I190511 01:21:14.474785 21 server/node.go:719  [n1] node connected via gossip and verified as part of cluster "a14e89a7-792d-44d3-89af-7037442eacbc"
I190511 01:21:14.475033 21 server/node.go:542  [n1] node=1: started with [<no-attributes>=/home/ueda/cockroach-data] engine(s) and attributes []
I190511 01:21:14.475393 21 server/status/recorder.go:610  [n1] available memory from cgroups (8.0 EiB) exceeds system memory 992 MiB, using system memory
I190511 01:21:14.475514 21 server/server.go:1582  [n1] starting http server at [::]:8080 (use: 128.199.127.164:8080)
I190511 01:21:14.475572 21 server/server.go:1584  [n1] starting grpc/postgres server at [::]:26257
I190511 01:21:14.475605 21 server/server.go:1585  [n1] advertising CockroachDB node at 128.199.127.164:26257
W190511 01:21:14.475655 21 jobs/registry.go:341  [n1] unable to get node liveness: node not in the liveness table
I190511 01:21:14.532949 21 server/server.go:1650  [n1] done ensuring all necessary migrations have run
I190511 01:21:14.533020 21 server/server.go:1653  [n1] serving sql connections
I190511 01:21:14.533209 21 cli/start.go:689  [config] clusterID: a14e89a7-792d-44d3-89af-7037442eacbc
I190511 01:21:14.533257 21 cli/start.go:697  node startup completed:
CockroachDB node starting at 2019-05-11 01:21:14.533097432 +0000 UTC (took 0.8s)
build:               CCL v19.1.0 @ 2019/04/29 18:36:40 (go1.11.6)
webui:               http://128.199.127.164:8080
sql:                 postgresql://root@128.199.127.164:26257?sslmode=disable
client flags:        cockroach <client cmd> --host=128.199.127.164:26257 --insecure
logs:                /home/ueda/cockroach-data/logs
temp dir:            /home/ueda/cockroach-data/cockroach-temp067740997
external I/O path:   /home/ueda/cockroach-data/extern
store[0]:            path=/home/ueda/cockroach-data
status:              restarted pre-existing node
clusterID:           a14e89a7-792d-44d3-89af-7037442eacbc
nodeID:              1
I190511 01:21:14.541205 146 server/server_update.go:67  [n1] no need to upgrade, cluster already at the newest version
I190511 01:21:14.555557 149 sql/event_log.go:135  [n1] Event: "node_restart", target: 1, info: {Descriptor:{NodeID:1 Address:128.199.127.164:26257 Attrs: Locality: ServerVersion:19.1 BuildTag:v19.1.0 StartedAt:1557537674395557548 LocalityAddress:[] XXX_NoUnkeyedLiteral:{} XXX_sizecache:0} ClusterID:a14e89a7-792d-44d3-89af-7037442eacbc StartedAt:1557537674395557548 LastUp:1557537671113461486}
I190511 01:21:14.916458 59 gossip/gossip.go:1510  [n1] node has connected to cluster via gossip
I190511 01:21:14.916660 59 storage/stores.go:263  [n1] wrote 0 node addresses to persistent storage
I190511 01:21:24.480247 116 storage/store.go:4220  [n1,s1] sstables (read amplification = 2):
0 [ 51K 1 ]: 51K
6 [  1M 1 ]: 1M
I190511 01:21:24.480380 116 storage/store.go:4221  [n1,s1] 
** Compaction Stats [default] **
Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
----------------------------------------------------------------------------------------------------------------------------------------------------------
  L0      1/0   50.73 KB   0.5      0.0     0.0      0.0       0.0      0.0       0.0   1.0      0.0      8.0         0         1    0.006       0      0
  L6      1/0    1.26 MB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0         0         0    0.000       0      0
 Sum      2/0    1.31 MB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   1.0      0.0      8.0         0         1    0.006       0      0
 Int      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   1.0      0.0      8.0         0         1    0.006       0      0
Uptime(secs): 10.6 total, 10.6 interval
Flush(GB): cumulative 0.000, interval 0.000
AddFile(GB): cumulative 0.000, interval 0.000
AddFile(Total Files): cumulative 0, interval 0
AddFile(L0 Files): cumulative 0, interval 0
AddFile(Keys): cumulative 0, interval 0
Cumulative compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
Interval compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count
estimated_pending_compaction_bytes: 0 B
I190511 01:21:24.481565 121 server/status/runtime.go:500  [n1] runtime stats: 170 MiB RSS, 114 goroutines, 0 B/0 B/0 B GO alloc/idle/total, 14 MiB/16 MiB CGO alloc/total, 0.0 CGO/sec, 0.0/0.0 %(u/s)time, 0.0 %gc (7x), 50 KiB/1.5 MiB (r/w)net

What could be preventing the node from joining? Thanks for any suggestions!

1 answer:

Answer 0: (score: 1)

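It looks like you previously started the second node (the one running on 128.199.127.164) on its own, at which point it created its own cluster. That is also why its startup banner reports "status: restarted pre-existing node" with a different clusterID.

You can see this in the error message: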

W190511 01:21:14.251763 65 gossip/client.go:122  [n?] failed to start gossip client to 163.172.156.111:26257: initial connection heartbeat failed: rpc error: code = Unknown desc = client cluster ID "a14e89a7-792d-44d3-89af-7037442eacbc" doesn't match server cluster ID "3e797faa-59a1-4b0d-83b5-36143ddbdd69"
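
To confirm the mismatch directly from the log, something like the following sketch could help. It only relies on the wording of the error line quoted above; the function name mismatched_cluster_ids is made up for illustration, and the log path in the usage comment is the one from the question.

```shell
#!/bin/sh
# Hypothetical helper: read CockroachDB log text on stdin and print each
# distinct "client"/"server" cluster ID found in gossip mismatch errors.
# Assumes the message format shown in the log line above.
mismatched_cluster_ids() {
  grep -Eo '(client|server) cluster ID "[^"]*"' |
    sed 's/ cluster ID "\([^"]*\)"/ \1/' |
    sort -u
}

# Usage (path taken from the question; adjust as needed):
#   mismatched_cluster_ids < cockroach-data/logs/cockroach.log
```

If the "client" ID printed here is the joining node's own clusterID and the "server" ID is the first node's, the joining node's store already belongs to a different cluster.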

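To be able to join the cluster, the joining node's data directory must be empty. You can either delete cockroach-data or use --store=/path/to/data-dir to point the node at a different directory.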
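
As a rough sketch of that fix, the snippet below moves the old store out of the way instead of deleting it, so the data from the accidental single-node cluster stays recoverable. The helper name set_aside_store and the .bak suffix are assumptions for illustration; the start command in the comments is the one from the question. Stop the running node before doing this.

```shell
#!/bin/sh
# Hypothetical helper: rename an existing store directory with a timestamp
# suffix so the node starts with an empty data directory next time.
set_aside_store() {
  store_dir=$1
  if [ -d "$store_dir" ]; then
    mv "$store_dir" "$store_dir.bak.$(date +%Y%m%d%H%M%S)"
  fi
}

# Usage on the second node (after stopping it):
#   set_aside_store /home/ueda/cockroach-data
#   cockroach start --insecure --advertise-addr=128.199.127.164 --join=163.172.156.111:26257
#
# Alternative: leave the old directory alone and use a fresh one via --store:
#   cockroach start --insecure --advertise-addr=128.199.127.164 \
#     --join=163.172.156.111:26257 --store=/path/to/data-dir
```

After the restart, the startup banner on the second node should show the first node's clusterID rather than its own.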