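We are running a 3-node Aerospike cluster with data on AWS, on version 4.2.0.4 CE. We recently noticed that writes stopped when one node failed. Ideally, writes should have continued. Once we started the failed node again, writes resumed. We are accessing the Aerospike cluster from outside AWS.
We found the INFO log below, which is printed continuously on two of the nodes.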
INFO (hb): (hb.c:4319) found redundant connections to same node, fds 101 31 - choosing at random
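On the other node, this log is not printed, and the asadm statistics show no reads or writes. We also observed that records are unevenly distributed across the nodes.
The network stanza of the configuration file is identical on all 3 servers. Please find it below.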
network {
    service {
        address any
        port 3000
    }
    heartbeat {
        mode mesh
        port 3002 # Heartbeat port for this node.
        # List one or more other nodes, one ip-address & port per line:
        mesh-seed-address-port 13.xxx.xxx.xxx 3002
        mesh-seed-address-port 13.xxx.xxx.xxx 3002
        mesh-seed-address-port 13.xxx.xxx.xxx 3002
        interval 150
        timeout 10
    }
    fabric {
        port 3001
    }
    info {
        port 3003
    }
}
namespace smpa {
    replication-factor 2
    memory-size 12G
    storage-engine memory
    single-bin true
    high-water-memory-pct 80
    stop-writes-pct 90
}
$ asadm -e "show statistics like stop_writes"
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Service Statistics (2019-01-24 12:24:42 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE : node5.domain.com:3000 node6.domain.com:3000 node7.domain.com:3000
cluster_clock_skew_stop_writes_sec: 0 0 0
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~smpa Namespace Statistics (2019-01-24 12:24:42 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE : node5.domain.com:3000 node6.domain.com:3000 node7.domain.com:3000
clock_skew_stop_writes: false false false
stop_writes : false false false
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~test Namespace Statistics (2019-01-24 12:24:42 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE : node5.domain.com:3000 node6.domain.com:3000 node7.domain.com:3000
clock_skew_stop_writes: false false false
stop_writes : false false false
$ asadm -e "show statistics like x_partitions"
Seed: [('127.0.0.1', 3000, None)]
Config_file: /home/web/.aerospike/astools.conf, /etc/aerospike/astools.conf
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~smpa Namespace Statistics (2019-01-24 12:30:01 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE : node5.domain.com:3000 node6.domain.com:3000 node7.domain.com:3000
migrate_rx_partitions_active : 0 0 0
migrate_rx_partitions_initial : 0 2749 0
migrate_rx_partitions_remaining: 0 0 0
migrate_tx_partitions_active : 0 0 0
migrate_tx_partitions_imbalance: 0 0 0
migrate_tx_partitions_initial : 1396 0 1353
migrate_tx_partitions_remaining: 0 0 0
$ asadm -e "show pmap"
Seed: [('127.0.0.1', 3000, None)]
Config_file: /home/web/.aerospike/astools.conf, /etc/aerospike/astools.conf
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Partition Map Analysis (2019-01-24 12:33:39 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Cluster Namespace Node Primary Secondary Dead Unavailable
Key . . Partitions Partitions Partitions Partitions
BEF4A1479187 smpa node6.domain.com:3000 1382 1367 0 0
BEF4A1479187 smpa node7.domain.com:3000 1358 1342 0 0
BEF4A1479187 smpa node5.domain.com:3000 1356 1387 0 0
BEF4A1479187 test node6.domain.com:3000 1382 0 0 0
BEF4A1479187 test node7.domain.com:3000 1358 0 0 0
BEF4A1479187 test node5.domain.com:3000 1356 0 0 0
Number of rows: 6
$ asadm -e "show statistics like objects"
Seed: [('127.0.0.1', 3000, None)]
Config_file: /home/web/.aerospike/astools.conf, /etc/aerospike/astools.conf
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Service Statistics (2019-01-24 12:34:09 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE : node5.domain.com:3000 node6.domain.com:3000 node7.domain.com:3000
objects : 6478039 6485049 9265180
sindex_gc_objects_validated: 0 0 0
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~smpa Namespace Statistics (2019-01-24 12:34:09 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE : node5.domain.com:3000 node6.domain.com:3000 node7.domain.com:3000
evicted_objects : 0 0 0
expired_objects : 0 0 0
master_objects : 2944752 3456686 4712696
non_expirable_objects: 2943325 3455765 4711880
non_replica_objects : 0 0 0
objects : 6478039 6485049 9265180
prole_objects : 3533287 3028363 4552484
$ asadm -e "info"
Seed: [('127.0.0.1', 3000, None)]
Config_file: /home/web/.aerospike/astools.conf, /etc/aerospike/astools.conf
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Network Information (2019-01-25 06:54:14 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Node Node Ip Build Cluster Migrations Cluster Cluster Principal Client Uptime
. Id . . Size . Key Integrity . Conns .
ec2-xx-xxx-xxx-xxx.ap-south-1.compute.amazonaws.com:3000 BB9BE0093E32B0A xx.xxx.xxx.xxx:3000 C-4.2.0.4 3 0.000 3ADA511969DD True BB9EAC87115AD0A 59 01:09:24
ec2-xx-xxx-xxx-xxx.ap-south-1.compute.amazonaws.com:3000 *BB9EAC87115AD0A xx.xxx.xxx.xxx:3000 C-4.2.0.4 3 0.000 3ADA511969DD True BB9EAC87115AD0A 59 01:05:17
ec2-xx-xxx-xxx-xxx.ap-south-1.compute.amazonaws.com:3000 BB9D4175485B10A xx.xxx.xxx.xxx:3000 C-4.2.0.4 3 0.000 3ADA511969DD True BB9EAC87115AD0A 59 01:14:17
Number of rows: 3
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Usage Information (2019-01-25 06:54:14 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Namespace Node Total Expirations,Evictions Stop Disk Disk HWM Avail% Mem Mem HWM Stop
. . Records . Writes Used Used% Disk% . Used Used% Mem% Writes%
smpa ec2-xx-xxx-xxx-xxx.ap-south-1.compute.amazonaws.com:3000 2.716 M (0.000, 0.000) false N/E N/E 50 N/E 2.774 GB 24 80 90
smpa ec2-xx-xxx-xxx-xxx.ap-south-1.compute.amazonaws.com:3000 2.648 M (0.000, 0.000) false N/E N/E 50 N/E 2.706 GB 23 80 90
smpa ec2-xx-xxx-xxx-xxx.ap-south-1.compute.amazonaws.com:3000 2.709 M (0.000, 0.000) false N/E N/E 50 N/E 2.767 GB 24 80 90
smpa 8.074 M (0.000, 0.000) 0.000 B 8.247 GB
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Object Information (2019-01-25 06:54:14 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Namespace Node Total Repl Objects Tombstones Pending Rack
. . Records Factor (Master,Prole,Non-Replica) (Master,Prole,Non-Replica) Migrates ID
. . . . . . (tx,rx) .
smpa ec2-xx-xxx-xxx-xxx.ap-south-1.compute.amazonaws.com:3000 2.716 M 2 (1.375 M, 1.341 M, 0.000) (0.000, 0.000, 0.000) (0.000, 0.000) 0
smpa ec2-xx-xxx-xxx-xxx.ap-south-1.compute.amazonaws.com:3000 2.648 M 2 (1.311 M, 1.337 M, 0.000) (0.000, 0.000, 0.000) (0.000, 0.000) 0
smpa ec2-xx-xxx-xxx-xxx.ap-south-1.compute.amazonaws.com:3000 2.709 M 2 (1.351 M, 1.359 M, 0.000) (0.000, 0.000, 0.000) (0.000, 0.000) 0
smpa 8.074 M (4.037 M, 4.037 M, 0.000) (0.000, 0.000, 0.000) (0.000, 0.000)
$ asadm -e "show statistics like objects"
Seed: [('127.0.0.1', 3000, None)]
Config_file: /home/web/.aerospike/astools.conf, /etc/aerospike/astools.conf
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~smpa d190122 Set Statistics (2019-01-25 07:07:30 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE : ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000 ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000 ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000
objects: 672400 662491 671131
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~smpa d190121 Set Statistics (2019-01-25 07:07:30 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE : ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000 ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000 ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000
objects: 376064 347232 374700
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~smpa d190124 Set Statistics (2019-01-25 07:07:30 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE : ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000 ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000 ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000
objects: 629323 617983 628214
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~smpa d190123 Set Statistics (2019-01-25 07:07:30 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE : ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000 ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000 ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000
objects: 739556 726447 736871
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~smpa d190125 Set Statistics (2019-01-25 07:07:30 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE : ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000 ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000 ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000
objects: 313800 308814 313320
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Service Statistics (2019-01-25 07:07:30 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE : ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000 ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000 ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000
objects : 2731143 2662967 2724236
sindex_gc_objects_validated: 0 0 0
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~smpa Namespace Statistics (2019-01-25 07:07:30 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE : ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000 ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000 ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000
evicted_objects : 0 0 0
expired_objects : 0 0 0
master_objects : 1382413 1318579 1358181
non_expirable_objects: 1382525 1318691 1358445
non_replica_objects : 0 0 0
objects : 2731143 2662967 2724236
prole_objects : 1348730 1344388 1366055
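Answer 0 (score: 3)
Check whether the other two nodes are publishing private IP addresses that the client cannot reach, while only one node (the one that failed) is publishing a reachable IP address. (See the network stanza, service sub-context.)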
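The check suggested above can be sketched in Python. This is only an illustration: it assumes you have already collected, by hand, the address each node publishes to clients. The node names and IPs below are hypothetical placeholders, not values from the question.

```python
import ipaddress


def unreachable_from_outside(published):
    """Return the nodes whose published address is a private IP.

    `published` maps a node name to the address that node advertises
    to clients. A client outside the VPC cannot reach a private address.
    """
    return [
        node
        for node, addr in published.items()
        if ipaddress.ip_address(addr).is_private
    ]


# Hypothetical example: two nodes advertise private IPs, one a public IP.
published = {
    "node5": "10.0.1.15",
    "node6": "10.0.1.16",
    "node7": "13.0.0.7",
}

print(unreachable_from_outside(published))
```

Any node returned by this function would be invisible to a client outside AWS, which would be consistent with all traffic depending on the single reachable node.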
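Answer 1 (score: 3)
The problem was that I had given the NATed IPs for heartbeat communication. Ideally, if your clients are outside the network, you need to give the private IPs for "mesh-seed-address-port" and the NATed IP for "access-address". If needed, please go through the thread linked below.
Here is clear documentation on how to configure this on AWS EC2 instances: https://discuss.aerospike.com/t/aws-ec2-ip-addressing-for-aerospike/2424
Many thanks to kporter, pgupta and ashish-shinde for their valuable help.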
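For illustration, a network stanza following this fix might look roughly like the sketch below. All IP addresses are hypothetical placeholders (10.0.1.x standing in for the private IPs, 13.0.0.5 for this node's NATed public IP), and the access-address value would differ on each node. The linked thread remains the authoritative reference.

```
network {
    service {
        address any
        port 3000
        # Hypothetical public (NATed) IP of THIS node, published to clients.
        access-address 13.0.0.5
    }
    heartbeat {
        mode mesh
        port 3002
        # Hypothetical private IPs of the cluster nodes.
        mesh-seed-address-port 10.0.1.15 3002
        mesh-seed-address-port 10.0.1.16 3002
        mesh-seed-address-port 10.0.1.17 3002
        interval 150
        timeout 10
    }
    fabric {
        port 3001
    }
    info {
        port 3003
    }
}
```

With this layout, heartbeat traffic stays on the private network between the nodes, while clients outside AWS are handed the public address of each node.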