Aerospike: one of three nodes suddenly went down and writes stopped

Time: 2019-01-23 11:28:33

Tags: aerospike aerospike-ce

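We are running a 3-node cluster with in-memory data on AWS, on version 4.2.0.4 CE. We recently noticed that writes were not happening and found that one node had failed. Ideally, writes should still have gone through with the remaining two nodes. As soon as we brought the failed node back up, writes resumed. We are accessing the Aerospike cluster from outside AWS.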

Below is the INFO log line that was printed continuously on two of the nodes.

INFO (hb): (hb.c:4319) found redundant connections to same node, fds 101 31 - choosing at random

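On the other node, no such log line was printed, and the asadm statistics showed no reads or writes. We also observed that records were unevenly distributed across the nodes.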

Below is the network stanza of the configuration file, which is identical on all 3 servers.

network {
    service {
            address any
            port 3000
    }

    heartbeat {

            mode mesh
            port 3002 # Heartbeat port for this node.

            # List one or more other nodes, one ip-address & port per line:
            mesh-seed-address-port 13.xxx.xxx.xxx 3002
            mesh-seed-address-port 13.xxx.xxx.xxx 3002
            mesh-seed-address-port 13.xxx.xxx.xxx 3002

            interval 150
            timeout 10
    }

    fabric {
            port 3001
    }

    info {
            port 3003
    }
}
namespace smpa {
    replication-factor 2
    memory-size 12G
    storage-engine memory
    single-bin true
    high-water-memory-pct 80
    stop-writes-pct 90
}

$ asadm -e "show statistics like stop_writes"

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Service Statistics (2019-01-24 12:24:42 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE                              :   node5.domain.com:3000   node6.domain.com:3000   node7.domain.com:3000   
cluster_clock_skew_stop_writes_sec:   0                               0                               0                               

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~smpa Namespace Statistics (2019-01-24 12:24:42 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE                  :   node5.domain.com:3000   node6.domain.com:3000   node7.domain.com:3000   
clock_skew_stop_writes:   false                           false                           false                           
stop_writes           :   false                           false                           false                           

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~test Namespace Statistics (2019-01-24 12:24:42 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE                  :   node5.domain.com:3000   node6.domain.com:3000   node7.domain.com:3000   
clock_skew_stop_writes:   false                           false                           false                           
stop_writes           :   false                           false                           false   

$ asadm -e "show statistics like x_partitions"

Seed:        [('127.0.0.1', 3000, None)]
Config_file: /home/web/.aerospike/astools.conf, /etc/aerospike/astools.conf
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~smpa Namespace Statistics (2019-01-24 12:30:01 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE                           :   node5.domain.com:3000   node6.domain.com:3000   node7.domain.com:3000   
migrate_rx_partitions_active   :   0                               0                               0                               
migrate_rx_partitions_initial  :   0                               2749                            0                               
migrate_rx_partitions_remaining:   0                               0                               0                               
migrate_tx_partitions_active   :   0                               0                               0                               
migrate_tx_partitions_imbalance:   0                               0                               0                               
migrate_tx_partitions_initial  :   1396                            0                               1353                            
migrate_tx_partitions_remaining:   0                               0                               0

$ asadm -e "show pmap"

Seed:        [('127.0.0.1', 3000, None)]
Config_file: /home/web/.aerospike/astools.conf, /etc/aerospike/astools.conf
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Partition Map Analysis (2019-01-24 12:33:39 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     Cluster   Namespace                            Node      Primary    Secondary         Dead   Unavailable   
         Key           .                               .   Partitions   Partitions   Partitions    Partitions   
BEF4A1479187   smpa        node6.domain.com:3000         1382         1367            0             0   
BEF4A1479187   smpa        node7.domain.com:3000         1358         1342            0             0   
BEF4A1479187   smpa        node5.domain.com:3000         1356         1387            0             0   
BEF4A1479187   test        node6.domain.com:3000         1382            0            0             0   
BEF4A1479187   test        node7.domain.com:3000         1358            0            0             0   
BEF4A1479187   test        node5.domain.com:3000         1356            0            0             0   
Number of rows: 6

$ asadm -e "show statistics like objects"

Seed:        [('127.0.0.1', 3000, None)]
Config_file: /home/web/.aerospike/astools.conf, /etc/aerospike/astools.conf

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Service Statistics (2019-01-24 12:34:09 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE                       :   node5.domain.com:3000   node6.domain.com:3000   node7.domain.com:3000   
objects                    :   6478039                         6485049                         9265180                         
sindex_gc_objects_validated:   0                               0                               0                               

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~smpa Namespace Statistics (2019-01-24 12:34:09 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE                 :   node5.domain.com:3000   node6.domain.com:3000   node7.domain.com:3000   
evicted_objects      :   0                               0                               0                               
expired_objects      :   0                               0                               0                               
master_objects       :   2944752                         3456686                         4712696                         
non_expirable_objects:   2943325                         3455765                         4711880                         
non_replica_objects  :   0                               0                               0                               
objects              :   6478039                         6485049                         9265180                         
prole_objects        :   3533287                         3028363                         4552484                         

$ asadm -e "info"

Seed:        [('127.0.0.1', 3000, None)]
Config_file: /home/web/.aerospike/astools.conf, /etc/aerospike/astools.conf
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Network Information (2019-01-25 06:54:14 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                                    Node               Node                    Ip       Build   Cluster   Migrations        Cluster     Cluster         Principal   Client     Uptime   
                                                       .                 Id                     .           .      Size            .            Key   Integrity                 .    Conns          .   
ec2-xx-xxx-xxx-xxx.ap-south-1.compute.amazonaws.com:3000   BB9BE0093E32B0A    xx.xxx.xxx.xxx:3000   C-4.2.0.4         3      0.000     3ADA511969DD   True        BB9EAC87115AD0A       59   01:09:24   
ec2-xx-xxx-xxx-xxx.ap-south-1.compute.amazonaws.com:3000   *BB9EAC87115AD0A   xx.xxx.xxx.xxx:3000   C-4.2.0.4         3      0.000     3ADA511969DD   True        BB9EAC87115AD0A       59   01:05:17   
ec2-xx-xxx-xxx-xxx.ap-south-1.compute.amazonaws.com:3000   BB9D4175485B10A    xx.xxx.xxx.xxx:3000   C-4.2.0.4         3      0.000     3ADA511969DD   True        BB9EAC87115AD0A       59   01:14:17   
Number of rows: 3

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Usage Information (2019-01-25 06:54:14 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Namespace                                                       Node     Total   Expirations,Evictions     Stop       Disk    Disk     HWM   Avail%        Mem     Mem    HWM      Stop   
        .                                                          .   Records                       .   Writes       Used   Used%   Disk%        .       Used   Used%   Mem%   Writes%   
smpa        ec2-xx-xxx-xxx-xxx.ap-south-1.compute.amazonaws.com:3000   2.716 M   (0.000,  0.000)         false         N/E   N/E     50      N/E      2.774 GB   24      80     90        
smpa        ec2-xx-xxx-xxx-xxx.ap-south-1.compute.amazonaws.com:3000   2.648 M   (0.000,  0.000)         false         N/E   N/E     50      N/E      2.706 GB   23      80     90        
smpa        ec2-xx-xxx-xxx-xxx.ap-south-1.compute.amazonaws.com:3000   2.709 M   (0.000,  0.000)         false         N/E   N/E     50      N/E      2.767 GB   24      80     90        
smpa                                                                   8.074 M   (0.000,  0.000)                  0.000 B                             8.247 GB                            
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Object Information (2019-01-25 06:54:14 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Namespace                                                       Node     Total     Repl                       Objects                   Tombstones             Pending   Rack   
        .                                                          .   Records   Factor    (Master,Prole,Non-Replica)   (Master,Prole,Non-Replica)            Migrates     ID   
        .                                                          .         .        .                             .                            .             (tx,rx)      .   
smpa        ec2-xx-xxx-xxx-xxx.ap-south-1.compute.amazonaws.com:3000   2.716 M   2        (1.375 M, 1.341 M, 0.000)     (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
smpa        ec2-xx-xxx-xxx-xxx.ap-south-1.compute.amazonaws.com:3000   2.648 M   2        (1.311 M, 1.337 M, 0.000)     (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
smpa        ec2-xx-xxx-xxx-xxx.ap-south-1.compute.amazonaws.com:3000   2.709 M   2        (1.351 M, 1.359 M, 0.000)     (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
smpa                                                                   8.074 M            (4.037 M, 4.037 M, 0.000)     (0.000,  0.000,  0.000)      (0.000,  0.000)            

$ asadm -e "show statistics like objects"

Seed:        [('127.0.0.1', 3000, None)]
Config_file: /home/web/.aerospike/astools.conf, /etc/aerospike/astools.conf
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~smpa d190122 Set Statistics (2019-01-25 07:07:30 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE   :   ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000   ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000   ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000   
objects:   672400                                                     662491                                                     671131                                                     

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~smpa d190121 Set Statistics (2019-01-25 07:07:30 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE   :   ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000   ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000   ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000   
objects:   376064                                                     347232                                                     374700                                                     

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~smpa d190124 Set Statistics (2019-01-25 07:07:30 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE   :   ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000   ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000   ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000   
objects:   629323                                                     617983                                                     628214                                                     

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~smpa d190123 Set Statistics (2019-01-25 07:07:30 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE   :   ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000   ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000   ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000   
objects:   739556                                                     726447                                                     736871                                                     

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~smpa d190125 Set Statistics (2019-01-25 07:07:30 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE   :   ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000   ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000   ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000   
objects:   313800                                                     308814                                                     313320                                                     

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Service Statistics (2019-01-25 07:07:30 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE                       :   ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000   ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000   ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000   
objects                    :   2731143                                                    2662967                                                    2724236                                                    
sindex_gc_objects_validated:   0                                                          0                                                          0                                                          

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~smpa Namespace Statistics (2019-01-25 07:07:30 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE                 :   ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000   ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000   ec2-xx.xxx.xxx.xxx.ap-south-1.compute.amazonaws.com:3000   
evicted_objects      :   0                                                          0                                                          0                                                          
expired_objects      :   0                                                          0                                                          0                                                          
master_objects       :   1382413                                                    1318579                                                    1358181                                                    
non_expirable_objects:   1382525                                                    1318691                                                    1358445                                                    
non_replica_objects  :   0                                                          0                                                          0                                                          
objects              :   2731143                                                    2662967                                                    2724236                                                    
prole_objects        :   1348730                                                    1344388                                                    1366055                                                    

2 answers:

Answer 0 (score: 3)

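Check whether the other two nodes are publishing private IP addresses that the client cannot reach, while only one node (the one that failed) is publishing a reachable IP address. (See the network stanza, service sub-context.)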

Answer 1 (score: 3)

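The problem was that I had given the NATed IPs for heartbeat communication. Ideally, "mesh-seed-address-port" should be given the private IPs, and, if your clients are outside the network, "access-address" should be set to the NATed IP. Go through the thread above if needed.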
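As a rough sketch of that fix, the network stanza from the question could be adjusted along the following lines. The 10.xxx.xxx.xxx values are placeholders for the nodes' private IPs (an assumption, they do not appear in the question), and 13.xxx.xxx.xxx stands for each node's own NATed address; the remaining settings are copied unchanged from the original configuration.

```
network {
    service {
            address any
            port 3000
            # NATed IP of THIS node, published to clients outside AWS.
            access-address 13.xxx.xxx.xxx
    }

    heartbeat {

            mode mesh
            port 3002 # Heartbeat port for this node.

            # Private IPs of the cluster nodes (placeholders), not the NATed ones:
            mesh-seed-address-port 10.xxx.xxx.xxx 3002
            mesh-seed-address-port 10.xxx.xxx.xxx 3002
            mesh-seed-address-port 10.xxx.xxx.xxx 3002

            interval 150
            timeout 10
    }

    fabric {
            port 3001
    }

    info {
            port 3003
    }
}
```

With this layout the nodes talk to each other over the private network, while external clients are handed an address they can actually reach. Each node needs its own NATed IP in "access-address", so that one line differs per server.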

Here is clear documentation on how to configure this on AWS EC2 instances: https://discuss.aerospike.com/t/aws-ec2-ip-addressing-for-aerospike/2424

Many thanks to kporter, pgupta and ashish-shinde for their valuable help.