几天前,Membase开始工作很慢。我们在每台服务器上使用本地moxi代理(我们有4台服务器)。 我们的Membase集群有2台服务器(10GBx2),可以复制。现在我们在集群上有大约15M的记录(30M带有副本)。在集群上加载大约2000-6000ops和100%内存驻留。
Membase:1.7.1 系统:Linux CentOS 2.6.18-238.19.1.el5#1 SMP Fri 7月15日07:31:24 EDT 2011 x86_64 x86_64 x86_64 GNU / Linux
以下是来自moxi proxy的时间:
stats proxy timings
STAT 11311:default:connect 100+100=23 37.70% ******************
STAT 11311:default:connect 200+100=30 86.89% ************************
STAT 11311:default:connect 300+100=7 98.36% *****
STAT 11311:default:connect 400+100=1 100.00%
STAT 11311:default:connect 500+100=0 100.00%
STAT 11311:default:reserved 100+100 =6848 1.64% *
STAT 11311:default:reserved 200+100 =41143 11.49% *******
STAT 11311:default:reserved 300+100 =102670 36.08% *******************
STAT 11311:default:reserved 400+100 =127986 66.73% ************************
STAT 11311:default:reserved 500+100 =53582 79.56% **********
STAT 11311:default:reserved 600+100 =31544 87.11% *****
STAT 11311:default:reserved 700+100 =13188 90.27% **
STAT 11311:default:reserved 800+100 =3630 91.14%
STAT 11311:default:reserved 900+100 =1629 91.53%
STAT 11311:default:reserved 1000+100 =1253 91.83%
STAT 11311:default:reserved 1100+100 =1135 92.10%
STAT 11311:default:reserved 1200+100 =967 92.33%
STAT 11311:default:reserved 1300+100 =553 92.46%
STAT 11311:default:reserved 1400+100 =354 92.55%
STAT 11311:default:reserved 1500+100 =265 92.61%
STAT 11311:default:reserved 1600+100 =226 92.67%
STAT 11311:default:reserved 1700+100 =252 92.73%
STAT 11311:default:reserved 1800+100 =197 92.77%
STAT 11311:default:reserved 1900+100 =132 92.81%
STAT 11311:default:reserved 2000+100 =73 92.82%
STAT 11311:default:reserved 2100+200 =242 92.88%
STAT 11311:default:reserved 2300+400 =307 92.96%
STAT 11311:default:reserved 2700+800 =256 93.02%
STAT 11311:default:reserved 3500+1600 =244 93.08%
STAT 11311:default:reserved 5100+3200 =241 93.13%
STAT 11311:default:reserved 8300+6400 =90 93.15%
STAT 11311:default:reserved 14700+12800 =13 93.16%
STAT 11311:default:reserved 27500+25600 =256 93.22%
STAT 11311:default:reserved 53100+51200 =5 93.22%
STAT 11311:default:reserved 104300+102400 =17107 97.32% ***
STAT 11311:default:reserved 206700+204800 =8553 99.36% *
STAT 11311:default:reserved 411500+409600 =2358 99.93%
STAT 11311:default:reserved 821100+819200 =255 99.99%
STAT 11311:default:reserved 1640300+1638400=37 100.00%
STAT 11311:default:reserved 3278700+3276800=3 100.00%
STAT 11311:default:reserved 6555500+6553600=0 100.00%
END
Membase统计数据:
STAT accepting_conns 1
STAT auth_cmds 174777
STAT auth_errors 0
STAT bucket_active_conns 1
STAT bucket_conns 304
STAT bytes_read 6765632401898
STAT bytes_written 5857686469455
STAT cas_badval 827124
STAT cas_hits 116425346
STAT cas_misses 97243268
STAT cmd_flush 0
STAT cmd_get 10962483669
STAT cmd_set 1406835206
STAT connection_structures 684
STAT conn_yields 2180210
STAT curr_connections 313
STAT curr_items 7636008
STAT curr_items_tot 15386636
STAT daemon_connections 10
STAT decr_hits 0
STAT decr_misses 0
STAT delete_hits 311792
STAT delete_misses 9
STAT ep_bg_fetched 0
STAT ep_commit_num 3320450
STAT ep_commit_time 0
STAT ep_commit_time_total 9492892
STAT ep_data_age 84
STAT ep_data_age_highwat 1398
STAT ep_db_cleaner_status complete
STAT ep_dbinit 2
STAT ep_dbname /data/membase/data/default-data/default
STAT ep_dbshards 4
STAT ep_db_strategy multiMTVBDB
STAT ep_diskqueue_drain 1647838030
STAT ep_diskqueue_fill 1576916461
STAT ep_diskqueue_items 9919
STAT ep_diskqueue_memory 872872
STAT ep_diskqueue_pending 13765304
STAT ep_expired 244175343
STAT ep_flush_all false
STAT ep_flush_duration 132
STAT ep_flush_duration_highwat 768
STAT ep_flush_duration_total 9517040
STAT ep_flusher_state running
STAT ep_flusher_todo 1457
STAT ep_flush_preempts 0
STAT ep_io_num_read 3570394
STAT ep_io_num_write 1430459297
STAT ep_io_read_bytes 647400207
STAT ep_io_write_bytes 3645986635220
STAT ep_item_begin_failed 0
STAT ep_item_commit_failed 0
STAT ep_item_flush_expired 195564028
STAT ep_item_flush_failed 0
STAT ep_items_rm_from_checkpoints 689034281
STAT ep_kv_size 5046455536
STAT ep_latency_arith_cmd 33757761
STAT ep_latency_get_cmd 18446744071823123379
STAT ep_latency_store_cmd 1655074486
STAT ep_max_data_size 10485760000
STAT ep_max_txn_size 1000
STAT ep_mem_high_wat 7864320000
STAT ep_mem_low_wat 6291456000
STAT ep_min_data_age 0
STAT ep_num_active_non_resident 0
STAT ep_num_checkpoint_remover_runs 2386690
STAT ep_num_eject_failures 0
STAT ep_num_eject_replicas 0
STAT ep_num_expiry_pager_runs 3316
STAT ep_num_non_resident 0
STAT ep_num_not_my_vbuckets 42220981
STAT ep_num_pager_runs 0
STAT ep_num_value_ejects 0
STAT ep_onlineupdate false
STAT ep_onlineupdate_revert_add 0
STAT ep_onlineupdate_revert_delete 0
STAT ep_onlineupdate_revert_update 0
STAT ep_oom_errors 0
STAT ep_overhead 123317215
STAT ep_pending_ops 0
STAT ep_pending_ops_max 0
STAT ep_pending_ops_max_duration 0
STAT ep_pending_ops_total 0
STAT ep_queue_age_cap 900
STAT ep_queue_size 8521
STAT ep_storage_age 84
STAT ep_storage_age_highwat 29808
STAT ep_storage_type featured
STAT ep_store_max_concurrency 10
STAT ep_store_max_readers 9
STAT ep_store_max_readwrite 1
STAT ep_tap_bg_fetched 0
STAT ep_tap_bg_fetch_requeued 0
STAT ep_tap_keepalive 300
STAT ep_tmp_oom_errors 0
STAT ep_too_old 1553154
STAT ep_too_young 0
STAT ep_total_cache_size 78231164348
STAT ep_total_del_items 196040440
STAT ep_total_enqueued 1576916461
STAT ep_total_new_items 208010695
STAT ep_total_persisted 1626499737
STAT ep_uncommitted_items 998
STAT ep_value_size 3107006691
STAT ep_vb_total 1024
STAT ep_vbucket_del 45
STAT ep_vbucket_del_avg_walltime 3567
STAT ep_vbucket_del_fail 0
STAT ep_vbucket_del_max_walltime 22153
STAT ep_vbucket_del_total_walltime 160519
STAT ep_version 1.7.1_2_g4648391
STAT ep_warmed_up 3569370
STAT ep_warmup_dups 0
STAT ep_warmup_oom 0
STAT ep_warmup_thread complete
STAT ep_warmup_time 1675261980
STAT ep_warmup true
STAT get_hits 7075301263
STAT get_misses 3887182406
STAT incr_hits 33740914
STAT incr_misses 14178
STAT libevent 2.0.11-stable
STAT limit_maxbytes 67108864
STAT listen_disabled_num 0
STAT mem_used 5169772751
STAT pid 4225
STAT pointer_size 64
STAT rejected_conns 0
STAT rusage_system 274452.680850
STAT rusage_user 527816.140750
STAT tap_checkpoint_end_received 10177375
STAT tap_checkpoint_end_sent 10176851
STAT tap_checkpoint_start_received 10177881
STAT tap_checkpoint_start_sent 10177361
STAT tap_connect_received 11827
STAT tap_delete_received 101873556
STAT tap_delete_sent 101820996
STAT tap_mutation_received 1182343906
STAT tap_mutation_sent 1194330191
STAT tap_opaque_received 35863
STAT tap_opaque_sent 34956
STAT threads 4
STAT time 1331297508
STAT total_connections 174956
STAT uptime 11938092
STAT vb_active_curr_items 7636008
STAT vb_active_eject 0
STAT vb_active_ht_memory 50561024
STAT vb_active_itm_memory 2194686368
STAT vb_active_num 512
STAT vb_active_num_non_resident 0
STAT vb_active_ops_create 104054142
STAT vb_active_ops_delete 98202339
STAT vb_active_ops_reject 0
STAT vb_active_ops_update 625882155
STAT vb_active_perc_mem_resident 100
STAT vb_active_queue_age 915152000
STAT vb_active_queue_drain 839960787
STAT vb_active_queue_fill 839965441
STAT vb_active_queue_memory 409552
STAT vb_active_queue_pending 5710163
STAT vb_active_queue_size 4654
STAT vb_dead_num 0
STAT vb_pending_curr_items 0
STAT vb_pending_eject 0
STAT vb_pending_ht_memory 0
STAT vb_pending_itm_memory 0
STAT vb_pending_num 0
STAT vb_pending_num_non_resident 0
STAT vb_pending_ops_create 0
STAT vb_pending_ops_delete 0
STAT vb_pending_ops_reject 0
STAT vb_pending_ops_update 0
STAT vb_pending_perc_mem_resident 0
STAT vb_pending_queue_age 0
STAT vb_pending_queue_drain 0
STAT vb_pending_queue_fill 0
STAT vb_pending_queue_memory 0
STAT vb_pending_queue_pending 0
STAT vb_pending_queue_size 0
STAT vb_replica_curr_items 7750628
STAT vb_replica_eject 0
STAT vb_replica_ht_memory 50561024
STAT vb_replica_itm_memory 2212229588
STAT vb_replica_num 512
STAT vb_replica_num_non_resident 0
STAT vb_replica_ops_create 103956553
STAT vb_replica_ops_delete 97838101
STAT vb_replica_ops_reject 0
STAT vb_replica_ops_update 596566447
STAT vb_replica_perc_mem_resident 100
STAT vb_replica_queue_age 1028068000
STAT vb_replica_queue_drain 807877243
STAT vb_replica_queue_fill 736951020
STAT vb_replica_queue_memory 463320
STAT vb_replica_queue_pending 8055141
STAT vb_replica_queue_size 5265
STAT version 1.4.4_461_gf99c147
Membase统计数据(Couchbase因为我们已升级到1.8.0)
arith_cmd (974447 total)
1us - 2us : ( 0.00%) 1
2us - 4us : ( 0.01%) 53
4us - 8us : ( 0.03%) 201
8us - 16us : ( 1.03%) 9773 #
16us - 32us : ( 92.06%) 887081 #####################################################################################################################################################################################
32us - 64us : ( 97.53%) 53314 ##########
64us - 128us : ( 99.82%) 22276 ####
128us - 256us : (100.00%) 1723
256us - 512us : (100.00%) 11
512us - 1ms : (100.00%) 9
1ms - 2ms : (100.00%) 5
data_age (20636834 total)
0 - 1s : ( 0.87%) 179419 #
1s - 2s : ( 6.83%) 1229526 ###########
2s - 4s : ( 16.33%) 1962033 ##################
4s - 7s : ( 26.81%) 2161109 ####################
7s - 10s : ( 33.92%) 1468459 ##############
10s - 16s : ( 43.93%) 2064852 ###################
16s - 23s : ( 52.64%) 1798684 #################
23s - 34s : ( 63.39%) 2218079 #####################
34s - 49s : ( 73.93%) 2174966 ####################
49s - 1m : ( 84.78%) 2237841 #####################
1m - 1m : ( 93.53%) 1807563 #################
1m - 2m : ( 97.64%) 848159 ########
2m - 3m : ( 98.96%) 271698 ##
3m - 4m : ( 99.67%) 145530 #
4m - 6m : ( 99.91%) 50132
6m - 9m : ( 99.95%) 8945
9m - 12m : (100.00%) 8831
12m - 17m : (100.00%) 1008
disk_commit (29439 total)
0 - 1s : ( 27.51%) 8098 #######################################################
1s - 2s : ( 28.23%) 214 #
2s - 4s : ( 43.74%) 4565 ###############################
4s - 7s : ( 62.51%) 5524 #####################################
7s - 10s : ( 83.00%) 6032 #########################################
10s - 16s : ( 96.00%) 3827 ##########################
16s - 23s : ( 99.78%) 1115 #######
23s - 34s : (100.00%) 63
34s - 49s : (100.00%) 1
disk_del (3864702 total)
1us - 2us : ( 0.00%) 2
2us - 4us : ( 23.32%) 901064 ##############################################
4us - 8us : ( 68.06%) 1729127 ########################################################################################
8us - 16us : ( 95.18%) 1048267 #####################################################
16us - 32us : ( 98.97%) 146376 #######
32us - 64us : ( 99.34%) 14471
64us - 128us : ( 99.98%) 24694 #
128us - 256us : (100.00%) 615
256us - 512us : (100.00%) 34
512us - 1ms : (100.00%) 27
1ms - 2ms : (100.00%) 24
2ms - 4ms : (100.00%) 1
disk_insert (3720767 total)
2us - 4us : ( 16.68%) 620801 #################################
4us - 8us : ( 46.24%) 1099803 ##########################################################
8us - 16us : ( 94.96%) 1812778 ################################################################################################
16us - 32us : ( 98.76%) 141193 #######
32us - 64us : ( 99.19%) 15989
64us - 128us : ( 99.98%) 29502 #
128us - 256us : (100.00%) 619
256us - 512us : (100.00%) 34
512us - 1ms : (100.00%) 31
1ms - 2ms : (100.00%) 16
2ms - 4ms : (100.00%) 1
disk_invalid_item_del (1 total)
1us - 2us : (100.00%) 1 ############################################################################################################################################################################################################
disk_invalid_vbtable_del (294 total)
512us - 1ms : ( 81.29%) 239 ####################################################################################################################################################################
1ms - 2ms : (100.00%) 55 #####################################
disk_update (16891367 total)
2us - 4us : ( 0.00%) 617
4us - 8us : ( 35.44%) 5985479 ######################################################################
8us - 16us : ( 88.98%) 9043092 ##########################################################################################################
16us - 32us : ( 95.49%) 1099896 ############
32us - 64us : ( 97.40%) 323792 ###
64us - 128us : ( 99.49%) 351597 ####
128us - 256us : ( 99.96%) 79771
256us - 512us : (100.00%) 6809
512us - 1ms : (100.00%) 205
1ms - 2ms : (100.00%) 108
2ms - 4ms : (100.00%) 1
get_cmd (271990315 total)
0 - 1us : ( 0.00%) 3098
1us - 2us : ( 0.42%) 1134954
2us - 4us : ( 26.85%) 71886528 ###################################################
4us - 8us : ( 96.55%) 189592751 ########################################################################################################################################
8us - 16us : ( 99.21%) 7231257 #####
16us - 32us : ( 99.54%) 885000
32us - 64us : ( 99.56%) 67815
64us - 128us : (100.00%) 1177254
128us - 256us : (100.00%) 8664
256us - 512us : (100.00%) 1087
512us - 1ms : (100.00%) 1486
1ms - 2ms : (100.00%) 402
2ms - 4ms : (100.00%) 15
4ms - 8ms : (100.00%) 2
8ms - 16ms : (100.00%) 2
set_vb_cmd (7680 total)
0 - 1us : ( 0.08%) 6
1us - 2us : ( 11.60%) 885 #######################
2us - 4us : ( 90.59%) 6066 ##############################################################################################################################################################
4us - 8us : ( 98.80%) 631 ################
8us - 16us : ( 99.67%) 67 #
16us - 32us : ( 99.82%) 11
64us - 128us : ( 99.96%) 11
128us - 256us : ( 99.99%) 2
512us - 1ms : (100.00%) 1
storage_age (20636834 total)
0 - 1s : ( 0.39%) 79479
1s - 2s : ( 1.73%) 278232 ##
2s - 4s : ( 2.91%) 242634 ##
4s - 7s : ( 4.18%) 263057 ##
7s - 10s : ( 5.25%) 219060 ##
10s - 16s : ( 7.25%) 412722 ###
16s - 23s : ( 9.97%) 562678 #####
23s - 34s : ( 15.01%) 1039406 #########
34s - 49s : ( 22.71%) 1589602 ###############
49s - 1m : ( 32.58%) 2036916 ###################
1m - 1m : ( 43.13%) 2176751 ####################
1m - 2m : ( 51.79%) 1786380 #################
2m - 3m : ( 59.80%) 1653388 ###############
3m - 4m : ( 69.64%) 2031245 ###################
4m - 6m : ( 81.27%) 2399569 #######################
6m - 9m : ( 94.47%) 2724318 ##########################
9m - 12m : ( 99.52%) 1042119 #########
12m - 17m : ( 99.73%) 44332
17m - 24m : ( 99.90%) 34353
24m - 34m : (100.00%) 20552
34m - 48m : (100.00%) 41
store_cmd (32950985 total)
1us - 2us : ( 0.00%) 296
2us - 4us : ( 0.25%) 83421
4us - 8us : ( 2.63%) 782380 ####
8us - 16us : ( 91.01%) 29122181 ##############################################################################################################################################################################
16us - 32us : ( 98.48%) 2461695 ##############
32us - 64us : ( 98.77%) 95381
64us - 128us : ( 99.96%) 394021 ##
128us - 256us : (100.00%) 10725
256us - 512us : (100.00%) 538
512us - 1ms : (100.00%) 260
1ms - 2ms : (100.00%) 77
2ms - 4ms : (100.00%) 9
8ms - 16ms : (100.00%) 1
tap_mutation (15722391 total)
1us - 2us : ( 0.00%) 4
2us - 4us : ( 0.18%) 28274
4us - 8us : ( 34.16%) 5342711 ###################################################################
8us - 16us : ( 92.32%) 9144157 ###################################################################################################################
16us - 32us : ( 98.76%) 1012068 ############
32us - 64us : ( 99.04%) 44704
64us - 128us : ( 99.98%) 147357 #
128us - 256us : (100.00%) 2927
256us - 512us : (100.00%) 93
512us - 1ms : (100.00%) 73
1ms - 2ms : (100.00%) 20
2ms - 4ms : (100.00%) 3
答案 0 :(得分:0)
发现问题。
问题出现在Coushbase / Membase集群列表中。在我们的集群上,我们有两个本地(1Tbit)和全局(100Mbit)的网络接口。所有后端服务器都有本地Moxi代理,用于连接Couchbase。在Moxi群集配置Couchbase服务器具有本地IP,但是当moxi连接到Couchbase时,获取服务器列表Couchbase会返回全局IP,并且由于全局网络接口负载很重,我们会遇到响应时间问题。
因此修复的变体很少:
我认为Couchbase需要通过接口分离连接,甚至更好地使这种行为可配置。此行为也应添加到文档中,因为它不明显。