Elasticsearch: indexing 100k small documents per second, more indices with 5 shards each seem to perform better than 1 big index

Time: 2016-09-08 09:56:27

Tags: elasticsearch

Summary

Our use case is indexing 100k small documents per second, with low search throughput of about 2-3 queries per second.

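The initial version of this question was based on benchmarks run on versions 2.3.1 / 2.3.2. To reach the required scale there, I ended up using more indices with 5 shards each rather than one index with more shards (we tried 1x, 2x, and 4x number_of_nodes). That is not a problem for us: we already partition the dataset for other purposes, so it is easy to create one index per partition.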

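I ran a quick benchmark on Elasticsearch version 2.4.0. The difference between 1 index with 47 shards and 10 indices with 5 shards each is no longer that significant, but the 10-index version still performs better.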

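Since the difference is not significant, this question is no longer that important. The benchmark data alone may still be useful to someone, so some details follow for those interested.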

Details

The article Tuning data ingestion performance for Elasticsearch on Azure by Masashi Narumoto has some interesting benchmarks, and it is where I got the prime shard counts 19, 47, and 83.

  

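The tests also illustrate another important point. In this scenario, increasing the number of nodes can improve data ingestion throughput, but the results do not necessarily scale linearly. Further tests with 12-node and 15-node clusters could show how much additional benefit scaling out brings. If this number of nodes is insufficient, it may be necessary to return to the scale-up strategy and start using more or larger disks based on premium storage.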

Using Elasticsearch version 2.4.0, the cluster has 12 nodes: 16GB / 8 cores / SSD DigitalOcean droplets.


The Java API (TransportClient) was first used at the older version 2.3.2, and the client was then upgraded to version 2.4.0.

The graph for this comparison shows slightly better performance when using 10 indices. [image]

3 x 8GB DigitalOcean droplets were used as client nodes. Each node simulated 10 transport clients, and each client had 2 concurrent requests.

Sample bulk sizes:

bulk finished 7600, bytes: 5243143
bulk finished 7600, bytes: 5243143
bulk finished 7600, bytes: 5243235
bulk finished 7600, bytes: 5243143
bulk finished 7571, bytes: 5242948
bulk finished 7571, bytes: 5243040
bulk finished 7601, bytes: 5243354
bulk finished 7600, bytes: 5243235
bulk finished 7600, bytes: 5243235
bulk finished 7601, bytes: 5243354
bulk finished 7600, bytes: 5243235
bulk finished 7601, bytes: 5243272
bulk finished 7627, bytes: 5243251
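
As a rough sanity check on these figures, the sketch below works out the average document size in one bulk, the maximum number of bulk requests in flight, and the bulk rate needed to sustain the 100k documents per second target. The inputs are copied from the log lines and the client setup above; the helper names are illustrative only and are not part of the benchmark code.

```python
# Back-of-the-envelope numbers for the bulk indexing load described above.
# Inputs come from the post; the helper names are hypothetical.

def avg_doc_bytes(docs_in_bulk, bulk_bytes):
    """Average serialized size of one document in a bulk request."""
    return bulk_bytes / docs_in_bulk


def in_flight_bulks(client_nodes, clients_per_node, concurrent_requests):
    """Upper bound on bulk requests in flight across all client nodes."""
    return client_nodes * clients_per_node * concurrent_requests


def bulks_per_second(target_docs_per_sec, docs_in_bulk):
    """Bulk requests per second needed to sustain the target rate."""
    return target_docs_per_sec / docs_in_bulk


if __name__ == "__main__":
    print("bytes per document:", round(avg_doc_bytes(7600, 5243143)))
    print("bulk requests in flight:", in_flight_bulks(3, 10, 2))
    print("bulks per second needed:", round(bulks_per_second(100_000, 7600), 1))
```

With these inputs a document averages roughly 690 bytes, up to 60 bulk requests can be in flight, and about 13 bulks per second are needed cluster-wide.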

Between each test:

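  • The clients were stopped
  • All indices were DELETEd
  • The mapping was changed
  • The clients were reconfigured to use 1 index or 10 indices and started back up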
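
The index part of this reset could be scripted roughly as follows. This is only a sketch: it builds the list of REST calls (DELETE for the old indices, PUT with the new settings for the new ones) instead of sending them, and stopping and restarting the clients is not shown. The index name pattern is taken from the sample shard listings further down; the host, the helper names, and the trimmed-down settings body are assumptions.

```python
import json


def index_names(prefix, count):
    """Index names such as blabla-2016.09.12_0 ... _N-1 (pattern assumed)."""
    return [f"{prefix}_{i}" for i in range(count)]


def reset_requests(base_url, old_indices, new_indices, body):
    """Return (method, url, payload) tuples: delete old indices, then create new ones."""
    requests = [("DELETE", f"{base_url}/{name}", None) for name in old_indices]
    payload = json.dumps(body)
    requests += [("PUT", f"{base_url}/{name}", payload) for name in new_indices]
    return requests


if __name__ == "__main__":
    # Switching from the 10-index layout to one index with 47 shards.
    one_big_index = {"settings": {"number_of_shards": 47, "number_of_replicas": 1}}
    old = index_names("blabla-2016.09.12", 10)
    new = index_names("blabla-2016.09.12", 1)
    for method, url, payload in reset_requests("http://localhost:9200", old, new, one_big_index):
        print(method, url, payload or "")
```

Keeping the plan as plain tuples makes it easy to review before handing it to whatever HTTP client is in use.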

Mapping

{
  "settings": {
    "number_of_shards": 47,
    "number_of_replicas": 1,
    "refresh_interval": "30s",
    "index.translog.flush_threshold_size": "2gb",
    "index.codec": "best_compression"
  },
  "mappings": {
    "MessageRecord": {
      "_source": {
        "enabled": true
      },
      "properties": {
        "timestamp": { "type": "date" },
        "host": { "type": "string", "index": "not_analyzed" },
        "IP": { "type": "string", "index": "not_analyzed" },
        "process": { "type": "integer" },
        "username": { "type": "string", "index": "not_analyzed" },
        "email": { "type": "string", "index": "not_analyzed" },
        "sessionId": { "type": "string", "index": "not_analyzed" },
        "APIName": { "type": "string", "index": "not_analyzed" },
        "componentName": { "type": "string", "index": "not_analyzed" },
        "className": { "type": "string", "index": "not_analyzed" },
        "methodName": { "type": "string", "index": "not_analyzed" },
        "message": { "type": "string" }
      }
    }
  }
}
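
To make the mapping concrete, here is a minimal sketch of a document that fits the MessageRecord type, serialized into the newline-delimited body that the _bulk REST endpoint accepts. The benchmark itself used the Java TransportClient, so this is not the code that was run; all field values and helper names are made up for illustration.

```python
import json


def sample_record(n):
    """A made-up document whose fields follow the MessageRecord mapping."""
    return {
        "timestamp": 1473638400000 + n,  # epoch milliseconds
        "host": "host-1",
        "IP": "10.0.0.1",
        "process": 1234,
        "username": "user",
        "email": "user@example.com",
        "sessionId": f"session-{n}",
        "APIName": "ExampleAPI",
        "componentName": "ExampleComponent",
        "className": "ExampleClass",
        "methodName": "exampleMethod",
        "message": "example log message",
    }


def bulk_body(index, doc_type, docs):
    """Serialize docs as alternating action and source lines for _bulk."""
    action = json.dumps({"index": {"_index": index, "_type": doc_type}})
    lines = []
    for doc in docs:
        lines.append(action)
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the body must end with a newline


if __name__ == "__main__":
    body = bulk_body("blabla-2016.09.12_0", "MessageRecord",
                     [sample_record(i) for i in range(3)])
    print(body)
    print(len(body.encode("utf-8")), "bytes")
```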

12 nodes, 1 index, 19 shards, 1 replica

Shards

curl -s localhost:9200/_cat/shards?v | perl -ne '@v=split(/\s+/); $h{$v[7]} .= "$v[1]$v[2],"; END{foreach $k (sort(keys(%h))) {print "$k: $h{$k}\n"}}'
db1: 3p,16r,15p,     
db10: 13r,5p,17p,    
db11: 16p,12r,15r,4p,
db12: 18r,11p,5r,    
db2: 14r,18p,6p,     
db3: 9p,8r,2r,       
db4: 10p,17r,4r,     
db5: 9r,12p,0p,      
db6: 8p,7r,1r,       
db7: 14p,11r,2p,     
db8: 1p,13p,10r,     
db9: 3r,7p,6r,0r,    

Sample shards

blabla-2016.09.12_0 8  p STARTED 546347 390.3mb 10.135.32.20  db6
blabla-2016.09.12_0 8  r STARTED 565703 407.6mb 10.135.35.193 db3
blabla-2016.09.12_0 16 r STARTED 573101 493.9mb 10.135.34.141 db1
blabla-2016.09.12_0 16 p STARTED 536669 370.5mb 10.135.35.41  db11
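
For readers who prefer not to use the perl one-liner, the same per-node grouping can be sketched in Python. It assumes the default `_cat/shards` column order seen in the rows above (index, shard, prirep, state, docs, store, ip, node); the function name is hypothetical.

```python
from collections import defaultdict


def shards_by_node(cat_output):
    """Map node name -> list of '<shard><p|r>' labels from _cat/shards rows."""
    layout = defaultdict(list)
    for line in cat_output.splitlines():
        cols = line.split()
        # Expected columns: index shard prirep state docs store ip node
        if len(cols) < 8 or cols[0] == "index":
            continue  # skip the header row, blank lines, and unassigned shards
        layout[cols[7]].append(cols[1] + cols[2])
    return dict(layout)


if __name__ == "__main__":
    sample = """\
blabla-2016.09.12_0 8  p STARTED 546347 390.3mb 10.135.32.20  db6
blabla-2016.09.12_0 8  r STARTED 565703 407.6mb 10.135.35.193 db3
blabla-2016.09.12_0 16 r STARTED 573101 493.9mb 10.135.34.141 db1
blabla-2016.09.12_0 16 p STARTED 536669 370.5mb 10.135.35.41  db11
"""
    for node, shards in sorted(shards_by_node(sample).items()):
        print(f"{node}: {','.join(shards)}")
```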

12 nodes, 1 index, 47 shards, 1 replica

Shards

db1: 15p,34r,25r,45r,27p,11r,39p,3p,
db2: 42p,30p,20r,31r,6p,37r,18p,7r, 
db3: 42r,33p,32r,21p,10r,23r,9p,45p,
db4: 13r,24r,46p,34p,10p,35r,22p,   
db5: 1r,15r,24p,12p,36p,14r,39r,0p, 
db6: 30r,38r,20p,32p,44p,9r,22r,8p, 
db7: 38p,1p,2p,4r,14p,26p,27r,17r,  
db8: 13p,16r,2r,46r,26r,25p,37p,43r,
db9: 33r,31p,19p,21r,41r,43p,7p,8r, 
db10: 28r,40r,5p,29p,19r,6r,41p,17p,
db11: 28p,40p,16p,5r,29r,4p,44r,18r,
db12: 12r,36r,35p,23p,11p,3r,0r,    

Sample iostat

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00    71.73    4.15   49.15    33.17  3739.81   141.58     0.28    5.24    0.51    5.64   0.38   2.03
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00  1027.20    0.20  563.60     1.60 30052.00   106.61     2.10    3.73    0.00    3.73   0.33  18.44
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00   867.60    0.00  407.60     0.00 51499.20   252.69     4.19   10.27    0.00   10.27   0.37  15.06
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00   424.20    0.00  296.00     0.00 20040.80   135.41     1.74    5.88    0.00    5.88   0.40  11.80
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00   661.20    0.00  466.80     0.00 11922.40    51.08     0.19    0.41    0.00    0.41   0.36  16.66
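
To compare iostat samples like these between runs, a small parser helps. The sketch below assumes the extended `iostat -x` layout shown above, with a "Device:" header line before each sample; the function names are illustrative. Note that the first report iostat prints normally covers the time since boot, so it is skipped when averaging.

```python
def parse_iostat(text, device="vda"):
    """Return one dict per sample (column name -> float) for the given device."""
    samples, header = [], None
    for line in text.splitlines():
        cols = line.split()
        if not cols:
            continue
        if cols[0] == "Device:":
            header = cols[1:]  # column names for the rows that follow
        elif cols[0] == device and header and len(cols) == len(header) + 1:
            samples.append(dict(zip(header, map(float, cols[1:]))))
    return samples


def mean(values):
    """Arithmetic mean, 0.0 for an empty input."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


if __name__ == "__main__":
    sample = """\
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00    71.73    4.15   49.15    33.17  3739.81   141.58     0.28    5.24    0.51    5.64   0.38   2.03
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00  1027.20    0.20  563.60     1.60 30052.00   106.61     2.10    3.73    0.00    3.73   0.33  18.44
"""
    live = parse_iostat(sample)[1:]  # drop the since-boot report
    print("mean wkB/s:", mean(s["wkB/s"] for s in live))
    print("mean %util:", mean(s["%util"] for s in live))
```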

Sample shards

blabla-2016.09.12_0 42    r      STARTED 574413 399.1mb 10.135.35.193 db3
blabla-2016.09.12_0 28    p      STARTED 557903 399.9mb 10.135.35.41  db11
blabla-2016.09.12_0 28    r      STARTED 557903 394.9mb 10.135.36.136 db10
blabla-2016.09.12_0 33    r      STARTED 412963 286.8mb 10.135.32.60  db9

12 nodes, 1 index, 83 shards, 1 replica

Shards

db1: 39p,24r,33r,74r,69p,51p,23r,8p,27p,15p,3p,11r,67r,0r,  
db2: 40r,42p,69r,7r,50r,56r,18p,65p,79r,77p,17r,29r,30p,54p,
db3: 78r,57p,32r,33p,80p,53r,9p,21p,68p,20r,59r,45p,43r,72r,
db4: 82r,81p,58p,22p,35r,34p,70p,21r,75r,64r,44r,54r,46p,   
db5: 24p,13r,36p,37r,25r,60p,12p,68r,1r,46r,62r,72p,48p,0p, 
db6: 32p,42r,19r,31r,58r,9r,56p,71r,79p,77r,20p,52r,44p,67p,
db7: 74p,38p,50p,73r,60r,26p,8r,61r,10r,15r,14p,2p,62p,3r,  
db8: 13p,36r,37p,73p,25p,12r,6p,47r,1p,49p,26r,61p,10p,2r,  
db9: 78p,57r,19p,31p,66p,7p,51r,41r,18r,70r,6r,55p,30r,43p, 
db10: 39r,5p,80r,28r,53p,4r,16r,41p,76p,64p,17p,29p,49r,63r,
db11: 40p,5r,38r,28p,66r,4p,16p,76r,75p,52p,63p,27r,14r,48r,
db12: 82p,81r,22r,35p,34r,65r,71p,59p,47p,23p,55r,45r,11p,  

Sample shards

blabla-2016.09.12_0 82    r      STARTED 100491  88.3mb 10.135.32.114 db4
blabla-2016.09.12_0 82    p      STARTED 100491  91.3mb 10.135.35.169 db12
blabla-2016.09.12_0 39    p      STARTED 178524   145mb 10.135.34.141 db1
blabla-2016.09.12_0 39    r      STARTED 178524 144.8mb 10.135.36.136 db10
blabla-2016.09.12_0 40    p      STARTED 289205   220mb 10.135.35.41  db11
blabla-2016.09.12_0 40    r      STARTED 289205 216.7mb 10.135.35.108 db2

12 nodes, 10 indices, 5 shards each, 1 replica

Shards

db1: 4/4r,7/3p,6/1p,1/3p,0/3p,3/2r,2/0r,9/2r,8/0r,   
db2: 5/4p,4/2p,7/4r,6/2r,1/4r,3/0p,9/0p,             
db3: 5/2r,4/0r,6/1r,0/3r,3/3p,2/1p,9/3p,8/1p,        
db4: 5/0p,7/4p,6/2p,1/4p,0/4p,3/3r,2/1r,9/3r,8/1r,   
db5: 5/3r,4/1r,7/0p,1/0p,0/0p,3/4p,2/2p,9/4p,8/2p,   
db6: 5/1r,4/4p,6/4r,0/4r,3/2p,2/0p,9/2p,8/0p,        
db7: 4/3r,7/2p,6/0p,1/2p,0/2p,3/1r,2/4p,9/1r,8/4p,   
db8: 5/4r,4/2r,7/1p,1/1p,0/1p,3/0r,2/3p,9/0r,8/3p,   
db9: 5/0r,4/3p,7/3r,6/3r,1/3r,0/2r,3/1p,9/1p,        
db10: 5/3p,4/1p,7/2r,6/0r,1/2r,0/1r,2/4r,8/4r,       
db11: 5/1p,7/0r,6/3p,1/0r,0/0r,3/4r,2/2r,9/4r,8/2r,  
db12: 5/2p,4/0p,7/1r,6/4p,1/1r,2/3r,8/3r,            
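
The four shard layouts can be cross-checked with a little arithmetic: the total number of shard copies is indices x shards x (1 + replicas), spread over 12 nodes. The sketch below uses a hypothetical helper name and takes its inputs from the section headings of this post.

```python
def shard_copies(indices, shards_per_index, replicas, nodes):
    """Return (total shard copies, average copies per node)."""
    total = indices * shards_per_index * (1 + replicas)
    return total, total / nodes


if __name__ == "__main__":
    configs = {
        "1 index x 19 shards": (1, 19, 1, 12),
        "1 index x 47 shards": (1, 47, 1, 12),
        "1 index x 83 shards": (1, 83, 1, 12),
        "10 indices x 5 shards": (10, 5, 1, 12),
    }
    for name, args in configs.items():
        total, per_node = shard_copies(*args)
        print(f"{name}: {total} copies, {per_node:.2f} per node")
```

This gives 38, 94, 166, and 100 shard copies, or roughly 3.2, 7.8, 13.8, and 8.3 per node. None of the totals divides evenly by 12, which matches the listings, where node shard counts differ slightly.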

Sample iostat

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00   103.87    7.07   72.41    45.14  5544.77   140.66     0.40    4.98    0.54    5.41   0.34   2.70
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00    62.80    7.40  136.40    30.40 52532.80   731.06     4.47   31.10    0.03   32.79   0.40   5.80
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00    73.20    0.00   30.40     0.00  5904.00   388.42     0.04    1.27    0.00    1.27   0.55   1.68
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00    40.60    0.20   32.00     0.80  6009.60   373.32     0.05    1.44    0.00    1.45   0.55   1.78
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00    44.00    0.00   34.00     0.00  6209.60   365.27     0.04    1.24    0.00    1.24   0.53   1.80
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00    44.00    0.00   31.80     0.00  5554.40   349.33     0.04    1.23    0.00    1.23   0.55   1.74
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00    45.60    0.00   35.40     0.00  6614.40   373.69     0.05    1.49    0.00    1.49   0.62   2.18
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00    55.20    0.00   33.00     0.00  5828.00   353.21     0.04    1.32    0.00    1.32   0.59   1.94
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00    94.60    0.20   96.60     0.80 34596.80   714.83     2.65   27.41    0.00   27.47   0.44   4.30
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00    45.60    0.20   75.60     0.80 25184.80   664.53     1.55   20.39    0.00   20.45   0.45   3.40
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00    45.60    0.00   32.60     0.00  6132.00   376.20     0.04    1.28    0.00    1.28   0.55   1.78
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00    57.00    0.00   41.40     0.00  7424.00   358.65     0.05    1.31    0.00    1.31   0.52   2.16
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00    40.20    0.00   27.20     0.00  4672.80   343.59     0.03    1.25    0.00    1.25   0.60   1.62

Sample shards

blabla-2016.09.12_5 2     p      STARTED 270216 229.5mb 10.135.35.169 db12
blabla-2016.09.12_5 2     r      STARTED 291290 243.5mb 10.135.35.193 db3
blabla-2016.09.12_5 0     p      STARTED 266065 225.6mb 10.135.32.114 db4
blabla-2016.09.12_5 0     r      STARTED 273731 230.7mb 10.135.32.60  db9
blabla-2016.09.12_4 2     r      STARTED 295275 241.5mb 10.135.36.175 db8
blabla-2016.09.12_4 2     p      STARTED 295275 212.8mb 10.135.35.108 db2
blabla-2016.09.12_4 4     r      STARTED 280898   240mb 10.135.34.141 db1
blabla-2016.09.12_4 4     p      STARTED 265626 345.2mb 10.135.32.20  db6

0 answers:

No answers