Cassandra - new node bootstrapped - not compacting

Time: 2017-04-14 19:40:30

Tags: cassandra cassandra-2.0

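I have a 21-node cluster (C* 2.2) of m4.2xlarges, each node with 5 x 1TB SSDs.

When it was 50% full (500GB * 5 = 2.5 TB per node), I realized I needed more space, so I added a new node.

This new node joined the cluster (it went from UJ to UN), but its disk usage is 4.2TB.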
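To put the 4.2TB figure in context, here is a rough back-of-the-envelope estimate of what the new node might be expected to hold. It assumes data and token ownership are spread evenly across all nodes, which the post does not state, so treat the result as a ballpark only.

```python
# Assumptions (not from the original post): data is evenly distributed and
# token ownership rebalances evenly once the new node has joined.
NODES_BEFORE = 21
DATA_PER_NODE_TB = 2.5        # 500 GB * 5 disks, as stated in the question
OBSERVED_NEW_NODE_TB = 4.2    # disk usage reported on the new node

total_tb = NODES_BEFORE * DATA_PER_NODE_TB
expected_new_node_tb = total_tb / (NODES_BEFORE + 1)
overshoot = OBSERVED_NEW_NODE_TB / expected_new_node_tb

print(f"total cluster data:   {total_tb:.1f} TB")
print(f"expected on new node: {expected_new_node_tb:.2f} TB")
print(f"observed / expected:  {overshoot:.2f}x")
```

Under these assumptions the new node holds noticeably more than an even share, which is consistent with streamed SSTables that have not yet been merged.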

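I assumed this was because compaction was lagging, so I waited a few days. Even though compactions were running, disk usage did not change. The new box was actually CPU bound, so I moved it to a compute-optimized c4.8xlarge, raised concurrent_compactors to 20, and disabled the compaction_throughput_mb_per_sec limit to get this done.

Meanwhile I stopped all writes to the cluster. Pending compactions keep going up and up, and the data on disk is not going down.

What am I doing wrong? System time looks very high. I am using org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy, and the current compaction thresholds are min = 4, max = 32.

When I run strace -f -c -p cassandra-pid > strace_count: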

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 49.57 7431.363672      140392     52933     17755 futex
 30.22 4530.012667      482481      9389           epoll_wait
 11.33 1697.685882     2143543       792           recvfrom
  3.68  551.306817        1596    345500         7 write
  3.58  537.257283    14138350        38        33 restart_syscall
  0.78  117.381206      111262      1055           poll
  0.28   41.738677         636     65675           lseek
  0.14   21.138626        1659     12741           pread
  0.10   15.189009        1838      8265           read
  0.07    9.898101         696     14229           sched_yield
  0.06    8.984107       23831       377           sendto
  0.04    6.148230        9759       630           munmap
  0.04    5.760339       21902       263           mprotect
  0.02    3.154839         992      3181       359 fadvise64
  0.02    3.107529         652      4769       215 stat
  0.01    2.006363      167197        12           msync
  0.01    1.956998        7040       278           mmap
  0.01    1.838682        1155      1592         8 unlink
  0.01    1.080512         602      1794           lstat
  0.01    0.861741         578      1490           close
  0.00    0.626903         562      1116           open
  0.00    0.596450         588      1014           fcntl
  0.00    0.440250         644       684           fstat
  0.00    0.318874         630       506           epoll_ctl
  0.00    0.249772        4625        54           fdatasync
  0.00    0.149440        1660        90           fsync
  0.00    0.093154         647       144           rename
  0.00    0.069017         575       120           statfs
  0.00    0.018136         356        51           getpriority
  0.00    0.014358         598        24           rt_sigprocmask
  0.00    0.011584         161        72           times
  0.00    0.009858         616        16           setsockopt
  0.00    0.009396         940        10           link
  0.00    0.008072          24       336         7 rt_sigreturn
  0.00    0.004960        1240         4           getsockopt
  0.00    0.004926         411        12           sched_getaffinity
  0.00    0.004503         500         9           dup2
  0.00    0.002998         500         6           madvise
  0.00    0.002693         449         6           set_robust_list
  0.00    0.002597         433         6           accept
  0.00    0.002000         333         6           clone
  0.00    0.002000         500         4         2 accept4
  0.00    0.001243         207         6           gettid
  0.00    0.001000         500         2           writev
  0.00    0.001000         500         2           recvmsg
  0.00    0.001000         143         7           getsockname
  0.00    0.001000         500         2           getpeername
  0.00    0.001000         167         6         6 setpriority
  0.00    0.000000           0         1           socket
  0.00    0.000000           0         1           bind
------ ----------- ----------- --------- --------- ----------------
100.00 14990.519464                529320     18392 total
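A summary like this is easier to read when grouped. The sketch below parses `strace -c` style rows and sums the reported share per group. The split into "wait" and "file I/O" syscalls is my own rough classification, not something from the post, and only a few rows from the output above are embedded as sample data.

```python
# A few rows copied from the strace summary above.
SAMPLE = """\
 49.57 7431.363672      140392     52933     17755 futex
 30.22 4530.012667      482481      9389           epoll_wait
 11.33 1697.685882     2143543       792           recvfrom
  3.68  551.306817        1596    345500         7 write
  0.14   21.138626        1659     12741           pread
  0.10   15.189009        1838      8265           read
"""

# Assumption: this grouping is illustrative only.
WAIT_CALLS = {"futex", "epoll_wait", "recvfrom", "poll", "restart_syscall"}
FILE_IO_CALLS = {"write", "pread", "read", "lseek", "fsync", "fdatasync"}


def parse_strace_summary(text):
    """Return one dict per syscall row, skipping header, rules and total."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have 5 fields (no errors column) or 6 (with errors).
        if len(parts) not in (5, 6) or parts[-1] == "total":
            continue
        try:
            pct, seconds, calls = float(parts[0]), float(parts[1]), int(parts[3])
        except ValueError:
            continue  # separator line made of dashes
        rows.append({"syscall": parts[-1], "pct": pct,
                     "seconds": seconds, "calls": calls})
    return rows


def share(rows, names):
    """Sum the reported percentage for the given syscall names."""
    return sum(r["pct"] for r in rows if r["syscall"] in names)


rows = parse_strace_summary(SAMPLE)
print(f"wait-style syscalls: {share(rows, WAIT_CALLS):.2f}%")
print(f"file I/O syscalls:   {share(rows, FILE_IO_CALLS):.2f}%")
```

On the sample rows, most of the reported time falls in the wait-style group and only a few percent in file reads and writes.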

When I run top and press 1 (per-CPU view):

Tasks: 1506 total,   8 running, 1496 sleeping,   0 stopped,   2 zombie
Cpu0  :  0.3%us, 47.3%sy, 10.5%ni, 41.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.7%us, 87.6%sy, 11.7%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  3.2%us, 65.0%sy,  0.0%ni, 31.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  : 11.6%us, 39.9%sy,  0.0%ni, 48.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  1.0%us, 55.3%sy,  9.2%ni, 34.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  0.3%us, 98.0%sy,  1.7%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  0.4%us, 90.7%sy,  1.4%ni,  6.8%id,  0.7%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  3.4%us, 20.2%sy,  9.4%ni, 64.0%id,  3.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu8  :  1.7%us, 24.9%sy,  0.3%ni, 73.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu9  :  0.7%us, 79.4%sy,  0.7%ni, 18.9%id,  0.3%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu10 :  0.7%us, 64.9%sy, 13.6%ni, 14.0%id,  6.8%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu11 :  1.0%us, 50.7%sy,  0.0%ni, 18.6%id, 29.7%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu12 :  0.3%us, 58.9%sy,  0.0%ni, 40.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu13 :  0.3%us, 72.5%sy, 26.8%ni,  0.0%id,  0.3%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu14 :  0.0%us, 50.2%sy, 49.8%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu15 :  0.0%us,100.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu16 :  0.3%us, 54.2%sy,  0.0%ni, 40.5%id,  5.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu17 :  0.7%us, 46.3%sy, 19.9%ni, 24.0%id,  9.1%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu18 :  0.7%us, 68.9%sy,  0.0%ni, 30.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu19 :  5.7%us,  3.4%sy,  0.0%ni, 90.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu20 :  0.7%us, 44.4%sy,  0.0%ni, 54.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu21 :  1.3%us, 67.8%sy,  0.0%ni, 30.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu22 :  0.7%us, 45.5%sy,  7.3%ni, 42.9%id,  3.6%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu23 :  1.3%us, 22.7%sy,  0.0%ni, 75.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu24 :  0.0%us, 65.4%sy,  0.0%ni, 34.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu25 :  0.0%us, 62.0%sy, 12.2%ni, 25.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu26 :  1.3%us, 68.9%sy, 12.6%ni, 17.2%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu27 :  0.0%us, 64.3%sy, 12.9%ni, 22.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu28 :  0.0%us, 75.8%sy,  0.0%ni, 23.5%id,  0.7%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu29 :  0.0%us, 60.3%sy,  1.7%ni, 37.4%id,  0.7%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu30 :  0.3%us, 48.3%sy, 12.7%ni, 38.0%id,  0.7%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu31 :  0.0%us,100.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu32 :  0.0%us, 72.1%sy, 25.2%ni,  0.0%id,  2.7%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu33 :  0.0%us,100.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu34 :  0.3%us, 66.7%sy,  0.0%ni, 33.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu35 :  0.0%us, 67.7%sy,  0.0%ni, 32.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  61820728k total, 61610932k used,   209796k free,      456k buffers
Swap:        0k total,        0k used,        0k free, 35425968k cached
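To quantify "system time looks very high", the per-CPU lines can be averaged. This is a minimal sketch that assumes the older top line format shown above (`CpuN : x%us, y%sy, ...`); only four of the lines are embedded as sample data, so the printed average describes the sample, not the whole box.

```python
import re

# Four lines copied from the top output above.
SAMPLE_TOP = """\
Cpu0  :  0.3%us, 47.3%sy, 10.5%ni, 41.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.7%us, 87.6%sy, 11.7%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu15 :  0.0%us,100.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu19 :  5.7%us,  3.4%sy,  0.0%ni, 90.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
"""


def parse_top_cpu_lines(text):
    """Map CPU index -> {'us': ..., 'sy': ..., ...} for each CpuN line."""
    cpus = {}
    for line in text.splitlines():
        m = re.match(r"\s*Cpu(\d+)\s*:(.*)", line)
        if not m:
            continue
        fields = {key: float(val)
                  for val, key in re.findall(r"([\d.]+)%(\w+)", m.group(2))}
        cpus[int(m.group(1))] = fields
    return cpus


cpus = parse_top_cpu_lines(SAMPLE_TOP)
mean_sy = sum(c["sy"] for c in cpus.values()) / len(cpus)
mean_us = sum(c["us"] for c in cpus.values()) / len(cpus)
print(f"mean %sy: {mean_sy:.1f}  mean %us: {mean_us:.1f}")
```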

nodetool compactionstats

      pending tasks: 281
  id   compaction type     keyspace               table     completed         total    unit   progress
  id        Compaction   keyspace_1         table_____4    1591902797    2851758523   bytes     55.82%
  id        Compaction   keyspace_1         table_____1     193582898     567222689   bytes     34.13%
  id        Compaction   keyspace_1         table_____2     187022078    2264168754   bytes      8.26%
  id        Compaction   keyspace_1         table_____1   22841754587   24781014960   bytes     92.17%
  id        Compaction   keyspace_1         table_____5     764633368    3904191508   bytes     19.58%
  id        Compaction   keyspace_1         table_____1    1856076066    2326634436   bytes     79.78%
  id        Compaction   keyspace_1         table_____7     254856804     499133271   bytes     51.06%
  id        Compaction   keyspace_1         table_____8    1406859449    1803885628   bytes     77.99%
  id        Compaction   keyspace_1         table_____7    1734201253    2308801656   bytes     75.11%
  id        Compaction   keyspace_1         table_____1     656195289     931867447   bytes     70.42%
  id        Compaction   keyspace_1         table_____1     657036608    1380870812   bytes     47.58%
  id        Compaction   keyspace_1         table_____1     235054945   18957522878   bytes      1.24%
  id        Compaction   keyspace_1         table____10       2351049       3552009   bytes     66.19%
  id        Compaction   keyspace_1         table_____2     810635522     867307196   bytes     93.47%
  id        Compaction   keyspace_1         table_____5     281573682     780375396   bytes     36.08%
  id        Compaction   keyspace_1         table_____6    2350396501    2398745060   bytes     97.98%
  id        Compaction   keyspace_1         table_____1      63122362     434443651   bytes     14.53%
  id        Compaction   keyspace_1         table_____3     287859748     399896319   bytes     71.98%
  id        Compaction   keyspace_1         table_____2    1776310557    2685522257   bytes     66.14%
  id        Compaction   keyspace_1         table_____1     494183426   22432529013   bytes      2.20%
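The rows above can be summed to see how much work the active compactions still have left. This sketch assumes the eight-column layout shown above (id, type, keyspace, table, completed, total, unit, progress) and embeds three of the rows as sample data. It says nothing about the 281 pending tasks, which are not itemized in this output.

```python
# Three rows copied from the compactionstats output above.
SAMPLE_STATS = """\
  id        Compaction   keyspace_1         table_____4    1591902797    2851758523   bytes     55.82%
  id        Compaction   keyspace_1         table_____1   22841754587   24781014960   bytes     92.17%
  id        Compaction   keyspace_1         table_____1     235054945   18957522878   bytes      1.24%
"""


def parse_compactionstats(text):
    """Return (table, completed_bytes, total_bytes) for each active compaction."""
    tasks = []
    for line in text.splitlines():
        parts = line.split()
        # Assumed layout: id, type, keyspace, table, completed, total, unit, progress.
        if len(parts) != 8 or parts[6] != "bytes":
            continue
        try:
            completed, total = int(parts[4]), int(parts[5])
        except ValueError:
            continue  # header row
        tasks.append((parts[3], completed, total))
    return tasks


tasks = parse_compactionstats(SAMPLE_STATS)
remaining = sum(total - completed for _, completed, total in tasks)
print(f"{len(tasks)} active compactions, {remaining / 1e9:.1f} GB still to process")
```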

nodetool compactionhistory: there are many rows, but here is a sample:

id  datatype index        1492056758751             558756         540336         {1:175, 2:6}
id  datatype index     1492075503279             128269         114446         {1:1160, 2:31}
id  datatype index     1492072165446             22914902       22464994       {1:626, 2:37}
id  datatype index   1492060375419             73514456       72842367       {1:398795, 2:7294, 3:300}
id  datatype index    1492075160893             85707          64387          {1:236, 2:41}
id  datatype index      1492151303774             139172156      134666782      {1:9129, 2:3313, 3:935, 4:112}
id  datatype index    1492135037619             30839157       29690968       {1:32854, 2:5702, 3:535, 4:61}
id  datatype index   1492075521048             255030         253531         {1:220, 2:6}
id  datatype index        1492116936213             11391100       10943344       {1:6798, 2:301}
id  datatype index    1492075649703             1527580        1486442        {1:5381, 2:330}
id  datatype index          1492153054713             218401839      216306589      {1:6669, 2:1068, 3:273, 4:22}
id  datatype index   1492169550324             9172160        8724129        {1:42943, 2:2390}
id  datatype index   1492087845445             8086487        7810261        {1:8445, 2:1209, 3:95}
id  datatype index    1492116806390             837169         806946         {1:5984, 2:262}
id  datatype index   1492167939189             275277987      271618327      {1:38585, 2:18745, 3:494}
id  datatype index             277471932      266321389      {1:47184, 2:16047, 3:367, 4:468}
id  datatype index        1492116559239             1569590        1402724        {1:460, 2:62}
id  datatype index 1492173763782             83298080       81977056       {1:36383, 2:7577, 3:3565, 4:95, 6:169}
id  datatype index      1492158247355             42660621       40224352       {1:6565, 2:987, 3:316, 4:521, 6:17, 8:70}
id  datatype index      1492179061558             589874248      568901949      {1:16726, 2:9342, 3:1149, 4:141}
id  datatype index        1492190014331             807975203      786973389      {1:67311, 2:1852}
id  datatype index      1491949569125             45499223       46212100       {1:3944, 2:523, 3:1268, 4:262}
id  datatype index   1492063798113             2401           1134           {1:1, 2:3}
id  datatype index   1492100603829             7693737        7507021        {1:7112, 2:870, 3:235, 4:27}
id  datatype index 1492202653921             114122963      111721885      {1:2038, 2:2997, 3:1095, 4:48, 5:40}
id  datatype index      1492063653695             60700          50728          {1:157, 2:12}
id  datatype index 1492152115922             165656033      159591156      {1:5180, 2:3233, 3:600, 4:564, 5:37, 6:14, 7:12}
id  datatype index   1492160511587             3353867375     3280857307     {1:12265239, 2:409303, 3:16391, 4:1932}
id  datatype index        1492116638632             3226315        2863672        {1:956, 2:137}
id  datatype index 1492050334458             64407          56620          {1:447, 2:31}
id  datatype index      1492150640640             587181         424081         {1:1293, 2:218, 3:1}
id  datatype index   1492116731210             429668507      407404356      {1:2208562, 2:131875, 3:338}
id  datatype index 1492134210449             293003702      275992426      {1:7429, 2:1686, 3:165}
id  datatype index    1492171984560             8467649        8318775        {1:13330, 2:892, 3:11}
id  datatype index    1492150632348             424314         368270         {1:356, 2:72, 3:8}
id  datatype index          1492068676918             677842865      653983357      {1:11042, 2:405}
id  datatype index 1492160695008             11985228       11689655       {1:3684, 2:1390, 3:441, 4:87}
id  datatype index             5906438        5731218        {1:7040, 2:445, 3:27}
id  datatype index        1492132529903             234019313      220261439      {1:80014, 2:5316}
id  datatype index             1646302        1634070        {1:575, 2:17, 3:5}
id  datatype index   1492145903652             1544764        1527844        {1:1807, 2:295, 3:65, 4:3, 5:6}
id  datatype index   1492075180569             1034277        986605         {1:6591, 2:235}
id  datatype index   1491928723944             5823014        5811907        {1:6498}
id  datatype index    1492075323943             573147         526857         {1:4395, 2:250}

2 Answers:

Answer 0: (score: 1)

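Your new node should eventually close the compaction gap...

CPU is not the only constraint on compaction. Check the compaction_throughput_mb_per_sec parameter and see the following article: https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsConfigureCompaction.html

Please check your nodetool compactionstats and see whether the number of pending tasks decreases over time. Also, please attach the output of nodetool cfstats here.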
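One way to watch the pending task count over time is to sample `nodetool compactionstats` periodically and pull out the `pending tasks:` line. The helper names and the polling interval below are hypothetical. The sketch assumes `nodetool` is on the PATH of the node being checked and that the output contains a `pending tasks: N` line as in the question.

```python
import re
import subprocess
import time


def parse_pending_tasks(output):
    """Extract N from a 'pending tasks: N' line, or None if it is absent."""
    m = re.search(r"pending tasks:\s*(\d+)", output)
    return int(m.group(1)) if m else None


def sample_pending_tasks(samples=5, interval_s=60):
    """Run nodetool a few times and return the observed pending counts.

    Assumption: nodetool is installed and can reach the local node.
    """
    counts = []
    for _ in range(samples):
        result = subprocess.run(["nodetool", "compactionstats"],
                                capture_output=True, text=True, check=True)
        counts.append(parse_pending_tasks(result.stdout))
        time.sleep(interval_s)
    return counts


# Parsing check against the header line from the question.
print(parse_pending_tasks("      pending tasks: 281\n"))
```

Calling `sample_pending_tasks()` on the node returns a list of counts; a steadily shrinking list would suggest compaction is catching up.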

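As an alternative, you could try re-adding the new node with auto_bootstrap off, then run nodetool rebuild followed by a repair; that should be faster in your case.

Edit:

After looking at the compactionstats: try lowering the concurrent_compactors property. Compaction will take longer to finish, but it should have less impact on overall cluster performance.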

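Answer 1: (score: 0)

If you look at bytes_in and bytes_out for the completed compactions, the two values are nearly equal in most rows, which is why you do not see a big change in disk space utilization even after so many compactions have completed.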
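The bytes_in versus bytes_out comparison can be made concrete with a small script. This sketch assumes the two integers immediately before the `{...}` rows-merged map are bytes_in and bytes_out, as in the sample above, and embeds four of those rows as data.

```python
import re

# Four rows copied from the compactionhistory sample in the question.
SAMPLE_HISTORY = """\
id  datatype index        1492056758751             558756         540336         {1:175, 2:6}
id  datatype index   1492160511587             3353867375     3280857307     {1:12265239, 2:409303, 3:16391, 4:1932}
id  datatype index        1492190014331             807975203      786973389      {1:67311, 2:1852}
id  datatype index      1491949569125             45499223       46212100       {1:3944, 2:523, 3:1268, 4:262}
"""


def parse_history(text):
    """Return (bytes_in, bytes_out) per row, read from just before the '{'."""
    pairs = []
    for line in text.splitlines():
        m = re.search(r"(\d+)\s+(\d+)\s+\{", line)
        if m:
            pairs.append((int(m.group(1)), int(m.group(2))))
    return pairs


pairs = parse_history(SAMPLE_HISTORY)
total_in = sum(b_in for b_in, _ in pairs)
total_out = sum(b_out for _, b_out in pairs)
ratio = total_out / total_in
print(f"bytes_out / bytes_in = {ratio:.3f} "
      f"(about {100 * (1 - ratio):.1f}% reclaimed)")
```

On these four rows only a few percent of the input bytes are reclaimed, and one row even grows slightly, which matches the observation that disk usage barely moves.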

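Note: if it fits your use case, you should also consider the Leveled compaction strategy, since it has a number of advantages over size-tiered. Leveled compaction is usually the best fit for most use cases. Here is a good blog post describing when to use one over the other: http://www.datastax.com/dev/blog/when-to-use-leveled-compaction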