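I have a 21-node cluster (C* 2.2) of m4.2xlarges, each node with five 1 TB SSDs.
When it got 50% full (500 GB per disk * 5 = 2.5 TB per node), I realized I needed more space, so I added a new node.
The new node joined the cluster (went from UJ to UN), but its disk usage is 4.2 TB.
I assumed this was because compaction was behind, and waited a couple of days. Even after compactions ran, disk usage has not changed. The new box was actually CPU bound, so I moved it to a compute-optimized c4.8xlarge, raised concurrent_compactions to 20, and disabled the compaction_throughput limit to get this done.
At the same time I stopped all writes to the cluster. Pending compactions keep going up and up, and the data on disk is not going down.
What am I doing wrong? System time looks high. I am using org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy, and the current compaction thresholds are min = 4, max = 32.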
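For reference, here is a rough back-of-the-envelope check of how much data the new node would be expected to hold. It is only a sketch: it assumes data is spread perfectly evenly across nodes, which real token ownership only approximates.

```python
# Rough expectation for per-node load after adding one node.
# Assumption: perfectly even data distribution across all nodes.
NODES_BEFORE = 21
LOAD_PER_NODE_TB = 2.5      # 5 disks * 500 GB used on each
OBSERVED_NEW_NODE_TB = 4.2  # disk usage reported on the new node

total_tb = NODES_BEFORE * LOAD_PER_NODE_TB
expected_tb = total_tb / (NODES_BEFORE + 1)
excess_ratio = OBSERVED_NEW_NODE_TB / expected_tb

print(f"total data: {total_tb:.1f} TB")
print(f"expected per node after join: {expected_tb:.2f} TB")
print(f"new node holds {excess_ratio:.2f}x the expected amount")
```

Under that assumption, the new node is holding noticeably more than an even share, which is what prompted the question.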
When I run
strace -f -c -p cassandra-pid > strace_count
I get:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
49.57 7431.363672 140392 52933 17755 futex
30.22 4530.012667 482481 9389 epoll_wait
11.33 1697.685882 2143543 792 recvfrom
3.68 551.306817 1596 345500 7 write
3.58 537.257283 14138350 38 33 restart_syscall
0.78 117.381206 111262 1055 poll
0.28 41.738677 636 65675 lseek
0.14 21.138626 1659 12741 pread
0.10 15.189009 1838 8265 read
0.07 9.898101 696 14229 sched_yield
0.06 8.984107 23831 377 sendto
0.04 6.148230 9759 630 munmap
0.04 5.760339 21902 263 mprotect
0.02 3.154839 992 3181 359 fadvise64
0.02 3.107529 652 4769 215 stat
0.01 2.006363 167197 12 msync
0.01 1.956998 7040 278 mmap
0.01 1.838682 1155 1592 8 unlink
0.01 1.080512 602 1794 lstat
0.01 0.861741 578 1490 close
0.00 0.626903 562 1116 open
0.00 0.596450 588 1014 fcntl
0.00 0.440250 644 684 fstat
0.00 0.318874 630 506 epoll_ctl
0.00 0.249772 4625 54 fdatasync
0.00 0.149440 1660 90 fsync
0.00 0.093154 647 144 rename
0.00 0.069017 575 120 statfs
0.00 0.018136 356 51 getpriority
0.00 0.014358 598 24 rt_sigprocmask
0.00 0.011584 161 72 times
0.00 0.009858 616 16 setsockopt
0.00 0.009396 940 10 link
0.00 0.008072 24 336 7 rt_sigreturn
0.00 0.004960 1240 4 getsockopt
0.00 0.004926 411 12 sched_getaffinity
0.00 0.004503 500 9 dup2
0.00 0.002998 500 6 madvise
0.00 0.002693 449 6 set_robust_list
0.00 0.002597 433 6 accept
0.00 0.002000 333 6 clone
0.00 0.002000 500 4 2 accept4
0.00 0.001243 207 6 gettid
0.00 0.001000 500 2 writev
0.00 0.001000 500 2 recvmsg
0.00 0.001000 143 7 getsockname
0.00 0.001000 500 2 getpeername
0.00 0.001000 167 6 6 setpriority
0.00 0.000000 0 1 socket
0.00 0.000000 0 1 bind
------ ----------- ----------- --------- --------- ----------------
100.00 14990.519464 529320 18392 total
When I run top with the per-CPU view (1):
Tasks: 1506 total, 8 running, 1496 sleeping, 0 stopped, 2 zombie
Cpu0 : 0.3%us, 47.3%sy, 10.5%ni, 41.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.7%us, 87.6%sy, 11.7%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 3.2%us, 65.0%sy, 0.0%ni, 31.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 11.6%us, 39.9%sy, 0.0%ni, 48.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 1.0%us, 55.3%sy, 9.2%ni, 34.6%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 0.3%us, 98.0%sy, 1.7%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 0.4%us, 90.7%sy, 1.4%ni, 6.8%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 3.4%us, 20.2%sy, 9.4%ni, 64.0%id, 3.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu8 : 1.7%us, 24.9%sy, 0.3%ni, 73.1%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu9 : 0.7%us, 79.4%sy, 0.7%ni, 18.9%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu10 : 0.7%us, 64.9%sy, 13.6%ni, 14.0%id, 6.8%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu11 : 1.0%us, 50.7%sy, 0.0%ni, 18.6%id, 29.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu12 : 0.3%us, 58.9%sy, 0.0%ni, 40.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu13 : 0.3%us, 72.5%sy, 26.8%ni, 0.0%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu14 : 0.0%us, 50.2%sy, 49.8%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu15 : 0.0%us,100.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu16 : 0.3%us, 54.2%sy, 0.0%ni, 40.5%id, 5.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu17 : 0.7%us, 46.3%sy, 19.9%ni, 24.0%id, 9.1%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu18 : 0.7%us, 68.9%sy, 0.0%ni, 30.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu19 : 5.7%us, 3.4%sy, 0.0%ni, 90.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu20 : 0.7%us, 44.4%sy, 0.0%ni, 54.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu21 : 1.3%us, 67.8%sy, 0.0%ni, 30.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu22 : 0.7%us, 45.5%sy, 7.3%ni, 42.9%id, 3.6%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu23 : 1.3%us, 22.7%sy, 0.0%ni, 75.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu24 : 0.0%us, 65.4%sy, 0.0%ni, 34.6%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu25 : 0.0%us, 62.0%sy, 12.2%ni, 25.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu26 : 1.3%us, 68.9%sy, 12.6%ni, 17.2%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu27 : 0.0%us, 64.3%sy, 12.9%ni, 22.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu28 : 0.0%us, 75.8%sy, 0.0%ni, 23.5%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu29 : 0.0%us, 60.3%sy, 1.7%ni, 37.4%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu30 : 0.3%us, 48.3%sy, 12.7%ni, 38.0%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu31 : 0.0%us,100.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu32 : 0.0%us, 72.1%sy, 25.2%ni, 0.0%id, 2.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu33 : 0.0%us,100.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu34 : 0.3%us, 66.7%sy, 0.0%ni, 33.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu35 : 0.0%us, 67.7%sy, 0.0%ni, 32.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 61820728k total, 61610932k used, 209796k free, 456k buffers
Swap: 0k total, 0k used, 0k free, 35425968k cached
nodetool compactionstats
pending tasks: 281
id compaction type keyspace table completed total unit progress
id Compaction keyspace_1 table_____4 1591902797 2851758523 bytes 55.82%
id Compaction keyspace_1 table_____1 193582898 567222689 bytes 34.13%
id Compaction keyspace_1 table_____2 187022078 2264168754 bytes 8.26%
id Compaction keyspace_1 table_____1 22841754587 24781014960 bytes 92.17%
id Compaction keyspace_1 table_____5 764633368 3904191508 bytes 19.58%
id Compaction keyspace_1 table_____1 1856076066 2326634436 bytes 79.78%
id Compaction keyspace_1 table_____7 254856804 499133271 bytes 51.06%
id Compaction keyspace_1 table_____8 1406859449 1803885628 bytes 77.99%
id Compaction keyspace_1 table_____7 1734201253 2308801656 bytes 75.11%
id Compaction keyspace_1 table_____1 656195289 931867447 bytes 70.42%
id Compaction keyspace_1 table_____1 657036608 1380870812 bytes 47.58%
id Compaction keyspace_1 table_____1 235054945 18957522878 bytes 1.24%
id Compaction keyspace_1 table____10 2351049 3552009 bytes 66.19%
id Compaction keyspace_1 table_____2 810635522 867307196 bytes 93.47%
id Compaction keyspace_1 table_____5 281573682 780375396 bytes 36.08%
id Compaction keyspace_1 table_____6 2350396501 2398745060 bytes 97.98%
id Compaction keyspace_1 table_____1 63122362 434443651 bytes 14.53%
id Compaction keyspace_1 table_____3 287859748 399896319 bytes 71.98%
id Compaction keyspace_1 table_____2 1776310557 2685522257 bytes 66.14%
id Compaction keyspace_1 table_____1 494183426 22432529013 bytes 2.20%
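To put the active compactions above in perspective, here is a small sketch that totals their completed and total bytes. It assumes the column layout shown above (the last four fields are completed, total, unit, progress); `sum_compaction_bytes` is a hypothetical helper, not part of nodetool.

```python
def sum_compaction_bytes(stats_text: str) -> tuple[int, int]:
    """Return (completed, total) bytes summed over active compaction rows."""
    completed_sum = 0
    total_sum = 0
    for line in stats_text.splitlines():
        fields = line.split()
        # Data rows end with: <completed> <total> bytes <progress%>
        if len(fields) >= 4 and fields[-2] == "bytes" and fields[-1].endswith("%"):
            completed_sum += int(fields[-4])
            total_sum += int(fields[-3])
    return completed_sum, total_sum


# Three rows copied from the output above.
sample = """\
pending tasks: 281
id Compaction keyspace_1 table_____4 1591902797 2851758523 bytes 55.82%
id Compaction keyspace_1 table_____1 235054945 18957522878 bytes 1.24%
id Compaction keyspace_1 table_____1 494183426 22432529013 bytes 2.20%
"""

done, total = sum_compaction_bytes(sample)
print(f"{(total - done) / 1e9:.1f} GB left in these compactions")
```

This only covers compactions that are currently running; the 281 pending tasks are not included in the byte counts.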
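nodetool compactionhistory: there are many rows, but here is a sample (columns: id, keyspace_name, columnfamily_name, compacted_at, bytes_in, bytes_out, rows_merged):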
id datatype index 1492056758751 558756 540336 {1:175, 2:6}
id datatype index 1492075503279 128269 114446 {1:1160, 2:31}
id datatype index 1492072165446 22914902 22464994 {1:626, 2:37}
id datatype index 1492060375419 73514456 72842367 {1:398795, 2:7294, 3:300}
id datatype index 1492075160893 85707 64387 {1:236, 2:41}
id datatype index 1492151303774 139172156 134666782 {1:9129, 2:3313, 3:935, 4:112}
id datatype index 1492135037619 30839157 29690968 {1:32854, 2:5702, 3:535, 4:61}
id datatype index 1492075521048 255030 253531 {1:220, 2:6}
id datatype index 1492116936213 11391100 10943344 {1:6798, 2:301}
id datatype index 1492075649703 1527580 1486442 {1:5381, 2:330}
id datatype index 1492153054713 218401839 216306589 {1:6669, 2:1068, 3:273, 4:22}
id datatype index 1492169550324 9172160 8724129 {1:42943, 2:2390}
id datatype index 1492087845445 8086487 7810261 {1:8445, 2:1209, 3:95}
id datatype index 1492116806390 837169 806946 {1:5984, 2:262}
id datatype index 1492167939189 275277987 271618327 {1:38585, 2:18745, 3:494}
id datatype index 277471932 266321389 {1:47184, 2:16047, 3:367, 4:468}
id datatype index 1492116559239 1569590 1402724 {1:460, 2:62}
id datatype index 1492173763782 83298080 81977056 {1:36383, 2:7577, 3:3565, 4:95, 6:169}
id datatype index 1492158247355 42660621 40224352 {1:6565, 2:987, 3:316, 4:521, 6:17, 8:70}
id datatype index 1492179061558 589874248 568901949 {1:16726, 2:9342, 3:1149, 4:141}
id datatype index 1492190014331 807975203 786973389 {1:67311, 2:1852}
id datatype index 1491949569125 45499223 46212100 {1:3944, 2:523, 3:1268, 4:262}
id datatype index 1492063798113 2401 1134 {1:1, 2:3}
id datatype index 1492100603829 7693737 7507021 {1:7112, 2:870, 3:235, 4:27}
id datatype index 1492202653921 114122963 111721885 {1:2038, 2:2997, 3:1095, 4:48, 5:40}
id datatype index 1492063653695 60700 50728 {1:157, 2:12}
id datatype index 1492152115922 165656033 159591156 {1:5180, 2:3233, 3:600, 4:564, 5:37, 6:14, 7:12}
id datatype index 1492160511587 3353867375 3280857307 {1:12265239, 2:409303, 3:16391, 4:1932}
id datatype index 1492116638632 3226315 2863672 {1:956, 2:137}
id datatype index 1492050334458 64407 56620 {1:447, 2:31}
id datatype index 1492150640640 587181 424081 {1:1293, 2:218, 3:1}
id datatype index 1492116731210 429668507 407404356 {1:2208562, 2:131875, 3:338}
id datatype index 1492134210449 293003702 275992426 {1:7429, 2:1686, 3:165}
id datatype index 1492171984560 8467649 8318775 {1:13330, 2:892, 3:11}
id datatype index 1492150632348 424314 368270 {1:356, 2:72, 3:8}
id datatype index 1492068676918 677842865 653983357 {1:11042, 2:405}
id datatype index 1492160695008 11985228 11689655 {1:3684, 2:1390, 3:441, 4:87}
id datatype index 5906438 5731218 {1:7040, 2:445, 3:27}
id datatype index 1492132529903 234019313 220261439 {1:80014, 2:5316}
id datatype index 1646302 1634070 {1:575, 2:17, 3:5}
id datatype index 1492145903652 1544764 1527844 {1:1807, 2:295, 3:65, 4:3, 5:6}
id datatype index 1492075180569 1034277 986605 {1:6591, 2:235}
id datatype index 1491928723944 5823014 5811907 {1:6498}
id datatype index 1492075323943 573147 526857 {1:4395, 2:250}
Answer 0 (score: 1)
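Your new node should eventually close the compaction gap...
CPU is not the only constraint on compaction. Check the compaction_throughput_mb_per_sec parameter and take a look at the following article: https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsConfigureCompaction.html
Check your nodetool compactionstats and see whether the number of pending tasks decreases over time. Also, please attach the output of nodetool cfstats here.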
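One way to check the trend is to record the pending tasks value at intervals and compare the readings. A minimal sketch, assuming the `pending tasks: N` line format shown in the question; the helper names are hypothetical, and every reading other than 281 is made up for illustration.

```python
import re


def parse_pending(stats_text: str) -> int:
    """Extract N from a 'pending tasks: N' line of compactionstats output."""
    match = re.search(r"pending tasks:\s*(\d+)", stats_text)
    if match is None:
        raise ValueError("no 'pending tasks' line found")
    return int(match.group(1))


def is_draining(samples: list[int]) -> bool:
    """True if the latest reading is lower than the first one."""
    return len(samples) >= 2 and samples[-1] < samples[0]


# 281 comes from the question; the later readings are hypothetical.
readings = [parse_pending("pending tasks: 281"), 260, 244]
print(is_draining(readings))
```

If the count keeps climbing over several hours with writes stopped, the backlog is growing faster than compaction can clear it.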
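As an alternative, you can try re-adding the new node with auto_bootstrap off, then run nodetool rebuild and, lastly, a repair; it should be faster in your case.
EDIT:
After looking at compactionstats: try lowering the concurrent_compactors property. It will take longer to run, but should have less impact on overall cluster performance.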
Answer 1 (score: 0)
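If you look at bytes_in and bytes_out for the completed compactions, they are nearly equal in almost every row, which is why you do not see a big change in disk space utilization even after so many compactions have finished.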
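To quantify this, here is a small sketch that sums bytes_in and bytes_out over a few of the compactionhistory rows from the question. It assumes the two integers immediately before the rows_merged map are bytes_in and bytes_out; `space_reclaimed` is a hypothetical helper name.

```python
import re

# Matches "<bytes_in> <bytes_out> {" just before the rows_merged map,
# so rows with a missing timestamp column are still handled.
ROW = re.compile(r"(\d+)\s+(\d+)\s+\{")


def space_reclaimed(history_text: str) -> tuple[int, int]:
    """Return (bytes_in, bytes_out) summed over all history rows."""
    bytes_in = 0
    bytes_out = 0
    for line in history_text.splitlines():
        m = ROW.search(line)
        if m:
            bytes_in += int(m.group(1))
            bytes_out += int(m.group(2))
    return bytes_in, bytes_out


# Three rows copied from the question's sample.
sample = """\
id datatype index 1492056758751 558756 540336 {1:175, 2:6}
id datatype index 1492160511587 3353867375 3280857307 {1:12265239, 2:409303, 3:16391, 4:1932}
id datatype index 1492190014331 807975203 786973389 {1:67311, 2:1852}
"""

total_in, total_out = space_reclaimed(sample)
saved = 1 - total_out / total_in
print(f"bytes_in={total_in} bytes_out={total_out} reclaimed={saved:.1%}")
```

For these three rows the output is within a few percent of the input size, so each compaction frees very little space.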
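Note: if it suits your use case, you should also consider the Leveled compaction strategy, since it has a number of advantages over size-tiered. Leveled compaction is usually the better fit for most use cases. Here is a good blog post describing when to use one over the other: http://www.datastax.com/dev/blog/when-to-use-leveled-compaction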