We have a 7-node cluster running Cassandra 3.9. We are currently facing the problem that 3 of the nodes crash randomly, without any notice in the logs: neither kernel.log nor Cassandra's system.log / debug.log contain any hint as to why the nodes crash.
The only thing we see is a pile-up of pending compaction tasks that never get worked off (see the command sketch after the list):
pending tasks: 139
- chatlog.master_id: 1
- chatlog.reports: 1
- qsg.stats: 21
- hardcore.stats_archive: 1
- hardcore.stats: 1
- quickbedwars.stats_archive: 1
- quickbedwars.stats: 1
- permission.entity: 1
- pvp.invsave: 1
- pvp.u_match_count: 1
- conquest.leavebuster: 1
- molecraft.stats_archive: 1
- molecraft.achievement: 1
- clanwar.matches: 1
- clanwar.recent_matches_clans: 1
- network.version_logins: 1
- skywars.stats: 60
- system_traces.sessions: 1
- cores.stats: 1
- system.local: 1
- endergames.stats_archive: 1
- endergames.stats: 1
- endergames.kits: 1
- gungame.stats: 1
- system_schema.dropped_columns: 1
- system_schema.types: 1
- system_schema.tables: 1
- system_schema.indexes: 1
- system_schema.keyspaces: 1
- system_schema.columns: 1
- bedwars.stats: 26
- survivalgames.stats_archive: 1
- coinsystem.rarity_probabilities: 1
- nicksystem.texture_cache: 1
- speed_suhc.stats_archive: 1
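For reference, a minimal sketch of how we watch this (the per-table breakdown of pending tasks may render differently across nodetool versions; the loop and interval are just our own convention):

while true; do nodetool compactionstats; sleep 10; done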
All stats tables use LeveledCompactionStrategy (LCS); the rest use SizeTieredCompactionStrategy (STCS).
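For completeness, a sketch of how the strategies are assigned (skywars.stats is just an example taken from the list above; sstable_size_in_mb shown at its default):

cqlsh -e "DESCRIBE TABLE skywars.stats;" | grep -i compaction
cqlsh -e "ALTER TABLE skywars.stats WITH compaction = {'class': 'LeveledCompactionStrategy', 'sstable_size_in_mb': 160};"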
When a node has crashed, nodetool status looks like this:
--  Address      Load       Tokens  Owns  Host ID                               Rack
UN  79.133.61.x  73.63 GiB  256     ?     e5d40b71-91aa-4fb4-a883-163dfd4ddc1e  01
UN  79.133.61.x  89.85 GiB  256     ?     6f74c956-8300-4f83-82d4-eb31d5678696  01
UN  79.133.61.x  113.33 GiB 512     ?     e73560f5-d6e7-4504-9a71-52f357b3ce0f  01
UJ  79.133.61.x  29.94 GiB  512     ?     b6ff57be-393a-4cbd-b92f-3b04ac5132da  01
UN  79.133.61.x  68.11 GiB  256     ?     b9e6207e-987f-4393-bdfe-5c440e886500  01
UN  79.133.61.x  76.11 GiB  256     ?     455d53f6-23dd-45ba-9a29-9fc7b0303fc2  01
DN  79.133.61.x  95.99 GiB  256     ?     f2d3ecb1-4610-414b-b5c6-3129d96e4bc0  01
The node that shows as joining (UJ) is one we drained and then fully reset so it could start over from scratch.
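The reset itself was the usual procedure, roughly (paths assume a default package install; ours differ because of the JBOD layout):

nodetool drain                  # flush memtables, stop accepting writes
systemctl stop cassandra
rm -rf /var/lib/cassandra/data/* /var/lib/cassandra/commitlog/* /var/lib/cassandra/saved_caches/*
systemctl start cassandra       # node bootstraps again and rejoins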
We tried setting compaction_throughput higher, but that did not help at all. We run 4 SSDs in JBOD on bare-metal servers with 12-core (HT) CPUs, connected through an internal 10 Gbit/s interface. We never run into any resource limits: I/O looks fine, memory seems fine, and the CPUs are not overloaded either.
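For anyone wanting to try the same, the throttle can be changed live:

nodetool getcompactionthroughput     # default is 16 MB/s
nodetool setcompactionthroughput 0   # 0 = unthrottled
# persistent equivalent: compaction_throughput_mb_per_sec in cassandra.yaml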
// Edit:
An htop screenshot of one of the nodes with stacked-up compaction tasks shows that all concurrent compaction threads are working on something:
The dstat output looks like this:
So it seems to write out some chunks of data every now and then, but nothing I would consider a compaction-sized write load.
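For reproduction, the dstat invocation is nothing special, roughly:

dstat -cdn 5    # cpu, disk and network columns, one line every 5 seconds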
What I have found so far is that the last log output before a node crashes is always this:
INFO [IndexSummaryManager:1] 2016-11-22 14:59:38,532 IndexSummaryRedistribution.java:75 - Redistributing index summaries
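That message comes from the periodic index summary redistribution. In case someone wants to rule it out, the interval lives in cassandra.yaml (a sketch; defaults shown, restart required):

# index_summary_resize_interval_in_minutes: 60   # -1 disables the redistribution
# index_summary_capacity_in_mb:                  # empty = 5% of the heap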
// More edits:
Running SJK (Swiss Java Knife) shows extreme memory allocation rates coming from the compaction threads:
heap allocation rate 5984mb/s
[000055] user=93.91% sys= 1.20% alloc= 1292mb/s - CompactionExecutor:2
[000056] user=94.85% sys= 0.21% alloc= 1142mb/s - CompactionExecutor:3
[000057] user=94.85% sys=-0.01% alloc= 1135mb/s - CompactionExecutor:4
[000054] user=97.10% sys=-0.03% alloc= 1084mb/s - CompactionExecutor:1
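The numbers above are from SJK's ttop command, roughly like this (jar path and PID are placeholders):

java -jar sjk.jar ttop -p <cassandra-pid> -o CPU -n 10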
The GC logs show normal numbers (we use G1GC with a 26 GB heap, as the Cassandra tuning recommendations suggest trying; see the jvm.options sketch after the log excerpt):
[Parallel Time: 8.9 ms, GC Workers: 10]
[GC Worker Start (ms): Min: 2976069.1, Avg: 2976069.2, Max: 2976069.2, Diff: 0.1]
[Ext Root Scanning (ms): Min: 2.4, Avg: 2.4, Max: 2.5, Diff: 0.1, Sum: 24.4]
[Update RS (ms): Min: 4.1, Avg: 4.3, Max: 4.5, Diff: 0.4, Sum: 42.8]
[Processed Buffers: Min: 8, Avg: 16.2, Max: 33, Diff: 25, Sum: 162]
[Scan RS (ms): Min: 0.1, Avg: 0.1, Max: 0.3, Diff: 0.2, Sum: 1.4]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Object Copy (ms): Min: 1.8, Avg: 1.9, Max: 2.0, Diff: 0.2, Sum: 19.1]
[Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Termination Attempts: Min: 4, Avg: 6.0, Max: 8, Diff: 4, Sum: 60]
[GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.6]
[GC Worker Total (ms): Min: 8.8, Avg: 8.8, Max: 8.9, Diff: 0.1, Sum: 88.4]
[GC Worker End (ms): Min: 2976078.0, Avg: 2976078.0, Max: 2976078.0, Diff: 0.0]
[Code Root Fixup: 0.1 ms]
[Code Root Purge: 0.0 ms]
[Clear CT: 0.4 ms]
[Other: 1.4 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 0.1 ms]
[Ref Enq: 0.0 ms]
[Redirty Cards: 0.1 ms]
[Humongous Register: 0.1 ms]
[Humongous Reclaim: 0.0 ms]
[Free CSet: 0.8 ms]
[Eden: 4904.0M(4904.0M)->0.0B(4904.0M) Survivors: 8192.0K->8192.0K Heap: 6880.3M(8192.0M)->1976.7M(8192.0M)]
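For reference, the relevant G1 lines from our jvm.options look roughly like this (values per the usual Cassandra G1 tuning advice; the exact flags in our file may differ slightly):

-Xms26G
-Xmx26G
-XX:+UseG1GC
-XX:MaxGCPauseMillis=500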