Question

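We have been trying to add a node to an existing Cassandra 2.0.11 cluster. When we started adding the node, I calculated that it would stream roughly 616 GB of data from the other nodes in the cluster. Streaming went well. The node has now accumulated 618 GB of data, yet it has been stuck in UJ (Up/Joining) state for more than 24 hours.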

nodetool status:

UJ  xx.xxx.xxx.xxx  621.87 GB  256     ?      54a80605-9209-4e1c-9e61-3250e4458797  RAC1

nodetool netstats:

For the last 24 hours, the joining node has appeared to be streaming 346 files from one existing node, yy.yyy.yy.yy, but system.log shows no streaming activity.

-bash-4.1$ nodetool netstats | grep -v 100%
Mode: JOINING
Bootstrap 365e5830-cac9-11e8-bea5-fdb60ab1bc98
 /yy.yyy.yy.yy
        Receiving 346 files, 228835212751 bytes total

Read Repair Statistics:

Attempted: 0
Mismatch (Blocking): 0
Mismatch (Background): 0
Pool Name                    Active   Pending      Completed
Commands                        n/a         0             16
Responses                       n/a         0      202529576
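
To put the "Receiving" line from the netstats output in perspective, the byte count can be converted to gigabytes. This is only a minimal sketch, assuming the line format shown in the output above; the helper name parse_receiving is made up for illustration and is not part of nodetool.

```python
import re

def parse_receiving(line):
    """Return (file_count, total_bytes) from a netstats 'Receiving' line, or None."""
    m = re.search(r"Receiving (\d+) files, (\d+) bytes total", line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))

# Line copied from the netstats output in this post.
line = "        Receiving 346 files, 228835212751 bytes total"
files, total = parse_receiving(line)

# Report the session total in decimal GB and binary GiB.
print(f"{files} files, {total / 1e9:.1f} GB ({total / 2**30:.1f} GiB)")
```

Note that this is the total announced for the stream session from yy.yyy.yy.yy. Because the output was filtered with grep -v 100%, it does not by itself show how much of that is still outstanding.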

We have now enabled DEBUG logging in system.log to see whether anything is stuck.

Current system.log:

 DEBUG [MutationStage:208] 2018-10-10 10:51:48,657 AbstractSimplePerColumnSecondaryIndex.java (line 111) applying index row xxxxxxxxxxxxxxxxxxxx in ColumnFamily(abcdefgh_ijklmno.pqr_stu_vwxyz [38643266636162312d643430312d343431652d623535652d333664376531626331343066:187181334195211023047137388  16203108:false:0@1539148904830000,])

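I have been trying to figure out what Cassandra is actually doing here, and what exactly AbstractSimplePerColumnSecondaryIndex.java does.

Looking through the source code, I found the following description in AbstractSimplePerColumnSecondaryIndex.java: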

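Implements a secondary index for a column family using a second column family in which the row key is the indexed value, and column names are base row keys.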
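
As I understand that description, the layout can be pictured as an inverted map. The following is only an illustrative sketch of the data structure, not Cassandra's actual implementation; the class and method names are invented for this example.

```python
from collections import defaultdict

class SimpleSecondaryIndex:
    """Toy model: index row key = indexed value, column names = base row keys."""

    def __init__(self):
        # indexed value -> set of base row keys that hold that value
        self.index = defaultdict(set)

    def insert(self, base_row_key, value):
        # Roughly what a log line like "applying index row ..." would correspond to.
        self.index[value].add(base_row_key)

    def delete(self, base_row_key, value):
        # Remove the base row key; drop the index row when it becomes empty.
        keys = self.index.get(value)
        if keys is not None:
            keys.discard(base_row_key)
            if not keys:
                del self.index[value]

    def lookup(self, value):
        # Return the base row keys whose indexed column equals `value`.
        return sorted(self.index.get(value, ()))

idx = SimpleSecondaryIndex()
idx.insert("row-1", "blue")
idx.insert("row-2", "blue")
idx.insert("row-3", "red")
print(idx.lookup("blue"))
```

Under this model, every mutation written to an indexed table during bootstrap also produces a write to the index structure, which would be consistent with the MutationStage log line above.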

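In layman's terms, I infer from these logs that a secondary index is being built. Many of our tables do have secondary indexes. Still, this does not look like the problem to me, because we see nothing in nodetool compactionstats: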

-bash-4.1$ nodetool compactionstats
pending tasks: 0
Active compaction remaining time :        n/a

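Please help with the following three things:

1. What is the ideal fix when a joining node hangs for a long time like this? I found this online (Cassandra node can't complete joining operation). Please advise whether that is what has to be done.
2. In cases like this, would analyzing a thread dump help to find out which thread is actually stuck?
3. Please also help explain what is happening right now according to system.log.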
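
Regarding the thread dump idea, this is roughly how I would summarize a dump once I have one. It is a rough sketch that assumes the usual HotSpot thread dump layout (a quoted thread name on the header line, followed by a java.lang.Thread.State line). The SAMPLE text is made up for illustration and is not taken from our node; only the thread name MutationStage:208 comes from the log above.

```python
import re
from collections import defaultdict

HEADER = re.compile(r'^"([^"]+)"')                            # "thread name" ...
STATE = re.compile(r'^\s+java\.lang\.Thread\.State: (\w+)')   # RUNNABLE, WAITING, ...

def threads_by_state(dump_text):
    """Group thread names by their reported java.lang.Thread.State."""
    groups = defaultdict(list)
    current = None
    for line in dump_text.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            continue
        state = STATE.match(line)
        if state and current is not None:
            groups[state.group(1)].append(current)
            current = None
    return dict(groups)

# Hypothetical two-thread dump, for illustration only.
SAMPLE = '''"MutationStage:208" daemon prio=10 tid=0x01 nid=0x02 runnable [0x03]
   java.lang.Thread.State: RUNNABLE
"example-waiting-thread" daemon prio=10 tid=0x04 nid=0x05 waiting on condition [0x06]
   java.lang.Thread.State: WAITING (parking)
'''

summary = threads_by_state(SAMPLE)
for state, names in sorted(summary.items()):
    print(state, names)
```

Comparing such summaries from two dumps taken a few minutes apart might show whether the same threads stay BLOCKED or WAITING, but I am not sure this is the right way to diagnose a stuck bootstrap.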

Cassandra 2.0.11 node bootstrap hangs in UJ state

0 answers: