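Today I did some extended maintenance on node d1r1n3 of a 14-node dsc 2.1.15 cluster, but finished well within the cluster's max hint window.
After bringing the node back up, the hints on most other nodes disappeared again within minutes, except on two nodes (d1r1n4 and d1r1n7), where only part of the hints went away.
After a few hours of still seeing 1 active hintedhandoff task, I restarted node d1r1n7, and soon afterwards d1r1n4 emptied its hints table.
How can I see which node the hints stored on d1r1n7 are destined for? And, if possible, how can I get those hints processed?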
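Update: I later found that d1r1n7's hints had vanished at a time corresponding to the end of the max hint window, counted from when node d1r1n3 was taken offline for maintenance. That leaves us confused about whether this was okay. Were the hints delivered properly, or did they simply expire once the max hint window ended? If the latter, do we need to run a repair on node d1r1n3 after its maintenance (which takes quite some time and IO ... :/)? What if we now switched reads to [LOCAL_]QUORUM instead of the current read ONE? We have one DC and RF = 3; could that trigger read-path repairs on an as-needed basis and maybe spare us a full repair in this case?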
Answer: it turned out that hinted_handoff_throttle_in_kb was at the default of 1024 on these two nodes, while the rest of the cluster was at 65536 :)
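To put rough numbers on that difference, here is a minimal sketch. It assumes the behaviour described in the cassandra.yaml comment for hinted_handoff_throttle_in_kb, namely that the configured rate is per delivery thread and is reduced in proportion to the number of other nodes in the cluster. The function name is hypothetical, and the real server may round differently.

```python
def effective_hint_throttle_kb(throttle_in_kb, cluster_nodes):
    """Approximate per-thread hint delivery rate in KB/s.

    Assumption: the configured value is divided by (nodes - 1), as the
    cassandra.yaml comment for hinted_handoff_throttle_in_kb describes.
    """
    if cluster_nodes < 2:
        raise ValueError("hint delivery needs at least two nodes")
    return throttle_in_kb / (cluster_nodes - 1)


# Compare the two settings seen in the 14-node cluster from the question.
for setting in (1024, 65536):
    rate = effective_hint_throttle_kb(setting, 14)
    print(f"{setting} KB/s configured -> about {rate:.0f} KB/s per delivery thread")
```

Under that assumption the two nodes left at the default would replay hints at a small fraction of the rate of the rest of the cluster, which would fit with them still holding hints hours later.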
Answer 0 (score: 1)
In Cassandra 2.1.15, hints are stored in the system.hints table:
cqlsh> describe table system.hints;
CREATE TABLE system.hints (
target_id uuid,
hint_id timeuuid,
message_version int,
mutation blob,
PRIMARY KEY (target_id, hint_id, message_version)
) WITH COMPACT STORAGE
AND CLUSTERING ORDER BY (hint_id ASC, message_version ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = 'hints awaiting delivery'
AND compaction = {'enabled': 'false', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 0
AND gc_grace_seconds = 0
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 3600000
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
target_id corresponds to the host ID of the node the hint is destined for.
For example, take my sample 2-node cluster with RF = 2.
I ran the following while node2 was down:
nodetool status
Datacenter: datacenter1
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 127.0.0.1 71.47 KB 256 100.0% d00c4b10-2997-4411-9fc9-f6d9f6077916 rack1
DN 127.0.0.2 75.4 KB 256 100.0% 1ca6779d-fb41-4a26-8fa8-89c6b51d0bfa rack1
As you can see, system.hints.target_id corresponds to the Host ID shown by nodetool status (1ca6779d-fb41-4a26-8fa8-89c6b51d0bfa).
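If you want to resolve target_id values to addresses without eyeballing the output, something like the sketch below may help. It only parses text in the `nodetool status` layout shown above; the function name is hypothetical. The target IDs themselves would come from a query such as `SELECT DISTINCT target_id FROM system.hints;`, which I believe 2.1 accepts because target_id is the partition key, but treat that as an assumption and check it on your version.

```python
import re

# Matches a Host ID / target_id in canonical UUID text form.
UUID_RE = re.compile(r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")


def host_ids_from_status(status_text):
    """Map Host ID -> (status/state code, address) from `nodetool status` text."""
    mapping = {}
    for line in status_text.splitlines():
        parts = line.split()
        # Node rows start with a two-letter code such as UN or DN.
        if len(parts) < 2 or not re.fullmatch(r"[UD][NLJM]", parts[0]):
            continue
        match = UUID_RE.search(line)
        if match:
            mapping[match.group(0).lower()] = (parts[0], parts[1])
    return mapping


# Sample input copied from the nodetool status output in this answer.
SAMPLE = """\
Datacenter: datacenter1
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 127.0.0.1 71.47 KB 256 100.0% d00c4b10-2997-4411-9fc9-f6d9f6077916 rack1
DN 127.0.0.2 75.4 KB 256 100.0% 1ca6779d-fb41-4a26-8fa8-89c6b51d0bfa rack1
"""

nodes = host_ids_from_status(SAMPLE)
# Look up the node that a hint's target_id points at.
print(nodes.get("1ca6779d-fb41-4a26-8fa8-89c6b51d0bfa"))
```

On the cluster from the question you would feed it the real `nodetool status` output and look up each target_id found on d1r1n7.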