Please help me understand what I'm missing. I'm seeing strange behavior from one cluster node on a SELECT that contains LIMIT and ORDER BY DESC clauses:
SELECT cid FROM test_cf WHERE uid = 0x50236b6de695baa1140004bf ORDER BY tuuid DESC LIMIT 1000;
Trace (partial):
...
Sending REQUEST_RESPONSE message to /10.0.25.56 [MessagingService-Outgoing-/10.0.25.56] | 2016-02-29 22:17:25.117000 | 10.0.23.15 | 7862
Sending REQUEST_RESPONSE message to /10.0.25.56 [MessagingService-Outgoing-/10.0.25.56] | 2016-02-29 22:17:25.136000 | 10.0.25.57 | 6283
Sending REQUEST_RESPONSE message to /10.0.25.56 [MessagingService-Outgoing-/10.0.25.56] | 2016-02-29 22:17:38.568000 | 10.0.24.51 | 457931
......
10.0.25.56 - coordinator node
10.0.23.15, 10.0.24.51, 10.0.25.57 - nodes that hold the data
The coordinator received the response from 10.0.24.51 about 13 seconds after the responses from the other nodes! Why does this happen, and how can I fix it?
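To make the gap explicit, here is a minimal Python sketch that parses the three trace rows above and reports how long after the first response each replica sent its REQUEST_RESPONSE. The function names (`parse_trace`, `response_lag`) are hypothetical helpers written for this post, and the sketch assumes the last column is the per-node `source_elapsed` value in microseconds, which it carries along but does not use for the lag.

```python
from datetime import datetime

# Trace rows copied from the question:
# activity | timestamp | source | source_elapsed
TRACE = """\
Sending REQUEST_RESPONSE message to /10.0.25.56 [MessagingService-Outgoing-/10.0.25.56] | 2016-02-29 22:17:25.117000 | 10.0.23.15 | 7862
Sending REQUEST_RESPONSE message to /10.0.25.56 [MessagingService-Outgoing-/10.0.25.56] | 2016-02-29 22:17:25.136000 | 10.0.25.57 | 6283
Sending REQUEST_RESPONSE message to /10.0.25.56 [MessagingService-Outgoing-/10.0.25.56] | 2016-02-29 22:17:38.568000 | 10.0.24.51 | 457931
"""


def parse_trace(text):
    """Split each pipe-delimited trace row into a dict; skip malformed rows."""
    rows = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != 4:
            continue
        activity, timestamp, source, elapsed = parts
        rows.append({
            "activity": activity,
            "timestamp": datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S.%f"),
            "source": source,
            "source_elapsed_us": int(elapsed),  # assumed to be microseconds
        })
    return rows


def response_lag(rows):
    """Seconds between the earliest response and each node's response."""
    first = min(r["timestamp"] for r in rows)
    return {r["source"]: (r["timestamp"] - first).total_seconds() for r in rows}


for node, lag in sorted(response_lag(parse_trace(TRACE)).items(), key=lambda kv: kv[1]):
    print(f"{node}: +{lag:.3f} s")
```

Sorting by lag puts 10.0.24.51 last, roughly 13.45 seconds behind the fastest replica according to the trace timestamps.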
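The number of rows for this partition key (uid = 0x50236b6de695baa1140004bf) is about 300.
If we use ORDER BY ASC (our clustering order), or a LIMIT value smaller than the number of rows for this partition key, everything works fine.
The Cassandra (v2.2.5) cluster contains 25 nodes. Each node holds about 400 GB of data.
The cluster is hosted in AWS. The nodes are evenly distributed across 3 subnets in a VPC. The instance type is c3.4xlarge (16 CPU cores, 30 GB RAM). We use EBS-backed storage (1 TB GP SSD).
The keyspace RF is 3.
Column family: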
CREATE TABLE test_cf (
uid blob,
tuuid timeuuid,
cid text,
cuid blob,
PRIMARY KEY (uid, tuuid)
) WITH CLUSTERING ORDER BY (tuuid ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction ={'class':'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
AND compression ={'sstable_compression':'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 86400
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
nodetool gcstats (10.0.25.57):
Interval (ms)  Max GC Elapsed (ms)  Total GC Elapsed (ms)  Stdev GC Elapsed (ms)  GC Reclaimed (MB)  Collections  Direct Memory Bytes
1208504        368                  4559                   73                     553798792712       58           305691840
nodetool gcstats (10.0.23.15):
Interval (ms)  Max GC Elapsed (ms)  Total GC Elapsed (ms)  Stdev GC Elapsed (ms)  GC Reclaimed (MB)  Collections  Direct Memory Bytes
1445602        369                  3120                   57                     381929718000       38           277907601
nodetool gcstats (10.0.24.51):
Interval (ms)  Max GC Elapsed (ms)  Total GC Elapsed (ms)  Stdev GC Elapsed (ms)  GC Reclaimed (MB)  Collections  Direct Memory Bytes
1174966        397                  4137                   69                     1900387479552      45           304448986
Answer 0 (score: 0)
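This could be caused by many factors, some related to Cassandra and some not.
Non-Cassandra specific
Cassandra specific
Apart from general Linux troubleshooting, I suggest using nodetool to compare some C*-specific metrics between the nodes and look for differences, for example nodetool gcstats and nodetool compactionhistory:
https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsNodetool_r.html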
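As one way to do that comparison, the sketch below takes the three `nodetool gcstats` rows posted in the question and computes the share of each sampling interval spent in GC. The helper names (`parse_gcstats`, `gc_fraction`) and the field names are assumptions made for this example; the column order is taken from the headers in the question.

```python
# nodetool gcstats rows copied from the question, keyed by node address.
GCSTATS = {
    "10.0.25.57": "1208504 368 4559 73 553798792712 58 305691840",
    "10.0.23.15": "1445602 369 3120 57 381929718000 38 277907601",
    "10.0.24.51": "1174966 397 4137 69 1900387479552 45 304448986",
}

# Field names are illustrative; the order follows the gcstats header.
FIELDS = (
    "interval_ms",
    "max_gc_ms",
    "total_gc_ms",
    "stdev_gc_ms",
    "reclaimed_mb",
    "collections",
    "direct_memory_bytes",
)


def parse_gcstats(row):
    """Turn one whitespace-separated gcstats row into a dict of ints."""
    values = [int(v) for v in row.split()]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    return dict(zip(FIELDS, values))


def gc_fraction(stats):
    """Fraction of the sampling interval spent in garbage collection."""
    return stats["total_gc_ms"] / stats["interval_ms"]


for node, row in GCSTATS.items():
    stats = parse_gcstats(row)
    print(f"{node}: max pause {stats['max_gc_ms']} ms, "
          f"GC share {gc_fraction(stats):.2%}")
```

On these numbers every node spends well under 1% of the interval in GC and the longest pause is below 400 ms, so the posted GC figures alone do not appear to account for a 13-second delay on 10.0.24.51; comparing other nodetool output between the nodes in the same way may be more revealing.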