I am running Cassandra v2.1.13 on a 10-node cluster with a replication factor of 2 and LeveledCompactionStrategy. The nodes are c4.4xlarge instances with a 10 GB heap.
8 of the nodes in the cluster are running fine, but 2 specific nodes have problems: they consistently show very poor read latency, drop a large number of reads, and their OS load stays persistently high.
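For context, the schema looks roughly like the following. This is a hedged reconstruction, not the actual DDL: the keyspace and table names are taken from the cfstats output below, the columns are placeholders, and the replication strategy is an assumption (the cluster spans two racks, so it may actually use NetworkTopologyStrategy):

cqlsh -e "
CREATE KEYSPACE IF NOT EXISTS key_space_name
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};
CREATE TABLE IF NOT EXISTS key_space_name.table1 (
  pk text PRIMARY KEY,  -- placeholder columns; the real schema is not shown in this question
  value blob
) WITH compaction = {'class': 'LeveledCompactionStrategy'};
"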
Below are the nodetool command results from one of the affected nodes (the exact commands are listed after the outputs):
status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.0.23.37 257.31 GB 256 19.0% 3e9ee62e-70a2-4b2e-ba10-290a62cd055b 1
UN 10.0.53.69 300.24 GB 256 20.5% 48988162-69d6-4698-9afa-799ef4be7bbc 2
UN 10.0.23.133 342.37 GB 256 21.1% 30431a62-0cf6-4c82-8af1-e9ba0025eba6 1
UN 10.0.53.7 348.52 GB 256 21.4% 5fcdeb25-e1e5-47f6-af7f-7bea825ab3c0 2
UN 10.0.53.88 292.59 GB 256 19.5% c77904bc-10a8-49e0-b6fa-8fe8126e064c 2
UN 10.0.53.250 272.76 GB 256 20.6% ecf417f2-2e96-4b9e-bb15-06eaf948cefa 2
UN 10.0.23.75 271.24 GB 256 20.8% d8b0ab1b-65ab-46cd-b7e4-3fb3861ffb23 1
UN 10.0.23.253 302.9 GB 256 21.0% 4bb6408a-9aa0-42da-96f7-dbe0dad757bc 1
UN 10.0.23.238 326.35 GB 256 18.2% 55e33a97-e5ca-4c48-a530-a0ff6fa8edde 1
UN 10.0.53.222 247.4 GB 256 18.0% c3a6e4c2-7ab6-4f3a-a444-8d6dff2beb43 2
cfstats
Keyspace: key_space_name
Read Count: 63815118
Read Latency: 88.71912845022085 ms.
Write Count: 40802728
Write Latency: 1.1299861338192878 ms.
Pending Flushes: 0
Table: table1
SSTable count: 1269
SSTables in each level: [1, 10, 103/100, 1023/1000, 131, 0, 0, 0, 0]
Space used (live): 274263401275
Space used (total): 274263401275
Space used by snapshots (total): 0
Off heap memory used (total): 1776960519
SSTable Compression Ratio: 0.3146938954387242
Number of keys (estimate): 1472406972
Memtable cell count: 3840240
Memtable data size: 96356032
Memtable off heap memory used: 169478569
Memtable switch count: 149
Local read count: 47459095
Local read latency: 0.328 ms
Local write count: 14501016
Local write latency: 0.695 ms
Pending flushes: 0
Bloom filter false positives: 186
Bloom filter false ratio: 0.00000
Bloom filter space used: 1032396536
Bloom filter off heap memory used: 1032386384
Index summary off heap memory used: 495040742
Compression metadata off heap memory used: 80054824
Compacted partition minimum bytes: 216
Compacted partition maximum bytes: 3973
Compacted partition mean bytes: 465
Average live cells per slice (last five minutes): 0.1710125211397823
Maximum live cells per slice (last five minutes): 1.0
Average tombstones per slice (last five minutes): 0.0
Maximum tombstones per slice (last five minutes): 0.0
Table: table2
SSTable count: 93
SSTables in each level: [1, 10, 82, 0, 0, 0, 0, 0, 0]
Space used (live): 18134115541
Space used (total): 18134115541
Space used by snapshots (total): 0
Off heap memory used (total): 639297085
SSTable Compression Ratio: 0.2889927549599339
Number of keys (estimate): 102804187
Memtable cell count: 409595
Memtable data size: 492365311
Memtable off heap memory used: 529339207
Memtable switch count: 433
Local read count: 16357463
Local read latency: 345.194 ms
Local write count: 26302779
Local write latency: 1.370 ms
Pending flushes: 0
Bloom filter false positives: 4
Bloom filter false ratio: 0.00000
Bloom filter space used: 73133360
Bloom filter off heap memory used: 73132616
Index summary off heap memory used: 30985070
Compression metadata off heap memory used: 5840192
Compacted partition minimum bytes: 125
Compacted partition maximum bytes: 24601
Compacted partition mean bytes: 474
Average live cells per slice (last five minutes): 0.9915609172937249
Maximum live cells per slice (last five minutes): 1.0
Average tombstones per slice (last five minutes): 0.0
Maximum tombstones per slice (last five minutes): 0.0
tpstats
Pool Name Active Pending Completed Blocked All time blocked
MutationStage 2 0 43617272 0 0
ReadStage 45 189 64039921 0 0
RequestResponseStage 0 0 37790267 0 0
ReadRepairStage 0 0 3590974 0 0
CounterMutationStage 0 0 0 0 0
MiscStage 0 0 0 0 0
HintedHandoff 0 0 18 0 0
GossipStage 0 0 457469 0 0
CacheCleanupExecutor 0 0 0 0 0
InternalResponseStage 0 0 0 0 0
CommitLogArchiver 0 0 0 0 0
CompactionExecutor 1 1 60965 0 0
ValidationExecutor 0 0 646 0 0
MigrationStage 0 0 0 0 0
AntiEntropyStage 0 0 1938 0 0
PendingRangeCalculator 0 0 17 0 0
Sampler 0 0 0 0 0
MemtableFlushWriter 0 0 1997 0 0
MemtablePostFlush 0 0 4884 0 0
MemtableReclaimMemory 0 0 1997 0 0
Native-Transport-Requests 11 0 61321377 0 0
Message type Dropped
READ 56788
RANGE_SLICE 0
_TRACE 0
MUTATION 182
COUNTER_MUTATION 0
BINARY 0
REQUEST_RESPONSE 0
PAGED_RANGE 0
READ_REPAIR 1
netstats
Mode: NORMAL
Not sending any streams.
Read Repair Statistics:
Attempted: 2910144
Mismatch (Blocking): 0
Mismatch (Background): 229431
Pool Name Active Pending Completed
Commands n/a 0 37851777
Responses n/a 0 101128958
compactionstats
pending tasks: 0
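For reference, these are the commands the outputs above came from, run on one of the affected nodes (keyspace name as reported by cfstats):

nodetool status
nodetool cfstats key_space_name
nodetool tpstats
nodetool netstats
nodetool compactionstats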
I confirmed that there are no compactions running continuously. When taking jstack dumps on the two affected nodes, some threads keep executing the stack below over and over, as if stuck in an endless loop; my guess is that the high OS load comes from these looping threads, and that this is what makes the reads slow. (How the dump was captured is sketched after the trace.)
SharedPool-Worker-7 - priority:5 - threadId:0x00002bb0356a0150 - nativeId:0x158ef - state:RUNNABLE
stackTrace:
java.lang.Thread.State: RUNNABLE
at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
at org.apache.cassandra.utils.memory.HeapAllocator.allocate(HeapAllocator.java:34)
at org.apache.cassandra.utils.memory.AbstractAllocator.clone(AbstractAllocator.java:34)
at org.apache.cassandra.db.NativeCell.localCopy(NativeCell.java:58)
at org.apache.cassandra.db.CollationController$2.apply(CollationController.java:223)
at org.apache.cassandra.db.CollationController$2.apply(CollationController.java:220)
at com.google.common.collect.Iterators$8.transform(Iterators.java:794)
at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48)
at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:175)
at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:156)
at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:146)
at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:125)
at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:99)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:264)
at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:108)
at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:82)
at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:69)
at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:314)
at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:2001)
at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1844)
at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:353)
at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:85)
at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:47)
at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
at java.lang.Thread.run(Thread.java:748)
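For completeness, this is roughly how the dump above was captured; the PID lookup and output path are illustrative:

# find the Cassandra JVM and dump all thread stacks
CASSANDRA_PID=$(pgrep -f CassandraDaemon)
jstack "$CASSANDRA_PID" > /tmp/cassandra-threads.txt
# repeated dumps a few seconds apart show the SharedPool-Worker threads in the same stack each time
grep -A 40 'SharedPool-Worker' /tmp/cassandra-threads.txt | less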
We tried increasing the instance size of the affected nodes, suspecting they might be blocked on I/O at the OS level, but that did not help.
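In case it helps, this is the kind of check we ran on the affected nodes to test the I/O hypothesis; the sampling intervals are illustrative:

# per-device utilization and request latency, sampled every 5 seconds
iostat -x 5
# the 'b' column counts processes blocked waiting on I/O
vmstat 5
# thread-pool backlog on the node itself
nodetool tpstats | grep -E 'ReadStage|MutationStage'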
Can someone help me figure out what is wrong with these 2 nodes in the cluster?