Cassandra 2.1 OutOfMemory

Date: 2017-06-11 21:07:02

Tags: java cassandra heap thrift cassandra-2.1

I have a Cassandra node with 64 GB of RAM (16 GB heap, G1GC), and occasionally this happens:

ERROR [Thrift:74] 2017-06-11 13:20:25,710 CassandraDaemon.java:229 - Exception in thread Thread[Thrift:74,5,main]
java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:3236) ~[na:1.8.0_45]
        at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:118) ~[na:1.8.0_45]
        at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93) ~[na:1.8.0_45]
        at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153) ~[na:1.8.0_45]
        at org.apache.thrift.transport.TFramedTransport.write(TFramedTransport.java:146) ~[libthrift-0.9.2.jar:0.9.2]
        at org.apache.thrift.protocol.TBinaryProtocol.writeBinary(TBinaryProtocol.java:211) ~[libthrift-0.9.2.jar:0.9.2]
        at org.apache.cassandra.thrift.Column$ColumnStandardScheme.write(Column.java:678) ~[apache-cassandra-thrift-2.1.13.jar:2.1.13]
        at org.apache.cassandra.thrift.Column$ColumnStandardScheme.write(Column.java:611) ~[apache-cassandra-thrift-2.1.13.jar:2.1.13]
        at org.apache.cassandra.thrift.Column.write(Column.java:538) ~[apache-cassandra-thrift-2.1.13.jar:2.1.13]
        at org.apache.cassandra.thrift.ColumnOrSuperColumn$ColumnOrSuperColumnStandardScheme.write(ColumnOrSuperColumn.java:673) ~[apache-cassandra-thrift-2.1.13.jar:2.1.13]
        at org.apache.cassandra.thrift.ColumnOrSuperColumn$ColumnOrSuperColumnStandardScheme.write(ColumnOrSuperColumn.java:607) ~[apache-cassandra-thrift-2.1.13.jar:2.1.13]
        at org.apache.cassandra.thrift.ColumnOrSuperColumn.write(ColumnOrSuperColumn.java:517) ~[apache-cassandra-thrift-2.1.13.jar:2.1.13]
        at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.write(Cassandra.java:14729) ~[apache-cassandra-thrift-2.1.13.jar:2.1.13]
        at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.write(Cassandra.java:14633) ~[apache-cassandra-thrift-2.1.13.jar:2.1.13]
        at org.apache.cassandra.thrift.Cassandra$multiget_slice_result.write(Cassandra.java:14563) ~[apache-cassandra-thrift-2.1.13.jar:2.1.13]
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53) ~[libthrift-0.9.2.jar:0.9.2]
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[libthrift-0.9.2.jar:0.9.2]
        at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:205) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_45]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_45]
        at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_45]

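Increasing the heap seems like the obvious response, but I am reluctant to do that without knowing the cause.

It looks like the node is trying to write and there is not enough space on the heap. Here is the jconsole view of the node at the time of the crash: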

jconsole_output
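
A note on reading the trace: the failing allocation is in `Arrays.copyOf`, called from `ByteArrayOutputStream.grow`, while a `multiget_slice` result is being written into the `TFramedTransport` buffer. A `ByteArrayOutputStream` grows by allocating a larger array and copying the old contents into it, so the old and new arrays are both live for a moment. The sketch below only illustrates that growth pattern on a small scale; the class name `BufferGrowthDemo`, the helper `capacityAfterWriting`, and the chunk sizes are made up for illustration and are not part of Cassandra or Thrift.

```java
import java.io.ByteArrayOutputStream;

public class BufferGrowthDemo {

    /** ByteArrayOutputStream subclass that exposes the length of the protected backing array. */
    static class InspectableBuffer extends ByteArrayOutputStream {
        InspectableBuffer(int initialCapacity) {
            super(initialCapacity);
        }

        int capacity() {
            return buf.length; // 'buf' is a protected field of ByteArrayOutputStream
        }
    }

    /** Writes 'chunks' chunks of 'chunkSize' bytes and returns the final backing-array length. */
    static int capacityAfterWriting(int initialCapacity, int chunkSize, int chunks) {
        InspectableBuffer out = new InspectableBuffer(initialCapacity);
        byte[] chunk = new byte[chunkSize];
        for (int i = 0; i < chunks; i++) {
            // Each write may trigger grow(), which copies the data into a larger array.
            out.write(chunk, 0, chunk.length);
        }
        return out.capacity();
    }

    public static void main(String[] args) {
        int written = 10 * 100;
        int capacity = capacityAfterWriting(32, 100, 10);
        System.out.println("bytes written: " + written + ", backing array length: " + capacity);
    }
}
```

If this reading is right, the backing array can end up noticeably larger than the payload, which would suggest checking the size of the `multiget_slice` responses (keys per request and columns per row) in the heap dump before simply raising the heap.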

We are using KairosDB, and this node shows roughly 5k writes/second (kairosdb.datastore.write_size).

In the meantime, I have set CASSANDRA_HEAPDUMP_DIR to a location where writes are allowed, so that I can learn more.

Some configuration variables:

  • key_cache_size_in_mb: 256
  • row_cache_size_in_mb: 0
  • concurrent_reads: 128
  • concurrent_writes: 64
  • concurrent_counter_writes: 64
  • memtable_allocation_type: offheap_objects
  • concurrent_compactors: 1
  • compaction_throughput_mb_per_sec: 48

Any ideas/suggestions/pointers?

Thanks!

Edit: Another node died, and this time the heap output points to JMX?? screenshot

0 answers