Cassandra OutOfMemoryError during a range query

Time: 2013-12-12 01:01:59

Tags: cassandra

I have a table that contains 1 MB blobs.

CREATE TABLE blobs_1 (
    key text,
    version bigint,
    chunk int,
    object_blob blob,
    object_size int,
    PRIMARY KEY (key, version, chunk)
);

Each blob is spread across roughly 100 chunks. The following query causes an OutOfMemoryError:

SELECT object_size FROM blobs_1 WHERE key = 'key1' AND version = 1;

Here is the error:

java.lang.OutOfMemoryError: Java heap space
        at org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:344)
        at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
        at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:355)
        at org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:124)
        at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:85)
        at org.apache.cassandra.db.Column$1.computeNext(Column.java:75)
        at org.apache.cassandra.db.Column$1.computeNext(Column.java:64)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
        at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:88)
        at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:37)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
        at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:82)
        at org.apache.cassandra.db.columniterator.LazyColumnIterator.computeNext(LazyColumnIterator.java:82)
        at org.apache.cassandra.db.columniterator.LazyColumnIterator.computeNext(LazyColumnIterator.java:59)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
        at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:157)
        at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:140)
        at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:144)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:123)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:97)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
        at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:185)
        at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122)
        at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
        at org.apache.cassandra.db.RowIteratorFactory$2.getReduced(RowIteratorFactory.java:101)
        at org.apache.cassandra.db.RowIteratorFactory$2.getReduced(RowIteratorFactory.java:75)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:115)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:98)

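2 answers:

Answer 0 (score: 2)

You need to reduce the page size. The default page size is tuned for ordinary small columns and rows. With large blobs, each page holds far more data, so you need a smaller page size.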

https://github.com/datastax/java-driver/blob/2.0/driver-core/src/main/java/com/datastax/driver/core/Statement.java#L234
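
One way to pick a smaller page size is to divide a per-page memory budget by an upper bound on the size of one row. The sketch below is only an illustration of that arithmetic: the class name PageSizing, the method fetchSizeFor, and the example figures (an 8 MB budget, 1 MB rows) are assumptions, not something from the question or the driver. The trailing comment shows where the result would go, using the driver's Statement.setFetchSize(int). As far as I know, automatic paging needs Cassandra 2.0 or later.

```java
// Sketch only: PageSizing and fetchSizeFor are hypothetical helpers,
// not part of the DataStax driver or of Cassandra.
final class PageSizing {
    private PageSizing() {}

    /**
     * Returns how many rows fit into one page under a memory budget,
     * given an assumed upper bound on the size of a single row.
     * Always returns at least 1 so the query can still make progress.
     */
    static int fetchSizeFor(long pageBudgetBytes, long maxRowBytes) {
        if (pageBudgetBytes <= 0 || maxRowBytes <= 0) {
            throw new IllegalArgumentException("budget and row size must be positive");
        }
        long rows = pageBudgetBytes / maxRowBytes;
        // Clamp to the int range expected by the driver's fetch size setting.
        return (int) Math.max(1L, Math.min(rows, (long) Integer.MAX_VALUE));
    }
}

// With the DataStax Java driver 2.0 on the classpath, the value would be applied like this:
//   Statement stmt = new SimpleStatement(
//       "SELECT object_size FROM blobs_1 WHERE key = 'key1' AND version = 1");
//   stmt.setFetchSize(PageSizing.fetchSizeFor(8L * 1024 * 1024, 1024 * 1024));
```

The budget and the row-size bound have to be measured for your own data. The helper only turns those two numbers into a row count.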

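Answer 1 (score: 0)

The error occurs because Cassandra deserializes more data than necessary when reading a single column of the table (at least in Cassandra 1.2; this may have been improved in the 2.0 branch).

To work around this, you can introduce a separate table for the metadata (size, etc.). It will slow down writes, but it will greatly improve read performance.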
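
A separate metadata table could look roughly like the sketch below, written as CQL strings in a small Java holder class. The table name blobs_1_meta, the class name BlobMetadataCql, and the choice of one metadata row per (key, version) are assumptions made for illustration. The answer only says to keep metadata such as the size in its own table.

```java
// Sketch only: the table name blobs_1_meta and this holder class are hypothetical.
final class BlobMetadataCql {
    private BlobMetadataCql() {}

    // One small row per (key, version); no blob column, so reading the size
    // never touches the chunk data stored in blobs_1.
    static final String CREATE_META_TABLE =
        "CREATE TABLE blobs_1_meta ("
        + " key text,"
        + " version bigint,"
        + " object_size int,"
        + " PRIMARY KEY (key, version))";

    // Extra write issued alongside the chunk inserts into blobs_1.
    static final String INSERT_META =
        "INSERT INTO blobs_1_meta (key, version, object_size) VALUES (?, ?, ?)";

    // Replaces the original size lookup against blobs_1.
    static final String SELECT_SIZE =
        "SELECT object_size FROM blobs_1_meta WHERE key = ? AND version = ?";
}
```

The extra INSERT on every write is the cost the answer mentions. In exchange, the size lookup reads a single narrow row.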