I'm getting a ReadFailure error when running a SELECT query through the Cassandra Python driver.
My Cassandra instance is a single local node. It was installed from the tar file rather than set up as a service.
I have a keyspace named "Documents" and a table with two columns, name and object. name is of the text data type and object is of the blob data type; the blob objects are pickled Python class instances. The keyspace/table is described below:
CREATE TABLE "Documents".table (
name text PRIMARY KEY,
object blob
) WITH bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
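For context, the blobs are written along these lines. This is only a simplified sketch; the Document class below is a stand-in for my real class, and the exact insert code differs:

import pickle
from cassandra.cluster import Cluster

class Document:  # stand-in for the real pickled class (simplified)
    def __init__(self, name, payload):
        self.name = name
        self.payload = payload

cluster = Cluster(['127.0.0.1'])        # single local node
session = cluster.connect('Documents')  # case-sensitive keyspace

doc = Document('example', b'x' * 25 * 1024)  # ~25 kB payload, like my real rows
session.execute(
    "INSERT INTO table (name, object) VALUES (%s, %s)",
    (doc.name, pickle.dumps(doc)),
)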
This table contains 3509 rows, and each object is roughly 25 kB of data (so I estimate ~90 MB in the object column). I'm trying to run one simple line of Python Cassandra code:
rows = session.execute("SELECT name, object FROM table")
This query executes fine in CQLSH, but when I try it from Python I get a cassandra.ReadFailure error. Here is the specific error:
cassandra.ReadFailure: Error from server: code=1300 [Replica(s) failed to execute read] message="Operation failed - received 0 responses and 1 failures" info={'consistency': 'LOCAL_ONE', 'required_responses': 1, 'received_responses': 0, 'failures': 1}
And this is what gets generated in Cassandra's log file:
WARN [ReadStage-4] 2018-02-13 14:53:12,319 AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread Thread[ReadStage-4,10,main]: {}
java.lang.RuntimeException: java.lang.RuntimeException
at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2598) ~[apache-cassandra-3.11.1.jar:3.11.1]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_151]
at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134) [apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.11.1.jar:3.11.1]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_151]
Caused by: java.lang.RuntimeException: null
at org.apache.cassandra.io.util.DataOutputBuffer.validateReallocation(DataOutputBuffer.java:134) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.io.util.DataOutputBuffer.calculateNewSize(DataOutputBuffer.java:152) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.io.util.DataOutputBuffer.reallocate(DataOutputBuffer.java:159) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.io.util.DataOutputBuffer.doFlush(DataOutputBuffer.java:119) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:132) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:151) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.utils.ByteBufferUtil.writeWithVIntLength(ByteBufferUtil.java:296) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.db.marshal.AbstractType.writeValue(AbstractType.java:413) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.db.rows.Cell$Serializer.serialize(Cell.java:210) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$serializeRowBody$0(UnfilteredSerializer.java:248) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1242) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1197) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.db.rows.BTreeRow.apply(BTreeRow.java:172) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.db.rows.UnfilteredSerializer.serializeRowBody(UnfilteredSerializer.java:236) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:205) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:137) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:125) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:137) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:92) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:79) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:308) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:167) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:160) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:156) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:76) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:346) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1886) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2594) ~[apache-cassandra-3.11.1.jar:3.11.1]
... 5 common frames omitted
Besides this query working in CQLSH, I can also run Python queries against other, identically structured tables. The only difference is that those tables have fewer rows and less data.
I've tried changing the timeout values in the yaml file, to no avail, which makes sense since this isn't a timeout issue. I also modified the tombstone_failure_threshold variable in the yaml file, with no luck either.
How can I run queries against a large dataset without getting this error? Is there a batch_size variable I can set? Any guidance at this point would be helpful.
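For what it's worth, the closest thing I've found to a batch size in the Python driver is its fetch_size paging option. A minimal sketch of how I understand it would be used, based on the driver docs (I haven't confirmed it avoids the error above):

import pickle
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(['127.0.0.1'])
session = cluster.connect('Documents')

# Ask the server for 100 rows per page (~2.5 MB with 25 kB blobs)
# instead of the whole ~90 MB result in one response.
query = SimpleStatement("SELECT name, object FROM table", fetch_size=100)
for row in session.execute(query):  # the driver fetches further pages transparently
    obj = pickle.loads(row.object)

Alternatively, session.default_fetch_size can be set once to apply the same page size to every query. Is this the right knob, or does the ReadFailure happen regardless of paging?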