When bulk-loading data and incrementing counters based on log data, I am hitting a timeout exception. I am using the DataStax 2.0-rc2 Java driver.
Is this a case of the server not being able to keep up (i.e. a server-side configuration issue), or of the client getting tired of waiting for the server to respond? Either way, is there a simple configuration change I can make to fix this?
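For context, here is a minimal sketch of the kind of synchronous counter-increment loop that produces the stack trace below. The contact point, keyspace, table, and column names are illustrative assumptions, not the original code:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class CounterLoader {
    public static void main(String[] args) {
        // Connect with the DataStax Java driver (2.0-era API); contact point is an assumption.
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("stats");  // keyspace name is an assumption

        // Hypothetical counter table:
        //   CREATE TABLE hits (key text PRIMARY KEY, count counter);
        PreparedStatement increment = session.prepare(
                "UPDATE hits SET count = count + 1 WHERE key = ?");

        // One synchronous write per log record; Session.execute() is where the
        // WriteTimeoutException below is raised once the cluster falls behind.
        for (String key : new String[] {"host1", "host2", "host3"}) {
            session.execute(increment.bind(key));
        }
        // Cluster shutdown omitted for brevity.
    }
}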
Exception in thread "main" com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency ONE (1 replica were required but only 0 acknowledged the write)
at com.datastax.driver.core.exceptions.WriteTimeoutException.copy(WriteTimeoutException.java:54)
at com.datastax.driver.core.ResultSetFuture.extractCauseFromExecutionException(ResultSetFuture.java:271)
at com.datastax.driver.core.ResultSetFuture.getUninterruptibly(ResultSetFuture.java:187)
at com.datastax.driver.core.Session.execute(Session.java:126)
at jason.Stats.analyseLogMessages(Stats.java:91)
at jason.Stats.main(Stats.java:48)
Caused by: com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency ONE (1 replica were required but only 0 acknowledged the write)
at com.datastax.driver.core.exceptions.WriteTimeoutException.copy(WriteTimeoutException.java:54)
at com.datastax.driver.core.Responses$Error.asException(Responses.java:92)
at com.datastax.driver.core.ResultSetFuture$ResponseCallback.onSet(ResultSetFuture.java:122)
at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:224)
at com.datastax.driver.core.RequestHandler.onSet(RequestHandler.java:373)
at com.datastax.driver.core.Connection$Dispatcher.messageReceived(Connection.java:510)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency ONE (1 replica were required but only 0 acknowledged the write)
at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:53)
at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:33)
at com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:165)
at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:66)
... 21 more
One of the nodes reported this at roughly the same time:
ERROR [Native-Transport-Requests:12539] 2014-02-16 23:37:22,191 ErrorMessage.java (line 222) Unexpected exception during request
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(Unknown Source)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
at sun.nio.ch.IOUtil.read(Unknown Source)
at sun.nio.ch.SocketChannelImpl.read(Unknown Source)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:64)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Answer 0 (score: 31)
Although I don't understand the root cause of this problem, I was able to work around it by increasing the timeout value in the conf/cassandra.yaml file:
write_request_timeout_in_ms: 20000
Answer 1 (score: 20)
We hit a similar problem on a single node in an ESX cluster with SAN-attached storage (not recommended by DataStax, but we had no other options at that point).
Note: the settings below can be a big blow to the maximum performance Cassandra can achieve, but we chose a stable system over high performance.
While running iostat -xmt 1, we found high w_await times at the same moments the WriteTimeoutExceptions occurred. It turned out the memtables could not be written to disk within the default write_request_timeout_in_ms: 2000 setting.
We significantly reduced the memtable size from 512Mb (the default is 25% of the heap space, which was 2Gb in our case) to 32Mb:
# Total permitted memory to use for memtables. Cassandra will stop
# accepting writes when the limit is exceeded until a flush completes,
# and will trigger a flush based on memtable_cleanup_threshold
# If omitted, Cassandra will set both to 1/4 the size of the heap.
# memtable_heap_space_in_mb: 2048
memtable_offheap_space_in_mb: 32
We also slightly increased the write timeout to 3 seconds:
write_request_timeout_in_ms: 3000
If your IO wait times are long, also make sure data is written to disk at regular intervals:
#commitlog_sync: batch
#commitlog_sync_batch_window_in_ms: 2
#
# the other option is "periodic" where writes may be acked immediately
# and the CommitLog is simply synced every commitlog_sync_period_in_ms
# milliseconds.
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
These settings allow the memtables to stay small and be flushed frequently. The exceptions were resolved, and we survived the stress tests we ran against the system.
Answer 2 (score: 0)
It is the coordinator (so the server) that timed out waiting for acknowledgement of the write.
Answer 3 (score: -1)
It is worth double-checking your GC settings for Cassandra.
In my case I was using a semaphore to throttle asynchronous writes (see the sketch at the end of this answer) and still (sometimes) got timeouts.
It turned out I was running with unsuitable GC settings: for convenience I had been using cassandra-unit, which had the unintended consequence of running with the default VM settings. As a result we would occasionally trigger a stop-the-world GC long enough to cause a write timeout. Applying the same GC settings as the Cassandra Docker image I was running, and all was fine.
This is probably an unusual cause, but it helped me, so it seems worth recording here.
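For reference, here is a minimal sketch of the semaphore-throttled asynchronous writes mentioned above. The permit count of 256, the counter table, and the class name are illustrative assumptions, not taken from the original code:

import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;
import com.google.common.util.concurrent.FutureCallback;
import com.google.common.util.concurrent.Futures;
import java.util.concurrent.Semaphore;

public class ThrottledCounterWriter {
    // Cap on in-flight asynchronous writes; 256 is an arbitrary example value.
    private static final Semaphore IN_FLIGHT = new Semaphore(256);

    public static void incrementAll(Session session, Iterable<String> keys)
            throws InterruptedException {
        // Hypothetical counter table, same shape as assumed in the question.
        PreparedStatement increment = session.prepare(
                "UPDATE hits SET count = count + 1 WHERE key = ?");

        for (String key : keys) {
            IN_FLIGHT.acquire();  // block once 256 writes are outstanding
            ResultSetFuture future = session.executeAsync(increment.bind(key));
            Futures.addCallback(future, new FutureCallback<ResultSet>() {
                public void onSuccess(ResultSet rs) {
                    IN_FLIGHT.release();
                }
                public void onFailure(Throwable t) {
                    IN_FLIGHT.release();  // a WriteTimeoutException would land here
                }
            });
        }
    }
}

Even with client-side throttling, a long stop-the-world GC pause in the Cassandra JVM can still push individual writes past write_request_timeout_in_ms, which is what the GC settings above addressed.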