Couchbase cluster - exception when one node goes down

Time: 2013-12-30 14:22:43

Tags: java couchbase

I have a Couchbase cluster with 3 nodes and a Java process that writes to it. When one of the nodes goes down, I get this exception:

2013-12-30 14:03:23.259 WARN com.couchbase.client.CouchbaseConnection:  Node expected to receive data is inactive. This could be due to a failure within the cluster. Will check for updated configuration. Key without a configured node is: 980d330b2a96656a93bd08e48e6fc759135e6e6f.
Dec 30, 2013 2:03:23 PM com.couchbase.client.CouchbaseConnectionFactory resubConfigUpdate
INFO: Attempting to resubscribe for cluster config updates.
Dec 30, 2013 2:03:23 PM com.couchbase.client.CouchbaseConnectionFactory$Resubscriber run
INFO: Reconnect attempt 1, waiting 0ms
2013-12-30 14:03:24.020 INFO com.couchbase.client.CouchbaseConnection:  Reconnecting {QA sa=10.223.224.79/10.223.224.79:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0}
2013-12-30 14:03:24.261 WARN net.spy.memcached.protocol.binary.BinaryMemcachedNodeImpl:  Operation canceled because authentication or reconnection and authentication has taken more than one second to complete.
2013-12-30 14:03:24.263 WARN com.couchbase.client.CouchbaseConnection:  Node expected to receive data is inactive. This could be due to a failure within the cluster. Will check for updated configuration. Key without a configured node is: 980d330b2a96656a93bd08e48e6fc759135e6e6f.
2013-12-30 14:03:25.265 WARN net.spy.memcached.protocol.binary.BinaryMemcachedNodeImpl:  Operation canceled because authentication or reconnection and authentication has taken more than one second to complete.
441016302 [pool-5-thread-1] ERROR com.polimo.rtb.writer.CouchbaseWriter  - Error while trying to write to couchbase
java.util.concurrent.CancellationException: Cancelled
        at net.spy.memcached.internal.OperationFuture.get(OperationFuture.java:176)
        at net.spy.memcached.MemcachedClient.mutateWithDefault(MemcachedClient.java:1842)
        at net.spy.memcached.MemcachedClient.incr(MemcachedClient.java:1788)
        at com.polimo.rtb.writer.CouchbaseWriter.incrementCounterInDb(CouchbaseWriter.java:107)
        at com.polimo.rtb.writer.CouchbaseWriter.write(CouchbaseWriter.java:90)
        at com.polimo.rtb.consumer.ConsumerThread.run(ConsumerThread.java:40)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)

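I am using the standard Java client, and given the behavior described above, I feel I am missing the whole point of using a cluster: if one node fails, everything should keep working.

Am I missing some configuration (server or client)?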

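2 answers:

Answer 0: (score: 1)

Replication is configured per bucket. If no replicas are configured, then when a node fails the vBuckets hosted on that node are no longer available, so any attempt to operate on that data fails, as you are seeing.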
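To make the vBucket point concrete, here is a toy model in Java. It is only a sketch under stated assumptions: the hash (plain CRC32 modulo 1024) and the round-robin placement of active and replica copies are simplifications I made up for illustration, not Couchbase's actual mapping, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Toy model of key -> vBucket -> node ownership. NOT Couchbase's real algorithm.
final class VBucketSketch {
    // 1024 is the usual default vBucket count; treated here as an assumption.
    static final int NUM_VBUCKETS = 1024;

    // Simplified hash: CRC32 of the key, modulo the vBucket count.
    static int vbucketFor(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % NUM_VBUCKETS);
    }

    // Hypothetical placement: active copies are spread round-robin over the nodes.
    static int activeNode(int vbucket, int nodes) {
        return vbucket % nodes;
    }

    // Hypothetical placement: the single replica (if any) lives on the next node; -1 means none.
    static int replicaNode(int vbucket, int nodes, int replicas) {
        return replicas > 0 ? (vbucket + 1) % nodes : -1;
    }

    // True if the key can still be served once the down node has been failed over.
    static boolean availableAfterFailover(String key, int nodes, int replicas, int downNode) {
        int vb = vbucketFor(key);
        if (activeNode(vb, nodes) != downNode) {
            return true; // the active copy lives on a healthy node
        }
        int replica = replicaNode(vb, nodes, replicas);
        return replica != -1 && replica != downNode; // a replica can be promoted
    }
}
```

With zero replicas, every key whose active vBucket sits on the failed node is simply gone until that node returns; with one replica, a copy exists on another node that can take over.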

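For details on configuring replicas, see Creating and Editing Data Buckets in the Couchbase manual (in particular the replicas part of the "Create bucket" procedure and the Failover nodes section).

Answer 1: (score: 0)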
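Even with replicas configured, writes to the affected vBuckets can keep failing until the down node has actually been failed over (manually or by auto-failover), so the writer may want to retry for a while. A minimal sketch of that idea follows; it uses only the JDK, `RetrySketch` and `withRetry` are names I invented, and the retried operation is assumed to surface `CancellationException` the way `MemcachedClient.incr` does in the stack trace above.

```java
import java.util.concurrent.CancellationException;
import java.util.function.Supplier;

// Hypothetical helper: retry an operation that fails with CancellationException.
final class RetrySketch {
    static <T> T withRetry(Supplier<T> op, int maxAttempts, long initialBackoffMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoffMs = initialBackoffMs;
        CancellationException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.get();
            } catch (CancellationException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(backoffMs); // wait before the next attempt
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt(); // preserve the interrupt flag
                        throw e;
                    }
                    backoffMs *= 2; // exponential backoff
                }
            }
        }
        throw last; // every attempt was cancelled
    }
}
```

In the writer from the question this could wrap the increment, for example `RetrySketch.withRetry(() -> client.incr(key, 1, 1), 5, 200)`, where `client`, `key` and the argument values are placeholders. One caveat: a cancelled increment may or may not have been applied on the server, so retrying a counter update can double count.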

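How is your client configured? Do you have the IP addresses of all cluster nodes in its configuration? If the configuration contains only one IP address and that node crashes, your application cannot reach the cluster.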
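As a rough illustration of listing every node, the sketch below builds the bootstrap URI list from a set of hosts. The helper name is made up, and the `http://host:8091/pools` form is what I would expect for the 1.x Java client; treat both as assumptions and check them against your client version.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: one bootstrap URI per cluster node.
final class BootstrapSketch {
    static List<URI> bootstrapUris(String... hosts) {
        List<URI> uris = new ArrayList<>();
        for (String host : hosts) {
            // Assumed form: REST/admin port 8091 with the /pools path.
            uris.add(URI.create("http://" + host + ":8091/pools"));
        }
        return uris;
    }
}
```

The resulting list would then be handed to the client, along the lines of `new CouchbaseClient(uris, bucketName, password)`, with `bucketName` and `password` as placeholders. With all three nodes listed, the client can still fetch the cluster configuration when any single node is unreachable.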