Streaming error at the final stage of decommission

Time: 2014-06-02 06:30:39

Tags: cassandra datastax-enterprise datastax

I ran into the following exception at the final stage of decommissioning a node:

Exception in thread "main" java.lang.RuntimeException: java.util.concurrent.ExecutionException: org.apache.cassandra.streaming.StreamException: Stream failed
    at org.apache.cassandra.service.StorageService.unbootstrap(StorageService.java:2946)
    at org.apache.cassandra.service.StorageService.decommission(StorageService.java:2903)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
    at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
    at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
    at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
    at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
    at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
    at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
    at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
    at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
    at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
    at sun.reflect.GeneratedMethodAccessor25.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
    at sun.rmi.transport.Transport$1.run(Transport.java:177)
    at sun.rmi.transport.Transport$1.run(Transport.java:174)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: org.apache.cassandra.streaming.StreamException: Stream failed
    at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
    at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
    at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
    at org.apache.cassandra.service.StorageService.unbootstrap(StorageService.java:2941)
    ... 36 more
Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
    at org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
    at com.google.common.util.concurrent.Futures$4.run(Futures.java:1160)
    at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
    at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
    at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
    at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
    at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:216)
    at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:191)
    at org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:331)
    at org.apache.cassandra.streaming.StreamSession.convict(StreamSession.java:600)
    at org.apache.cassandra.gms.FailureDetector.interpret(FailureDetector.java:237)
    at org.apache.cassandra.gms.Gossiper.doStatusCheck(Gossiper.java:643)
    at org.apache.cassandra.gms.Gossiper.access$700(Gossiper.java:64)
    at org.apache.cassandra.gms.Gossiper$GossipTask.run(Gossiper.java:170)
    at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:75)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    ... 3 more

This exception trace comes from the nodetool decommission command itself. No exception was logged in the system.log of the node being decommissioned.

On the receiving node, the exception is as follows:

 INFO [NonPeriodicTasks:1] 2014-06-02 04:40:53,101 SecondaryIndexManager.java (line 146) Index build of [myks.mycf] complete
ERROR [NonPeriodicTasks:1] 2014-06-02 04:40:53,240 CassandraDaemon.java (line 198) Exception in thread Thread[NonPeriodicTasks:1,5,main]
java.lang.RuntimeException: Outgoing stream handler has been closed
    at org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:170)
    at org.apache.cassandra.streaming.StreamSession.maybeCompleted(StreamSession.java:620)
    at org.apache.cassandra.streaming.StreamSession.taskCompleted(StreamSession.java:566)
    at org.apache.cassandra.streaming.StreamReceiveTask$OnCompletionRunnable.run(StreamReceiveTask.java:120)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Short background

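I have 4 nodes in my Solr DC, including the one I am decommissioning. Two of the remaining three nodes finished streaming and index building a few days ago; the last (and slowest) one finished its index build a few hours ago. As that node's log above shows, the exception occurred right after the index build completed.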

Question

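Can I assume the node was decommissioned successfully despite the exception? As far as I can tell, all the files the last node was supposed to receive were transferred successfully (nodetool netstats showed 100% for every file before the exception occurred). I believe the streaming error relates only to a failure to close the streaming session: I noticed that even though streaming had finished days earlier, the session stayed open until the index build completed (netstats kept showing some output throughout the days the index build was running). I need someone to confirm whether this is correct and whether I can safely delete the data files on the decommissioned node.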

Some other information

  1. DSE 4.0.3(Cassandra 2.0.7)
  2. Vnodes enabled
  3. CentOS 6 x86_64
  4. nodetool status and nodetool gossipinfo still show the decommissioned node as "LEAVING"
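
For anyone who wants to script the check in item 4, here is a minimal Python sketch that reads `nodetool status` output and reports each node's state. It assumes the usual two-letter prefix at the start of each node row (U/D for Up/Down, then N/L/J/M for Normal/Leaving/Joining/Moving). The sample text, addresses, and the helper name `node_states` are made up for illustration and are not output from this cluster.

```python
import re

# Second letter of the two-letter code printed at the start of each node row.
STATE_NAMES = {"N": "NORMAL", "L": "LEAVING", "J": "JOINING", "M": "MOVING"}


def node_states(status_output):
    """Map each node address in `nodetool status` output to (is_up, state)."""
    states = {}
    for line in status_output.splitlines():
        # Node rows look like "UL  10.0.0.4  ..."; header lines do not match.
        match = re.match(r"^([UD])([NLJM])\s+(\S+)", line)
        if match:
            up, state, address = match.groups()
            states[address] = (up == "U", STATE_NAMES[state])
    return states


# Illustrative sample only (hypothetical addresses and host IDs).
SAMPLE = """\
Datacenter: Solr
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns   Host ID   Rack
UN  10.0.0.1   120.5 GB   256     25.0%  aaaa      rack1
UL  10.0.0.4   98.2 GB    256     25.0%  dddd      rack1
"""

for address, (is_up, state) in node_states(SAMPLE).items():
    print(address, "up" if is_up else "down", state)
```

Against a live cluster, the same function could be fed the text returned by `subprocess.run(["nodetool", "status"], capture_output=True, text=True).stdout`.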

1 answer:

Answer 0: (score: 0)

I think you hit one of the many streaming bugs that were around in the 2.0 days. Possibly even this one:

https://issues.apache.org/jira/browse/CASSANDRA-8343

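If it isn't that one, there are several others it could be. Either way, my main recommendation is to get this cluster onto the latest 4.8 release. If that isn't practical in the short term for operational reasons, I would at least push for the latest 4.0. There really have been a lot of fixes between the version you are on and the current one: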

https://docs.datastax.com/en/datastax_enterprise/4.0/datastax_enterprise/RNdse40.html

The list of fixes in 4.0.7 alone is an eye-opener!