Spark Streaming fails due to exceeding memory limits

Time: 2016-06-15 05:18:31

Tags: apache-spark spark-streaming rdd google-cloud-dataproc

I am running Spark (1.6.1) Streaming on YARN with 8 nodes. The job reads files from HDFS and writes to ES.

Transformations:

  1. mapToPair
  2. reduceByKey
  3. mapToPair

Output operations:

  1. foreachRDD

While the process is running, storage on the receiver node keeps building up. I tried reducing the streaming context's remember duration to 60 secs.

      Total Uptime: 3.6 h
      Scheduling Mode: FIFO 
      Completed Jobs: 12798
      Completed Stages: 20592
      

YARN log errors:

      16/06/14 14:41:08 WARN org.apache.spark.scheduler.cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Container killed by YARN for exceeding memory limits. 40.1 GB of 40 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.
      16/06/14 14:41:08 ERROR org.apache.spark.scheduler.cluster.YarnScheduler: Lost executor 1 on spark-metrics-0-w-7.c.orion-0010.internal: Container killed by YARN for exceeding memory limits. 40.1 GB of 40 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.
      16/06/14 14:41:08 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 2.0 (TID 70, spark-metrics-0-w-7.c.orion-0010.internal): ExecutorLostFailure (executor 1 exited caused by one of the running tasks) Reason: Container killed by YARN for exceeding memory limits. 40.1 GB of 40 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.
      16/06/14 14:41:08 WARN org.apache.spark.network.server.TransportChannelHandler: Exception in connection from spark-metrics-0-w-7.c.orion-0010.internal/10.240.1.110:56101
      java.io.IOException: Connection reset by peer
          at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
          at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
          at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
          at sun.nio.ch.IOUtil.read(IOUtil.java:192)
          at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
          at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
          at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
          at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
          at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
          at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
          at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
          at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
          at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
          at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
          at java.lang.Thread.run(Thread.java:745)
      16/06/14 14:41:08 ERROR org.apache.spark.network.client.TransportResponseHandler: Still have 1 requests outstanding when connection from spark-metrics-0-w-7.c.orion-0010.internal/10.240.1.110:56101 is closed
      16/06/14 14:41:08 WARN org.apache.spark.storage.BlockManagerMaster: Failed to remove RDD 63963 - Connection reset by peer
      java.io.IOException: Connection reset by peer
          at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
          at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
          at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
          at sun.nio.ch.IOUtil.read(IOUtil.java:192)
          at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
          at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
          at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
          at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
          at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
          at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
          at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
          at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
          at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
          at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
          at java.lang.Thread.run(Thread.java:745)
      16/06/14 14:41:08 ERROR org.apache.spark.scheduler.cluster.YarnScheduler: Lost an executor 1 (already removed): Pending loss reason.
      16/06/14 14:41:08 ERROR org.apache.spark.streaming.scheduler.JobScheduler: Error running job streaming job 1465915267000 ms.0
      java.io.IOException: Connection reset by peer
          at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
          at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
          at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
          at sun.nio.ch.IOUtil.read(IOUtil.java:192)
          at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
          at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
          at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
          at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
          at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
          at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
          at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
          at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
          at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
          at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
          at java.lang.Thread.run(Thread.java:745)
      Exception in thread "main" java.io.IOException: Connection reset by peer
          at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
          at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
          at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
          at sun.nio.ch.IOUtil.read(IOUtil.java:192)
          at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
          at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
          at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
          at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
          at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
          at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
          at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
          at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
          at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
          at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
          at java.lang.Thread.run(Thread.java:745)
      16/06/14 14:41:10 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
      16/06/14 14:41:10 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
      16/06/14 14:41:10 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
      Job output is complete
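
The warning's hint about `spark.yarn.executor.memoryOverhead` can be put in numbers. The following is only a rough sketch, assuming the Spark 1.6 behavior on YARN where the container request is the executor memory plus an overhead that defaults to the larger of 384 MB and 10% of the executor memory. The function names are hypothetical (not part of Spark's API), and YARN's rounding of requests to its allocation increment is not modeled.

```python
# Hypothetical helpers; the names are illustrative, not part of Spark's API.
MIN_OVERHEAD_MB = 384    # minimum overhead documented for Spark 1.6 on YARN
OVERHEAD_FACTOR = 0.10   # default fraction of executor memory in Spark 1.6


def default_overhead_mb(executor_memory_mb):
    """Overhead used when spark.yarn.executor.memoryOverhead is not set."""
    return max(MIN_OVERHEAD_MB, int(executor_memory_mb * OVERHEAD_FACTOR))


def container_request_mb(executor_memory_mb, overhead_mb=None):
    """Memory requested from YARN per executor: executor memory plus overhead."""
    if overhead_mb is None:
        overhead_mb = default_overhead_mb(executor_memory_mb)
    return executor_memory_mb + overhead_mb
```

Under these assumptions, setting the overhead explicitly (for example `--conf spark.yarn.executor.memoryOverhead=4096` on `spark-submit`, value in MB) enlarges the container request without changing the executor heap, which is what the warning suggests trying. It would not by itself explain why storage keeps accumulating.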
      

0 answers:

No answers