Spark dropping SparkListenerEvent because there is no remaining room in the event queue

Asked: 2017-05-22 08:53:50

Tags: python apache-spark pyspark cloudera-cdh

I am running my Spark jobs with pyspark, and when I try to execute the following Python script I get the errors shown below.

from pyspark.sql.functions import count, sum, stddev_pop, mean

# sqlContext is assumed to already exist (e.g. from the pyspark shell)
comp_df = sqlContext.sql('SELECT * FROM default.cdr_data')
# weekly aggregates per number/type
df1 = comp_df.groupBy('number', 'type', 'week').agg(sum('callduration').alias('call_sum'), count('callduration').alias('call_count'), sum('iscompethot').alias('call_count_comp'))
# roll up across weeks; df1 exposes 'call_count_comp', not 'call_count_competitor'
df2 = df1.groupBy('number', 'type').agg((stddev_pop('call_sum') / mean('call_sum')).alias('coefficiant_of_variance'), sum('call_count').alias('call_count'), sum('call_count_comp').alias('call_count_comp'))
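
If this is run as a standalone script rather than in the pyspark shell, sqlContext has to be created first. A minimal sketch, assuming Spark 1.x with Hive support on CDH (this part is not shown in the question, and the app name is only a placeholder):

# Sketch only: creating sqlContext for a standalone run.
# The pyspark shell already provides this object.
from pyspark import SparkConf, SparkContext
from pyspark.sql import HiveContext

conf = SparkConf().setAppName('cdr_aggregation')  # placeholder app name
sc = SparkContext(conf=conf)
sqlContext = HiveContext(sc)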

Errors and warnings:

ERROR scheduler.LiveListenerBus: Dropping SparkListenerEvent because no remaining room in event queue. This likely means one of the SparkListeners is too slow and cannot keep up with the rate at which tasks are being started by the scheduler.
WARN scheduler.LiveListenerBus: Dropped 1 SparkListenerEvents since Thu Jan 01 05:30:00 IST 1970

This is followed by repeated errors:

ERROR cluster.YarnScheduler: Lost executor 149 on slv1.cdh-prod.com: Slave lost
WARN scheduler.TaskSetManager: Lost task 12.0 in stage 1.0 (TID 88088, slv1.cdh-prod.com, executor 149): ExecutorLostFailure (executor 149 exited caused by one of the running tasks) Reason: Slave lost
ERROR client.TransportClient: Failed to send RPC 5181111296686128218 to slv2.cdh-prod.com/192.168.x.xx:57156: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
ERROR cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Sending RequestExecutors(83360,88076,Map(slv3.cdh-prod.com -> 88076, slv1.cdh-prod.com -> 88076, slv2.cdh-prod.com -> 88076),Set()) to AM was unsuccessful
java.io.IOException: Failed to send RPC 5181111296686128218 to slv2.cdh-prod.com/192.168.x.xx:57156: java.nio.channels.ClosedChannelException
        at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:239)
        at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:226)
        at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
        at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:567)
        at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424)
        at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:801)
        at io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:699)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1122)
        at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:633)
        at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:32)
        at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:908)
        at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:960)
        at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:893)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.channels.ClosedChannelException

I have tried a few solutions, such as increasing the value of spark.scheduler.listenerbus.eventqueue.size, but that did not work. I even tried with a small dataset, but I still get the errors.
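
For reference, a minimal sketch of how that property can be set. The value 100000 is only an illustrative example, and the setting must be applied before the SparkContext is created; it can equally be passed via spark-submit --conf:

from pyspark import SparkConf, SparkContext

# Sketch: raising the listener bus queue size above the default.
# 100000 is an arbitrary example value, not a recommendation.
conf = (SparkConf()
        .setAppName('cdr_aggregation')  # placeholder app name
        .set('spark.scheduler.listenerbus.eventqueue.size', '100000'))
sc = SparkContext(conf=conf)
# Equivalent at submit time:
#   spark-submit --conf spark.scheduler.listenerbus.eventqueue.size=100000 my_script.py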

0 Answers:

No answers yet.