Why does a small data size cause a "GC overhead limit exceeded" exception?

Time: 2016-04-20 02:42:48

Tags: java python apache-spark garbage-collection

I use Spark for some computation. Basically I do two things:

  1. New files arrive in a folder periodically.
  2. I turn each new file into a data frame and union it into the previous data frame.

  (You may ask why I read them in a loop. I do it for a reason: the files do not appear all at once; they arrive periodically, so I cannot read them all up front. Streaming could handle this, but I do not want to use it, because with Streaming I would need to set a long window, which is not easy to debug and test.)
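The polling pattern in the steps above can be sketched outside Spark. This is a minimal illustration, using a local temporary directory standing in for the HDFS folder and a seen-set to detect newly arrived files (all names here are illustrative, not from the original code):

```python
import os
import tempfile

def new_files(directory, seen):
    """Return files in `directory` not seen before, and mark them as seen."""
    current = set(os.listdir(directory))
    fresh = sorted(current - seen)
    seen.update(fresh)
    return fresh

# Demo: a temporary local directory stands in for the HDFS '/test' folder.
d = tempfile.mkdtemp()
seen = set()
open(os.path.join(d, "a.csv"), "w").close()
print(new_files(d, seen))   # ['a.csv']
open(os.path.join(d, "b.csv"), "w").close()
print(new_files(d, seen))   # ['b.csv'] -- only the newly arrived file
```

Each call returns only the files that appeared since the last poll, which is the behavior the loop below relies on.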

    The code is as follows:

    # Imports assumed by this snippet (sc and sqlContext are created elsewhere)
    from hdfs import InsecureClient
    from pyspark.sql import Row
    from pyspark.sql import functions as func

    # Get the file list in the HDFS directory
    client = InsecureClient('http://10.79.148.184:50070')
    file_list = client.list('/test')
    
    df_total = None
    counter = 0
    for file in file_list:
        counter += 1
    
        # turn each file (CSV format) into data frame
        lines = sc.textFile("/test/%s" % file)
        parts = lines.map(lambda l: l.split(","))
        rows = parts.map(lambda p: Row(router=p[0], interface=int(p[1]), protocol=p[7],bit=int(p[10])))
        df = sqlContext.createDataFrame(rows)
    
        # do some transform on the data frame
        df_protocol = df.groupBy(['protocol']).agg(func.sum('bit').alias('bit'))
    
        # add the current data frame to previous data frame set
        if not df_total:
            df_total = df_protocol
        else:
            df_total = df_total.unionAll(df_protocol)
    
        # cache the df_total
        df_total.cache()
        if counter % 5 == 0:
            df_total.rdd.checkpoint()
    
        # get the df_total information
        df_total.show()
    

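For reference, the per-line parsing inside the `map` above can be checked in plain Python. The field indices (0=router, 1=interface, 7=protocol, 10=bit) are taken from the code; the sample line itself is made up:

```python
# A made-up sample line with 11 comma-separated fields, matching the
# indices used in the Row(...) call: 0=router, 1=interface, 7=protocol, 10=bit.
line = "r1,2,x,x,x,x,x,tcp,x,x,1024"
p = line.split(",")
row = {"router": p[0], "interface": int(p[1]), "protocol": p[7], "bit": int(p[10])}
print(row)  # {'router': 'r1', 'interface': 2, 'protocol': 'tcp', 'bit': 1024}
```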
    I know df_total could become large over time. But in fact, the exception is raised before it gets anywhere near that point.

    At around 30 iterations, the code throws the "GC overhead limit exceeded" exception. The files are very small, so even after 300 iterations the total data size would only be a few MB. I have no idea why it throws the GC error.
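For scale, here is a toy model (plain Python, not Spark) of how the loop accumulates nested unions: each iteration wraps the previous result in one more node, so the structure the driver has to track grows linearly with the number of files even though the data itself stays small.

```python
# Toy model: each unionAll wraps the previous plan in a new Union node,
# so the nested structure gets one level deeper per iteration.
class Union:
    def __init__(self, left, right):
        self.left, self.right = left, right

def depth(plan):
    """Depth of the left spine of the nested-union structure."""
    if isinstance(plan, Union):
        return 1 + depth(plan.left)
    return 1

plan = "file_0"
for i in range(1, 30):          # 30 iterations, like the failing run
    plan = Union(plan, "file_%d" % i)

print(depth(plan))  # 30
```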

    The exception is as follows:

    Exception in thread "dispatcher-event-loop-0" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.lang.Integer.toString(Integer.java:331)
        at java.lang.Integer.toString(Integer.java:739)
        at java.lang.String.valueOf(String.java:2854)
        at scala.collection.mutable.StringBuilder.append(StringBuilder.scala:197)
        at org.apache.spark.storage.RDDBlockId.name(BlockId.scala:53)
        at org.apache.spark.storage.BlockId.equals(BlockId.scala:46)
        at java.util.HashMap.getEntry(HashMap.java:471)
        at java.util.HashMap.get(HashMap.java:421)
        at org.apache.spark.storage.BlockManagerMasterEndpoint.org$apache$spark$storage$BlockManagerMasterEndpoint$$getLocations(BlockManagerMasterEndpoint.scala:371)
        at org.apache.spark.storage.BlockManagerMasterEndpoint$$anonfun$org$apache$spark$storage$BlockManagerMasterEndpoint$$getLocationsMultipleBlockIds$1.apply(BlockManagerMasterEndpoint.scala:376)
        at org.apache.spark.storage.BlockManagerMasterEndpoint$$anonfun$org$apache$spark$storage$BlockManagerMasterEndpoint$$getLocationsMultipleBlockIds$1.apply(BlockManagerMasterEndpoint.scala:376)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
        at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
        at org.apache.spark.storage.BlockManagerMasterEndpoint.org$apache$spark$storage$BlockManagerMasterEndpoint$$getLocationsMultipleBlockIds(BlockManagerMasterEndpoint.scala:376)
        at org.apache.spark.storage.BlockManagerMasterEndpoint$$anonfun$receiveAndReply$1.applyOrElse(BlockManagerMasterEndpoint.scala:72)
        at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:104)
        at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:204)
        at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
        at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:215)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
    16/04/20 09:52:00 ERROR TaskSchedulerImpl: Lost executor 0 on ES01: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
    16/04/20 09:52:12 ERROR TransportRequestHandler: Error sending result RpcResponse{requestId=4721950849479578179, body=NioManagedBuffer{buf=java.nio.HeapByteBuffer[pos=0 lim=47 cap=47]}} to ES01/10.79.148.184:53059; closing connection
    io.netty.handler.codec.EncoderException: java.lang.OutOfMemoryError: Java heap space
        at io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:107)
        at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:633)
        at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:691)
        at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:626)
        at io.netty.handler.timeout.IdleStateHandler.write(IdleStateHandler.java:284)
        at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:633)
        at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:32)
        at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:908)
        at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:960)
        at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:893)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        at java.lang.Thread.run(Thread.java:744)
    Caused by: java.lang.OutOfMemoryError: Java heap space
        at io.netty.buffer.PoolArena$HeapArena.newChunk(PoolArena.java:602)
        at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:228)
        at io.netty.buffer.PoolArena.allocate(PoolArena.java:204)
        at io.netty.buffer.PoolArena.allocate(PoolArena.java:132)
        at io.netty.buffer.PooledByteBufAllocator.newHeapBuffer(PooledByteBufAllocator.java:256)
        at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:136)
        at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:127)
        at org.apache.spark.network.protocol.MessageEncoder.encode(MessageEncoder.java:77)
        at org.apache.spark.network.protocol.MessageEncoder.encode(MessageEncoder.java:33)
        at io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:89)
        ... 13 more
    

0 Answers:

There are no answers yet.