java.lang.OutOfMemoryError: Java heap space when writing a Parquet file

Asked: 2015-09-21 19:12:49

Tags: apache-spark apache-spark-sql pyspark

I am getting a java.lang.OutOfMemoryError: Java heap space error. I have reduced the code to just a few steps: open a Parquet file, then write it back out, and I still get this error. My machine has 128 GB of RAM and TBs of HDD. I have tried the steps below with spark-shell, with pyspark, and from Python code.

This is what I did:

  1. Open spark-shell (default settings, no additional parameters)
  2. val asd = sqlContext.read.parquet("/spark/data/xxx/yyy")
  3. asd.registerTempTable("yyy")
  4. asd.write.parquet("data/test")
  5. I get the following error:

    15/09/21 19:03:29 INFO CodecPool: Got brand-new compressor [.gz] 
    15/09/21 19:03:29 INFO CodecPool: Got brand-new compressor [.gz] 
    15/09/21 19:03:29 ERROR InsertIntoHadoopFsRelation: Aborting task. 
    java.lang.OutOfMemoryError: Java heap space 
    15/09/21 19:03:30 ERROR InsertIntoHadoopFsRelation: Aborting task. 
    java.lang.OutOfMemoryError: Java heap space 
    15/09/21 19:03:29 INFO CodecPool: Got brand-new compressor [.gz] 
    15/09/21 19:03:29 INFO ParquetOutputFormat: Validation is off 
    15/09/21 19:03:31 INFO ParquetOutputFormat: Writer version is: PARQUET_1_0 
    15/09/21 19:03:31 ERROR DefaultWriterContainer: Task attempt attempt_201509211903_0009_m_000004_0 aborted. 
    15/09/21 19:03:31 ERROR Executor: Exception in task 4.0 in stage 9.0 (TID 293) 
    org.apache.spark.SparkException: Task failed while writing rows. 
            at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation.org$apache$spark$sql$sources$InsertIntoHadoopFsRelation$$writeRows$1(commands.scala:191) 
            at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation$$anonfun$insert$1.apply(commands.scala:160) 
            at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation$$anonfun$insert$1.apply(commands.scala:160) 
            at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63) 
            at org.apache.spark.scheduler.Task.run(Task.scala:70) 
            at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) 
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
            at java.lang.Thread.run(Thread.java:745) 
    Caused by: java.lang.OutOfMemoryError: Java heap space 
    15/09/21 19:03:31 ERROR InsertIntoHadoopFsRelation: Aborting task. 
    java.lang.OutOfMemoryError: Java heap space 
    15/09/21 19:03:31 ERROR DefaultWriterContainer: Task attempt attempt_201509211903_0009_m_000002_0 aborted. 
    15/09/21 19:03:31 ERROR Executor: Exception in task 2.0 in stage 9.0 (TID 291) 
    org.apache.spark.SparkException: Task failed while writing rows. 
            at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation.org$apache$spark$sql$sources$InsertIntoHadoopFsRelation$$writeRows$1(commands.scala:191) 
            at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation$$anonfun$insert$1.apply(commands.scala:160) 
            at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation$$anonfun$insert$1.apply(commands.scala:160) 
            at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63) 
            at org.apache.spark.scheduler.Task.run(Task.scala:70) 
            at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) 
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
            at java.lang.Thread.run(Thread.java:745) 
    Caused by: java.lang.OutOfMemoryError: Java heap space 
    15/09/21 19:03:31 ERROR InsertIntoHadoopFsRelation: Aborting task. 
    java.lang.OutOfMemoryError: Java heap space 
    15/09/21 19:03:31 INFO CodecPool: Got brand-new compressor [.gz] 
    15/09/21 19:03:32 ERROR DefaultWriterContainer: Task attempt attempt_201509211903_0009_m_000003_0 aborted. 
    15/09/21 19:03:32 ERROR Executor: Exception in task 3.0 in stage 9.0 (TID 292) 
    org.apache.spark.SparkException: Task failed while writing rows. 
            at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation.org$apache$spark$sql$sources$InsertIntoHadoopFsRelation$$writeRows$1(commands.scala:191) 
            at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation$$anonfun$insert$1.apply(commands.scala:160) 
            at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation$$anonfun$insert$1.apply(commands.scala:160) 
            at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63) 
            at org.apache.spark.scheduler.Task.run(Task.scala:70) 
            at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) 
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
            at java.lang.Thread.run(Thread.java:745) 
    Caused by: java.lang.OutOfMemoryError: Java heap space 
    15/09/21 19:03:32 ERROR InsertIntoHadoopFsRelation: Aborting task. 
    java.lang.OutOfMemoryError: Java heap space 
    15/09/21 19:03:32 ERROR DefaultWriterContainer: Task attempt attempt_201509211903_0009_m_000006_0 aborted. 
    15/09/21 19:03:32 INFO CodecPool: Got brand-new compressor [.gz] 
    15/09/21 19:03:32 ERROR InsertIntoHadoopFsRelation: Aborting task. 
    java.lang.OutOfMemoryError: Java heap space 
    15/09/21 19:03:33 ERROR Executor: Exception in task 6.0 in stage 9.0 (TID 295) 
    org.apache.spark.SparkException: Task failed while writing rows. 
            at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation.org$apache$spark$sql$sources$InsertIntoHadoopFsRelation$$writeRows$1(commands.scala:191) 
            at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation$$anonfun$insert$1.apply(commands.scala:160) 
            at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation$$anonfun$insert$1.apply(commands.scala:160) 
            at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63) 
            at org.apache.spark.scheduler.Task.run(Task.scala:70) 
            at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) 
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
            at java.lang.Thread.run(Thread.java:745) 
    Caused by: java.lang.OutOfMemoryError: Java heap space 
    15/09/21 19:03:33 ERROR InsertIntoHadoopFsRelation: Aborting task. 
    java.lang.OutOfMemoryError: Java heap space 
    15/09/21 19:03:33 ERROR InsertIntoHadoopFsRelation: Aborting task. 
    java.lang.OutOfMemoryError: Java heap space 
    15/09/21 19:03:33 ERROR DefaultWriterContainer: Task attempt attempt_201509211903_0009_m_000008_0 aborted. 
    15/09/21 19:03:33 ERROR Executor: Exception in task 8.0 in stage 9.0 (TID 297) 
    org.apache.spark.SparkException: Task failed while writing rows.
    
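For reference, step 1 above uses the default JVM sizes, which in Spark 1.x are far smaller than the machine's 128 GB of RAM (the driver heap defaults to roughly 512 MB-1 GB depending on version, and in `local` mode the driver JVM also runs the writer tasks). A common first check is to relaunch the shell with an explicitly larger heap; the flags below are standard `spark-shell` options, but the `8g` values are arbitrary example sizes, not taken from the question:

```shell
# Sketch: relaunch with a larger heap before re-running the read/write steps.
# --driver-memory matters most in local mode, where tasks run in the driver JVM;
# --executor-memory applies when running against a cluster.
spark-shell --driver-memory 8g --executor-memory 8g
```

The same flags work for `pyspark`, or can be set as `spark.driver.memory` / `spark.executor.memory` in `spark-defaults.conf`.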

0 Answers:

No answers yet.