I'm getting a java.lang.OutOfMemoryError: Java heap space error. I've reduced the code to just three steps: open a parquet file, write it back out, and I hit this error. My machine has 128 GB of RAM and TBs of disk. I've tried the same steps from spark-shell, from pyspark, and from standalone Python code.
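For reference, launching the three environments mentioned above typically looks like this (the post doesn't show the exact invocations used; these are illustrative, and `repro.py` is a placeholder name):

```shell
# Hypothetical launch commands -- not taken from the original post.
spark-shell                 # Scala REPL
pyspark                     # Python REPL
spark-submit repro.py       # standalone Python script
```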
Here is what I did:
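The original snippet isn't included in the post, but a minimal PySpark sketch of the read-then-write repro described above might look like the following (file paths and the app name are placeholders, not from the post; the Spark 1.x `SQLContext` API matches the 2015-era logs below):

```python
from pyspark import SparkContext
from pyspark.sql import SQLContext

# Hypothetical repro -- paths and app name are placeholders.
sc = SparkContext(appName="parquet-repro")
sqlContext = SQLContext(sc)

df = sqlContext.read.parquet("input.parquet")   # step 1: open the parquet file
df.write.parquet("output.parquet")              # steps 2-3: write it back out
```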
And I get the following error:
15/09/21 19:03:29 INFO CodecPool: Got brand-new compressor [.gz]
15/09/21 19:03:29 INFO CodecPool: Got brand-new compressor [.gz]
15/09/21 19:03:29 ERROR InsertIntoHadoopFsRelation: Aborting task.
java.lang.OutOfMemoryError: Java heap space
15/09/21 19:03:30 ERROR InsertIntoHadoopFsRelation: Aborting task.
java.lang.OutOfMemoryError: Java heap space
15/09/21 19:03:29 INFO CodecPool: Got brand-new compressor [.gz]
15/09/21 19:03:29 INFO ParquetOutputFormat: Validation is off
15/09/21 19:03:31 INFO ParquetOutputFormat: Writer version is: PARQUET_1_0
15/09/21 19:03:31 ERROR DefaultWriterContainer: Task attempt attempt_201509211903_0009_m_000004_0 aborted.
15/09/21 19:03:31 ERROR Executor: Exception in task 4.0 in stage 9.0 (TID 293)
org.apache.spark.SparkException: Task failed while writing rows.
at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation.org$apache$spark$sql$sources$InsertIntoHadoopFsRelation$$writeRows$1(commands.scala:191)
at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation$$anonfun$insert$1.apply(commands.scala:160)
at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation$$anonfun$insert$1.apply(commands.scala:160)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
at org.apache.spark.scheduler.Task.run(Task.scala:70)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
15/09/21 19:03:31 ERROR InsertIntoHadoopFsRelation: Aborting task.
java.lang.OutOfMemoryError: Java heap space
... (the same "Task failed while writing rows" stack trace, each caused by java.lang.OutOfMemoryError: Java heap space, repeats for task attempts m_000002, m_000003, m_000006, and m_000008 -- tasks 2.0, 3.0, 6.0, and 8.0 in stage 9.0)