pyspark spark-submit中的Java堆空间OutOfMemoryError?

时间:2017-12-29 08:25:54

标签: apache-spark pyspark

我的数据集大小为10GB(例如Test.txt)。

我在下面写了我的pyspark脚本(Test.py):

from pyspark import SparkConf
from pyspark.sql import SparkSession
from pyspark.sql import SQLContext
spark = SparkSession.builder.appName("FilterProduct").getOrCreate()
sc = spark.sparkContext
sqlContext = SQLContext(sc)
lines = spark.read.text("C:/Users/test/Desktop/Test.txt").rdd
lines.collect()

然后我使用以下命令执行上述脚本:

spark-submit Test.py --executor-memory  12G 

然后我收到如下错误:

17/12/29 13:27:18 INFO FileScanRDD: Reading File path: file:///C:/Users/test/Desktop/Test.txt, range: 402653184-536870912, partition values: [empty row]
17/12/29 13:27:18 INFO CodeGenerator: Code generated in 22.743725 ms
17/12/29 13:27:44 ERROR Executor: Exception in task 1.0 in stage 0.0 (TID 1)
java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:3230)
        at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
        at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
        at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
        at org.apache.spark.util.ByteBufferOutputStream.write(ByteBufferOutputStream.scala:41)
        at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1877)
        at java.io.ObjectOutputStream$BlockDataOutputStream.setBlockDataMode(ObjectOutputStream.java:1786)
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1189)
        at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
        at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:43)
        at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:383)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
17/12/29 13:27:44 ERROR Executor: Exception in task 2.0 in stage 0.0 (TID 2)
java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:3230)
        at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
        at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93

请告诉我如何解决此问题?

3 个答案:

答案 0 :(得分:1)

您可以尝试--conf "spark.driver.maxResultSize=20g"。您应该检查spark conf page.spark.apache.org/docs/latest/configuration.html上的配置。

除了这个答案,我还建议你减少你的任务结果,否则你可能会遇到序列化问题。

答案 1 :(得分:0)

如果您的apache-spark目录中有文件apache-spark/2.4.0/libexec/conf/spark-defaults.conf,其中2.4.0对应于apache-spark版本。

该文件不存在,请创建它。

然后在文件末尾插入:spark.driver.memory 12g

无需--executor-memory 12G即可解决:只需执行spark-submit Test.py

答案 2 :(得分:-2)

您是否在spark-submit时检查了jvm的最大堆大小值。如果您看到传递给spark-submit的值,则意味着您可以正确设置最大堆大小。

例如,如果您的设置为4G ./spark-submit --driver-memory 4G Test123.py 您应该在jvisualvm屏幕上看到-Xmx4G,如下所示。 enter image description here

即使您可以正确设置最大堆大小,您也可能会看到新错误results of 7 tasks (1158.5 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)

stacktrace就在这里

C:\Users\test\Desktop>spark-submit Test123.py -v --executor-memory  4G --driver-memory 10G  --conf spark.driver.maxResultSize=2g
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
18/01/12 11:00:58 INFO SparkContext: Running Spark version 2.2.0
18/01/12 11:00:58 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/01/12 11:00:58 INFO SparkContext: Submitted application: FilterProduct
18/01/12 11:00:58 INFO SecurityManager: Changing view acls to: test
18/01/12 11:00:58 INFO SecurityManager: Changing modify acls to: test
18/01/12 11:00:58 INFO SecurityManager: Changing view acls groups to:
18/01/12 11:00:58 INFO SecurityManager: Changing modify acls groups to:
18/01/12 11:00:58 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(test); groups with view permissions: Set(); users  with modify permissions: Set(test); groups with modify permissions: Set()
18/01/12 11:00:59 INFO Utils: Successfully started service 'sparkDriver' on port 63460.
18/01/12 11:00:59 INFO SparkEnv: Registering MapOutputTracker
18/01/12 11:00:59 INFO SparkEnv: Registering BlockManagerMaster
18/01/12 11:00:59 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
18/01/12 11:00:59 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
18/01/12 11:00:59 INFO DiskBlockManager: Created local directory at C:\Users\test\AppData\Local\Temp\blockmgr-49e8e874-1361-4fe3-a8f5-02beca717299
18/01/12 11:00:59 INFO MemoryStore: MemoryStore started with capacity 4.1 GB
18/01/12 11:00:59 INFO SparkEnv: Registering OutputCommitCoordinator
18/01/12 11:01:00 INFO Utils: Successfully started service 'SparkUI' on port 4040.
18/01/12 11:01:00 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.179.1:4040
18/01/12 11:01:00 INFO SparkContext: Added file file:/C:/Users/test/Desktop/Test123.py at file:/C:/Users/test/Desktop/Test123.py with timestamp 1515735060704
18/01/12 11:01:00 INFO Utils: Copying C:\Users\test\Desktop\Test123.py to C:\Users\test\AppData\Local\Temp\spark-0c5cf93c-e443-4ab1-b2ea-58e1b91fa310\userFiles-c067f30d-21b5-4f60-ab0d-7df3081b1d2a\Test123.py
18/01/12 11:01:01 INFO Executor: Starting executor ID driver on host localhost
18/01/12 11:01:01 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 63481.
18/01/12 11:01:01 INFO NettyBlockTransferService: Server created on 192.168.179.1:63481
18/01/12 11:01:01 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
18/01/12 11:01:01 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.179.1, 63481, None)
18/01/12 11:01:01 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.179.1:63481 with 4.1 GB RAM, BlockManagerId(driver, 192.168.179.1, 63481, None)
18/01/12 11:01:01 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.179.1, 63481, None)
18/01/12 11:01:01 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.179.1, 63481, None)
18/01/12 11:01:01 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/C:/Users/test/Desktop/spark-warehouse/').
18/01/12 11:01:01 INFO SharedState: Warehouse path is 'file:/C:/Users/test/Desktop/spark-warehouse/'.
18/01/12 11:01:02 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
18/01/12 11:01:06 INFO FileSourceStrategy: Pruning directories with:
18/01/12 11:01:06 INFO FileSourceStrategy: Post-Scan Filters:
18/01/12 11:01:06 INFO FileSourceStrategy: Output Data Schema: struct<value: string>
18/01/12 11:01:06 INFO FileSourceScanExec: Pushed Filters:
18/01/12 11:01:07 INFO CodeGenerator: Code generated in 385.116819 ms
18/01/12 11:01:07 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 277.7 KB, free 4.1 GB)
18/01/12 11:01:08 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 23.4 KB, free 4.1 GB)
18/01/12 11:01:08 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.179.1:63481 (size: 23.4 KB, free: 4.1 GB)
18/01/12 11:01:08 INFO SparkContext: Created broadcast 0 from javaToPython at NativeMethodAccessorImpl.java:0
18/01/12 11:01:08 INFO FileSourceScanExec: Planning scan with bin packing, max size: 134217728 bytes, open cost is considered as scanning 4194304 bytes.
18/01/12 11:01:08 INFO SparkContext: Starting job: collect at C:/Users/test/Desktop/Test123.py:15
18/01/12 11:01:09 INFO DAGScheduler: Got job 0 (collect at C:/Users/test/Desktop/Test123.py:15) with 72 output partitions
18/01/12 11:01:09 INFO DAGScheduler: Final stage: ResultStage 0 (collect at C:/Users/test/Desktop/Test123.py:15)
18/01/12 11:01:09 INFO DAGScheduler: Parents of final stage: List()
18/01/12 11:01:09 INFO DAGScheduler: Missing parents: List()
18/01/12 11:01:09 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[3] at javaToPython at NativeMethodAccessorImpl.java:0), which has no missing parents
18/01/12 11:01:09 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 7.2 KB, free 4.1 GB)
18/01/12 11:01:09 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 3.9 KB, free 4.1 GB)
18/01/12 11:01:09 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.179.1:63481 (size: 3.9 KB, free: 4.1 GB)
18/01/12 11:01:09 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
18/01/12 11:01:09 INFO DAGScheduler: Submitting 72 missing tasks from ResultStage 0 (MapPartitionsRDD[3] at javaToPython at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14))
18/01/12 11:01:09 INFO TaskSchedulerImpl: Adding task set 0.0 with 72 tasks
18/01/12 11:01:09 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 5290 bytes)
18/01/12 11:01:09 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 5290 bytes)
18/01/12 11:01:09 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, localhost, executor driver, partition 2, PROCESS_LOCAL, 5290 bytes)
18/01/12 11:01:09 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, localhost, executor driver, partition 3, PROCESS_LOCAL, 5290 bytes)
18/01/12 11:01:09 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
18/01/12 11:01:09 INFO Executor: Running task 2.0 in stage 0.0 (TID 2)
18/01/12 11:01:09 INFO Executor: Running task 3.0 in stage 0.0 (TID 3)
18/01/12 11:01:09 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
18/01/12 11:01:09 INFO Executor: Fetching file:/C:/Users/test/Desktop/Test123.py with timestamp 1515735060704
18/01/12 11:01:09 INFO Utils: C:\Users\test\Desktop\Test123.py has been previously copied to C:\Users\test\AppData\Local\Temp\spark-0c5cf93c-e443-4ab1-b2ea-58e1b91fa310\userFiles-c067f30d-21b5-4f60-ab0d-7df3081b1d2a\Test123.py
18/01/12 11:01:09 INFO FileScanRDD: Reading File path: file:///C:/Users/test/Desktop/copy/SaleLine/SaleLine, range: 134217728-268435456, partition values: [empty row]
18/01/12 11:01:09 INFO FileScanRDD: Reading File path: file:///C:/Users/test/Desktop/copy/SaleLine/SaleLine, range: 402653184-536870912, partition values: [empty row]
18/01/12 11:01:09 INFO FileScanRDD: Reading File path: file:///C:/Users/test/Desktop/copy/SaleLine/SaleLine, range: 0-134217728, partition values: [empty row]
18/01/12 11:01:09 INFO FileScanRDD: Reading File path: file:///C:/Users/test/Desktop/copy/SaleLine/SaleLine, range: 268435456-402653184, partition values: [empty row]
18/01/12 11:01:09 INFO CodeGenerator: Code generated in 19.618119 ms
18/01/12 11:01:33 INFO MemoryStore: Block taskresult_1 stored as bytes in memory (estimated size 198.1 MB, free 3.9 GB)
18/01/12 11:01:33 INFO BlockManagerInfo: Added taskresult_1 in memory on 192.168.179.1:63481 (size: 198.1 MB, free: 3.9 GB)
18/01/12 11:01:33 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 207722505 bytes result sent via BlockManager)
18/01/12 11:01:33 INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID 4, localhost, executor driver, partition 4, PROCESS_LOCAL, 5290 bytes)
18/01/12 11:01:34 INFO MemoryStore: Block taskresult_0 stored as bytes in memory (estimated size 198.0 MB, free 3.7 GB)
18/01/12 11:01:34 INFO Executor: Running task 4.0 in stage 0.0 (TID 4)
18/01/12 11:01:34 INFO BlockManagerInfo: Added taskresult_0 in memory on 192.168.179.1:63481 (size: 198.0 MB, free: 3.7 GB)
18/01/12 11:01:34 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 207614031 bytes result sent via BlockManager)
18/01/12 11:01:34 INFO FileScanRDD: Reading File path: file:///C:/Users/test/Desktop/copy/SaleLine/SaleLine, range: 536870912-671088640, partition values: [empty row]
18/01/12 11:01:34 INFO TaskSetManager: Starting task 5.0 in stage 0.0 (TID 5, localhost, executor driver, partition 5, PROCESS_LOCAL, 5290 bytes)
18/01/12 11:01:34 INFO Executor: Running task 5.0 in stage 0.0 (TID 5)
18/01/12 11:01:34 INFO FileScanRDD: Reading File path: file:///C:/Users/test/Desktop/copy/SaleLine/SaleLine, range: 671088640-805306368, partition values: [empty row]
18/01/12 11:01:34 INFO MemoryStore: Block taskresult_2 stored as bytes in memory (estimated size 198.1 MB, free 3.5 GB)
18/01/12 11:01:35 INFO BlockManagerInfo: Added taskresult_2 in memory on 192.168.179.1:63481 (size: 198.1 MB, free: 3.5 GB)
18/01/12 11:01:35 INFO Executor: Finished task 2.0 in stage 0.0 (TID 2). 207773776 bytes result sent via BlockManager)
18/01/12 11:01:35 INFO TaskSetManager: Starting task 6.0 in stage 0.0 (TID 6, localhost, executor driver, partition 6, PROCESS_LOCAL, 5290 bytes)
18/01/12 11:01:35 INFO Executor: Running task 6.0 in stage 0.0 (TID 6)
18/01/12 11:01:35 INFO FileScanRDD: Reading File path: file:///C:/Users/test/Desktop/copy/SaleLine/SaleLine, range: 805306368-939524096, partition values: [empty row]
18/01/12 11:01:35 INFO MemoryStore: Block taskresult_3 stored as bytes in memory (estimated size 198.0 MB, free 3.3 GB)
18/01/12 11:01:35 INFO BlockManagerInfo: Added taskresult_3 in memory on 192.168.179.1:63481 (size: 198.0 MB, free: 3.3 GB)
18/01/12 11:01:35 INFO Executor: Finished task 3.0 in stage 0.0 (TID 3). 207626541 bytes result sent via BlockManager)
18/01/12 11:01:35 INFO TaskSetManager: Starting task 7.0 in stage 0.0 (TID 7, localhost, executor driver, partition 7, PROCESS_LOCAL, 5290 bytes)
18/01/12 11:01:35 INFO Executor: Running task 7.0 in stage 0.0 (TID 7)
18/01/12 11:01:35 INFO FileScanRDD: Reading File path: file:///C:/Users/test/Desktop/copy/SaleLine/SaleLine, range: 939524096-1073741824, partition values: [empty row]
18/01/12 11:01:35 INFO TransportClientFactory: Successfully created connection to /192.168.179.1:63481 after 345 ms (0 ms spent in bootstraps)
18/01/12 11:01:40 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 30921 ms on localhost (executor driver) (1/72)
18/01/12 11:01:40 INFO BlockManagerInfo: Removed taskresult_0 on 192.168.179.1:63481 in memory (size: 198.0 MB, free: 3.5 GB)
18/01/12 11:01:42 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 33147 ms on localhost (executor driver) (2/72)
18/01/12 11:01:42 INFO BlockManagerInfo: Removed taskresult_1 on 192.168.179.1:63481 in memory (size: 198.1 MB, free: 3.7 GB)
18/01/12 11:01:44 INFO TaskSetManager: Finished task 3.0 in stage 0.0 (TID 3) in 34693 ms on localhost (executor driver) (3/72)
18/01/12 11:01:44 INFO BlockManagerInfo: Removed taskresult_3 on 192.168.179.1:63481 in memory (size: 198.0 MB, free: 3.9 GB)
18/01/12 11:01:46 INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID 2) in 36860 ms on localhost (executor driver) (4/72)
18/01/12 11:01:46 INFO BlockManagerInfo: Removed taskresult_2 on 192.168.179.1:63481 in memory (size: 198.1 MB, free: 4.1 GB)
18/01/12 11:01:55 INFO MemoryStore: Block taskresult_4 stored as bytes in memory (estimated size 198.1 MB, free 3.9 GB)
18/01/12 11:01:55 INFO BlockManagerInfo: Added taskresult_4 in memory on 192.168.179.1:63481 (size: 198.1 MB, free: 3.9 GB)
18/01/12 11:01:55 INFO Executor: Finished task 4.0 in stage 0.0 (TID 4). 207685862 bytes result sent via BlockManager)
18/01/12 11:01:55 INFO TaskSetManager: Starting task 8.0 in stage 0.0 (TID 8, localhost, executor driver, partition 8, PROCESS_LOCAL, 5290 bytes)
18/01/12 11:01:55 INFO Executor: Running task 8.0 in stage 0.0 (TID 8)
18/01/12 11:01:55 INFO FileScanRDD: Reading File path: file:///C:/Users/test/Desktop/copy/SaleLine/SaleLine, range: 1073741824-1207959552, partition values: [empty row]
18/01/12 11:01:56 INFO MemoryStore: Block taskresult_5 stored as bytes in memory (estimated size 198.1 MB, free 3.7 GB)
18/01/12 11:01:56 INFO BlockManagerInfo: Added taskresult_5 in memory on 192.168.179.1:63481 (size: 198.1 MB, free: 3.7 GB)
18/01/12 11:01:56 INFO Executor: Finished task 5.0 in stage 0.0 (TID 5). 207774348 bytes result sent via BlockManager)
18/01/12 11:01:57 INFO TaskSetManager: Starting task 9.0 in stage 0.0 (TID 9, localhost, executor driver, partition 9, PROCESS_LOCAL, 5290 bytes)
18/01/12 11:01:57 ERROR TaskSetManager: Total size of serialized results of 6 tasks (1188.5 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)
18/01/12 11:01:57 INFO Executor: Running task 9.0 in stage 0.0 (TID 9)
18/01/12 11:01:57 INFO BlockManagerInfo: Removed taskresult_5 on 192.168.179.1:63481 in memory (size: 198.1 MB, free: 3.9 GB)
18/01/12 11:01:57 INFO FileScanRDD: Reading File path: file:///C:/Users/test/Desktop/copy/SaleLine/SaleLine, range: 1207959552-1342177280, partition values: [empty row]
18/01/12 11:01:57 INFO TaskSchedulerImpl: Cancelling stage 0
18/01/12 11:01:57 INFO TaskSchedulerImpl: Stage 0 was cancelled
18/01/12 11:01:57 INFO Executor: Executor is trying to kill task 9.0 in stage 0.0 (TID 9), reason: stage cancelled
18/01/12 11:01:57 INFO Executor: Executor is trying to kill task 6.0 in stage 0.0 (TID 6), reason: stage cancelled
18/01/12 11:01:57 INFO Executor: Executor killed task 9.0 in stage 0.0 (TID 9), reason: stage cancelled
18/01/12 11:01:57 INFO DAGScheduler: ResultStage 0 (collect at C:/Users/test/Desktop/Test123.py:15) failed in 47.966 s due to Job aborted due to stage failure: Total size of serialized results of 6 tasks (1188.5 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)
18/01/12 11:01:57 INFO MemoryStore: Block taskresult_7 stored as bytes in memory (estimated size 198.1 MB, free 3.7 GB)
18/01/12 11:01:57 INFO Executor: Executor is trying to kill task 7.0 in stage 0.0 (TID 7), reason: stage cancelled
18/01/12 11:01:57 INFO BlockManagerInfo: Added taskresult_7 in memory on 192.168.179.1:63481 (size: 198.1 MB, free: 3.7 GB)
18/01/12 11:01:57 INFO Executor: Executor is trying to kill task 8.0 in stage 0.0 (TID 8), reason: stage cancelled
18/01/12 11:01:57 INFO Executor: Finished task 7.0 in stage 0.0 (TID 7). 207695395 bytes result sent via BlockManager)
18/01/12 11:01:57 WARN TaskSetManager: Lost task 9.0 in stage 0.0 (TID 9, localhost, executor driver): TaskKilled (stage cancelled)
18/01/12 11:01:57 INFO DAGScheduler: Job 0 failed: collect at C:/Users/test/Desktop/Test123.py:15, took 48.556068 s
18/01/12 11:01:57 INFO Executor: Executor killed task 8.0 in stage 0.0 (TID 8), reason: stage cancelled
18/01/12 11:01:57 ERROR TaskSetManager: Total size of serialized results of 7 tasks (1386.5 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)
18/01/12 11:01:57 WARN TaskSetManager: Lost task 8.0 in stage 0.0 (TID 8, localhost, executor driver): TaskKilled (stage cancelled)
18/01/12 11:01:57 INFO BlockManagerInfo: Removed taskresult_7 on 192.168.179.1:63481 in memory (size: 198.1 MB, free: 3.9 GB)
Traceback (most recent call last):
  File "C:/Users/test/Desktop/Test123.py", line 15, in <module>
    lines.collect()
  File "D:\workspace\spark-2.2.0-bin-hadoop2.7\python\lib\pyspark.zip\pyspark\rdd.py", line 809, in collect
  File "D:\workspace\spark-2.2.0-bin-hadoop2.7\python\lib\py4j-0.10.4-src.zip\py4j\java_gateway.py", line 1133, in __call__
  File "D:\workspace\spark-2.2.0-bin-hadoop2.7\python\lib\pyspark.zip\pyspark\sql\utils.py", line 63, in deco
  File "D:\workspace\spark-2.2.0-bin-hadoop2.7\python\lib\py4j-0.10.4-src.zip\py4j\protocol.py", line 319, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.spark.SparkException: Job aborted due to stage failure: Total size of serialized results of 6 tasks (1188.5 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1499)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1487)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1486)
        at scala.collection.m18/01/12 11:01:57 INFO MemoryStore: Block taskresult_6 stored as bytes in memory (estimated size 198.1 MB, free 3.5 GB)
utable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1486)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
        at scala.Option.foreach(Option.scala:257)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:814)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1714)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1669)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1658)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:630)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2022)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2043)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2062)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2087)
        at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:936)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
        at org.apache.spark.rdd.RDD.collect(RDD.scala:935)
        at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:458)
        at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.ref18/01/12 11:01:57 INFO BlockManagerInfo: Added taskresult_6 in memory on 192.168.179.1:63481 (size: 198.1 MB, free: 3.7 GB)
lect.Method.invoke(Method.java:483)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:280)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:214)
        at java.lang.Thread.run(Thread.java:745)

18/01/12 11:01:57 INFO SparkContext: Invoking stop() from shutdown hook
18/01/12 11:01:57 INFO Executor: Finished task 6.0 in stage 0.0 (TID 6). 207681096 bytes result sent via BlockManager)
18/01/12 11:01:57 ERROR TaskSetManager: Total size of serialized results of 8 tasks (1584.6 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)
18/01/12 11:01:57 INFO SparkUI: Stopped Spark web UI at http://192.168.179.1:4040
18/01/12 11:01:57 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
18/01/12 11:01:57 INFO BlockManagerInfo: Removed taskresult_6 on 192.168.179.1:63481 in memory (size: 198.1 MB, free: 3.9 GB)
18/01/12 11:01:57 ERROR Utils: Uncaught exception in thread task-result-getter-1
java.lang.InterruptedException
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
        at scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:202)
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:218)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
        at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:201)
        at org.apache.spark.network.BlockTransferService.fetchBlockSync(BlockTransferService.scala:105)
        at org.apache.spark.storage.BlockManager.getRemoteBytes(BlockManager.scala:642)
        at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:82)
        at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$1.apply(TaskResultGetter.scala:63)
        at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$1.apply(TaskResultGetter.scala:63)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1954)
        at org.apache.spark.scheduler.TaskResultGetter$$anon$3.run(TaskResultGetter.scala:62)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Exception in thread "task-result-getter-1" java.lang.Error: java.lang.InterruptedException
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1148)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.InterruptedException
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
        at scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:202)
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:218)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
        at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:201)
        at org.apache.spark.network.BlockTransferService.fetchBlockSync(BlockTransferService.scala:105)
        at org.apache.spark.storage.BlockManager.getRemoteBytes(BlockManager.scala:642)
        at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:82)
        at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$1.apply(TaskResultGetter.scala:63)
        at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$1.apply(TaskResultGetter.scala:63)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1954)
        at org.apache.spark.scheduler.TaskResultGetter$$anon$3.run(TaskResultGetter.scala:62)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        ... 2 more
18/01/12 11:01:57 ERROR LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerBlockUpdated(BlockUpdatedInfo(BlockManagerId(driver, 192.168.179.1, 63481, None),taskresult_6,StorageLevel(1 replicas),0,0))
18/01/12 11:01:58 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
18/01/12 11:01:58 ERROR TransportRequestHandler: Error sending result ChunkFetchSuccess{streamChunkId=StreamChunkId{streamId=882113328004, chunkIndex=0}, buffer=org.apache.spark.storage.BlockManagerManagedBuffer@f00f35c} to /192.168.179.1:63484; closing connection
java.nio.channels.ClosedChannelException
        at io.netty.channel.AbstractChannel$AbstractUnsafe.close(...)(Unknown Source)
18/01/12 11:01:58 INFO MemoryStore: MemoryStore cleared
18/01/12 11:01:58 INFO BlockManager: BlockManager stopped
18/01/12 11:01:58 INFO BlockManagerMaster: BlockManagerMaster stopped
18/01/12 11:01:58 ERROR TransportResponseHandler: Still have 1 requests outstanding when connection from /192.168.179.1:63481 is closed
18/01/12 11:01:58 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
18/01/12 11:01:58 INFO RetryingBlockFetcher: Retrying fetch (1/3) for 1 outstanding blocks after 5000 ms
18/01/12 11:01:58 INFO SparkContext: Successfully stopped SparkContext
18/01/12 11:01:58 INFO ShutdownHookManager: Shutdown hook called
18/01/12 11:01:58 INFO ShutdownHookManager: Deleting directory C:\Users\test\AppData\Local\Temp\spark-0c5cf93c-e443-4ab1-b2ea-58e1b91fa310\pyspark-90628145-a2e4-437f-bbe5-ef41ed4ba64f
18/01/12 11:01:58 INFO ShutdownHookManager: Deleting directory C:\Users\test\AppData\Local\Temp\spark-0c5cf93c-e443-4ab1-b2ea-58e1b91fa310

您可以像spark-submit --driver-memory 2g --conf "spark.driver.maxResultSize=2g" Test.py

一样调整spark.driver.maxResultSize值

即使您收到错误,也可以通过此命令共享详细信息。 spark-submit --driver-memory 10g --conf "spark.driver.maxResultSize=2g" Test.py所以我们可以看到您的平台特定条件。