I am running Apache Spark with Java under the following setup:
1) 100 million input rows
2) spark-submit configuration: --conf spark.sql.shuffle.partitions=120 --total-executor-cores=120 --executor-memory=60GB --driver-memory=50G --executor-cores=10 --driver-cores=10
3) 1 master and 2 workers: each worker has 502 GB of memory and 88 virtual CPUs.
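For reference, here is a rough sketch of what these flags should work out to per worker. It assumes the executors are spread evenly across the two workers, which is my assumption and not something I have verified on the cluster. The class and method names are made up for illustration, and the figures cover executor heap only (no off-heap overhead, no driver).

```java
public class ExecutorLayout {
    // Executors implied by --total-executor-cores and --executor-cores.
    static int executorCount(int totalExecutorCores, int coresPerExecutor) {
        return totalExecutorCores / coresPerExecutor;
    }

    // Executors per worker, ASSUMING an even spread (rounded up).
    static int executorsPerWorker(int executors, int workers) {
        return (executors + workers - 1) / workers;
    }

    public static void main(String[] args) {
        int executors = executorCount(120, 10);            // 120 / 10 = 12
        int perWorker = executorsPerWorker(executors, 2);  // 12 / 2 = 6
        int heapPerWorkerGb = perWorker * 60;              // 6 * 60 GB = 360 GB
        int coresPerWorker = perWorker * 10;               // 6 * 10 = 60

        System.out.println("executors=" + executors);
        System.out.println("executors per worker=" + perWorker);
        System.out.println("executor heap per worker=" + heapPerWorkerGb + " GB of 502 GB");
        System.out.println("executor cores per worker=" + coresPerWorker + " of 88");
    }
}
```

Under that assumption, each worker would host 6 executors using about 360 GB of heap and 60 cores, so on paper the request fits within the 502 GB and 88 virtual CPUs available per worker.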
I first ran the job with the default value of spark.network.timeout, but it failed with this error message:
has been quiet for 120000 ms while there are outstanding requests. Assuming connection is dead; please adjust spark.network.timeout if this is wrong.
So I changed spark.network.timeout to 300000 ms.
Now there are no errors from the code, apart from the log output below:
18/12/24 06:02:39 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/12/24 08:22:57 WARN Executor: Managed memory leak detected; size = 262144 bytes, TID = 874
18/12/24 08:22:57 WARN Executor: Managed memory leak detected; size = 262144 bytes, TID = 934
18/12/24 08:22:57 WARN Executor: Managed memory leak detected; size = 262144 bytes, TID = 850
18/12/24 08:22:57 WARN Executor: Managed memory leak detected; size = 262144 bytes, TID = 898
18/12/24 08:22:57 WARN Executor: Managed memory leak detected; size = 262144 bytes, TID = 886
18/12/24 08:22:57 WARN Executor: Managed memory leak detected; size = 262144 bytes, TID = 946
18/12/24 08:22:57 WARN Executor: Managed memory leak detected; size = 262144 bytes, TID = 958
18/12/24 08:22:57 WARN Executor: Managed memory leak detected; size = 262144 bytes, TID = 910
18/12/24 08:22:57 WARN Executor: Managed memory leak detected; size = 262144 bytes, TID = 862
18/12/24 08:27:10 ERROR CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM
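The program ends at this point without reporting an error, but not all records have been processed. It produces results for only some of the records.
I cannot share the code because it is proprietary.
My question is: should I change anything in my current configuration, or is this related to my hardware?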