How to explain the long delay before other executors pick up tasks?

Time: 2019-03-30 11:15:53

Tags: apache-spark spark-graphx

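How would you explain the following results of a Spark GraphX Pregel execution? A single GraphX Pregel job is run on a Spark cluster of constant size.

The code is available on GitHub, although I do not expect anyone to dig into the details, since it is fairly complex.

My parameters: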

bin/spark-submit \
    --master k8s://http://localhost:8001 \
    --deploy-mode cluster \
    --name iga-adi-graph \
    --driver-cores 3 \
    --driver-memory 5G \
    --executor-cores 3 \
    --executor-memory 6G \
    --conf spark.executor.instances=10 \
    --conf spark.default.parallelism=30 \
    --conf spark.kubernetes.executor.request.cores=3000m \
    --conf spark.kubernetes.executor.limit.cores=3000m \
    --conf spark.kubernetes.memoryOverheadFactor=0.2 \
    --conf spark.kubernetes.container.image.pullPolicy=Always \
    --conf spark.kubernetes.container.image=kbhit/iga-adi-pregel \
    --conf spark.scheduler.minRegisteredResourcesRatio=1.0 \
    --conf spark.scheduler.maxRegisteredResourcesWaitingTime=300s \
    --files /opt/metrics.properties \
    --conf spark.metrics.conf=/opt/metrics.properties \
    --jars /opt/metrics-influxdb.jar,/opt/spark-influx-sink.jar \
    --conf spark.driver.extraClassPath=spark-influx-sink.jar:metrics-influxdb.jar \
    --conf spark.executor.extraClassPath=/opt/spark-influx-sink.jar:/opt/metrics-influxdb.jar \
    --conf spark.executor.extraJavaOptions="" \
    --conf spark.driver.extraJavaOptions="-Dproblem.size=192 -Dproblem.steps=1" \
    --conf spark.kryo.unsafe=true \
    --conf spark.kryoserializer.buffer=32m \
    --conf spark.network.timeout=360s \
    --conf spark.memory.fraction=0.5 \
    --conf spark.cleaner.periodicGC.interval=10s \
    --conf spark.locality.wait.node=0 \
    --conf spark.locality.wait=9999999 \
    --conf spark.kubernetes.executor.volumes.emptyDir.mycheckpoints.mount.path=/tmp/checkpoints \
    --conf spark.kubernetes.driver.volumes.emptyDir.mycheckpoints.mount.path=/tmp/checkpoints \
    --class edu.agh.kboom.iga.adi.graph.IgaAdiPregelSolver \
    local:///opt/iga-adi-pregel.jar &
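
As a rough sanity check of the parameters above, the following sketch recomputes a few derived quantities: the number of concurrent task slots, the approximate memory requested per executor pod, and the locality wait expressed in minutes. The variable names are hypothetical. It assumes that the memory overhead is simply the executor memory multiplied by spark.kubernetes.memoryOverheadFactor, and that a spark.locality.wait value given without a unit is interpreted as milliseconds; both assumptions should be checked against the documentation of the Spark version in use.

```shell
#!/bin/sh
# Values copied from the spark-submit command above.
executor_instances=10               # spark.executor.instances
executor_cores=3                    # --executor-cores
default_parallelism=30              # spark.default.parallelism
executor_memory_mib=$((6 * 1024))   # --executor-memory 6G
overhead_factor_percent=20          # memoryOverheadFactor=0.2, as a percentage for integer math
locality_wait_ms=9999999            # assumption: a unitless spark.locality.wait means milliseconds

# Task slots available when every executor has registered.
total_slots=$((executor_instances * executor_cores))

# Assumption: overhead = executor memory * factor (integer MiB, rounded down).
overhead_mib=$((executor_memory_mib * overhead_factor_percent / 100))
pod_memory_mib=$((executor_memory_mib + overhead_mib))

# Locality wait in whole minutes.
locality_wait_min=$((locality_wait_ms / 1000 / 60))

echo "task slots: $total_slots (default parallelism: $default_parallelism)"
echo "memory per executor pod: $pod_memory_mib MiB (overhead: $overhead_mib MiB)"
echo "locality wait: about $locality_wait_min minutes"
```

Under these assumptions, the number of task slots equals spark.default.parallelism, so a stage with the default number of partitions could in principle occupy every executor core at once.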

[Figure: Long warm up]

0 answers:

No answers.