Pyspark java.net.SocketTimeoutException:接受超时

时间:2019-01-02 15:26:03

标签: pyspark pyspark-sql

我有以下代码:

a)生成本地Spark实例:

# Load data from local machine into dataframe
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("Basic").master("local[*]").config("spark.network.timeout","50s").config("spark.executor.heartbeatInterval", "50s").getOrCreate();

b)生成一个熊猫日期范围,然后将其转换为PySpark数据框

import numpy as np
# Get the min and max dates
minDate, maxDate = df2.select(f.min("MonthlyTransactionDate"), f.max("MonthlyTransactionDate")).first()
d = pd.date_# Create aggregated dataset for analysis
df.registerTempTable("tmp")
sqlstr="SELECT " + groupBy + ", sum(" +amount +") as Amount FROM tmp GROUP BY " + groupBy
df2 = spark.sql(sqlstr)
spark.catalog.dropTempView('tmp')
display(df2.groupBy(monthlyTransactionDate).sum("Amount").orderBy(monthlyTransactionDate))range(start=minDate, end=maxDate, freq='MS')    
tmp = pd.DataFrame(d)
df3 = spark.createDataFrame(tmp)

但是我得到的是超时错误:

Py4JJavaError: An error occurred while calling o932.collectToPython.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 39.0 failed 1 times, most recent failure: Lost task 0.0 in stage 39.0 (TID 1239, localhost, executor driver): org.apache.spark.SparkException: Python worker failed to connect back.
    at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:170)
    at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:97)
    at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:117)
    at ...
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketTimeoutException: Accept timed out
    at java.net.DualStackPlainSocketImpl.waitForNewConnection(Native Method)
    at java.net.DualStackPlainSocketImpl.socketAccept(DualStackPlainSocketImpl.java:135)
    at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
    at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:199)
    at java.net.ServerSocket.implAccept(ServerSocket.java:545)
    at java.net.ServerSocket.accept(ServerSocket.java:513)
    at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:164)
    ... 32 more

Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1887)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1875)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1874)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1874)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926)
    at ...
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.spark.SparkException: Python worker failed to connect back.
    at ... java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    ... 1 more
Caused by: java.net.SocketTimeoutException: Accept timed out
    at java.net.DualStackPlainSocketImpl.waitForNewConnection(Native Method)
    at java.net.DualStackPlainSocketImpl.socketAccept(DualStackPlainSocketImpl.java:135)
    at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
    at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:199)
    at java.net.ServerSocket.implAccept(ServerSocket.java:545)
    at java.net.ServerSocket.accept(ServerSocket.java:513)
    at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:164)
    ... 32 more

该代码在Azure上有效,但在我的本地计算机上无效。我将超时参数增加到了上面代码的第一部分,但这样做没有帮助。我创建的数据帧甚至还没有那么大。

0 个答案:

没有答案