Question

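I am running a PySpark script in a Zeppelin v0.7.3 notebook. In one paragraph, the script writes data from a dataframe to parquet files in a Blob folder. The files are partitioned by country. The dataframe has 99,452,829 rows. Once the script has been running for 1 hour, it hits this error: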

Error with 400 StatusCode: "requirement failed: Session isn't active."

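The default interpreter for my notebook is jdbc. I read about the timeout lifecycle manager (TimeoutLifecycleManager), added zeppelin.interpreter.lifecyclemanager.timeout.threshold to the interpreter settings, and set it to 7200000, but I still get the error once the run reaches 1 hour (at 33% processing completion).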

After the 1-hour timeout, I checked the Blob folder: parquet files had been written to Blob successfully, and they were indeed partitioned by country.

The script I am running to write the DF to parquet in Blob is below:

trdpn_cntry_fct_denom_df.write.format("parquet").partitionBy("CNTRY_ID").mode("overwrite").save("wasbs://tradepanelpoc@blobasbackupx2066561.blob.core.windows.net/cbls/hdi/trdpn_cntry_fct_denom_df.parquet")

Is this a Zeppelin timeout issue? How can the limit be extended beyond 1 hour of runtime? Thanks for your help.

Answer 1

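The timeout lifecycle manager is only available from version 0.8 onward, so setting zeppelin.interpreter.lifecyclemanager.timeout.threshold on v0.7.3 most likely has no effect.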

This looks like a problem with PySpark. Try the solution from "Pyspark socket timeout exception after application running for a while".

Answer 2

From this Stack Overflow question's answer, which worked for me:

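Judging by the output, if your application is not finishing with a FAILED status, this sounds like a Livy timeout error: your application is probably running longer than the timeout defined for a Livy session (which defaults to 1h). So even though the Spark application succeeds, your notebook will still receive this error whenever the application takes longer than the Livy session's timeout.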

If that is the case, here is how to address it:

1. Edit the /etc/livy/conf/livy.conf file (on the cluster's master node).
2. Set livy.server.session.timeout to a higher value, such as 8h (or larger, depending on your app).
3. Restart Livy so the setting takes effect: run sudo restart livy-server on the cluster's master node.
4. Test your code again.
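
Steps 1 and 2 can also be scripted. Below is a minimal Python sketch, assuming livy.conf uses plain `key = value` lines with `#` for comments; the helper name set_conf_value and the sample text are hypothetical and only for illustration. It replaces an existing entry (commented out or not) or appends one if the key is missing. Writing the result back to /etc/livy/conf/livy.conf usually needs root, and Livy still has to be restarted afterwards.

```python
import re


def set_conf_value(conf_text: str, key: str, value: str) -> str:
    """Return conf_text with `key = value` set.

    Replaces the first existing entry for `key` (even if commented out),
    or appends a new line when the key is absent. Hypothetical helper,
    assuming a simple `key = value` file format.
    """
    # Match a whole line holding the key, optionally commented out with '#'.
    pattern = re.compile(
        r"^[ \t]*#?[ \t]*" + re.escape(key) + r"\s*=.*$", re.MULTILINE
    )
    new_line = f"{key} = {value}"
    if pattern.search(conf_text):
        # A callable replacement avoids backslash-escape surprises in `value`.
        return pattern.sub(lambda _match: new_line, conf_text, count=1)
    # Key not present: append it, making sure it starts on its own line.
    sep = "" if conf_text.endswith("\n") or not conf_text else "\n"
    return conf_text + sep + new_line + "\n"


# Sample content (an assumption, not copied from a real cluster).
sample = "# livy.server.session.timeout = 1h\nlivy.spark.master = yarn\n"
updated = set_conf_value(sample, "livy.server.session.timeout", "8h")
```

Here `updated` holds the new file content with the timeout raised to 8h. To apply it, read the real file, pass its text through the helper, and write the result back before restarting Livy.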
