org.apache.spark.SparkException: Job aborted due to stage failure in Databricks

Date: 2020-11-02 11:18:23

Tags: pyspark databricks

Apologies for asking the same type of question. I have seen many posts on SO about stage failures, but none of them solve my problem, so I am posting again. I am running on Databricks, Runtime 7.3 LTS. I have a Spark dataframe df2, and I am running the command

df2.show()

I receive the error message below. Could you please help me resolve the issue?

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 231.0 failed 4 times, most recent failure: Lost task 0.3 in stage 231.0 (TID 6106, 10.52.98.16, executor 0): com.databricks.sql.io.FileReadException: Error while reading file dbfs:/user/hive/warehouse/p_suggestedpricefornegotiation/part-00000-e64f3491-8afe-44a9-a55d-3495bc7a1395-c000.snappy.parquet. A file referenced in the transaction log cannot be found. This occurs when data has been manually deleted from the file system rather than using the table `DELETE` statement. For more information, see https://docs.microsoft.com/azure/databricks/delta/delta-intro#frequently-asked-questions
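
The documentation linked in the error describes this as a Delta transaction log that still references Parquet files which were deleted directly from storage. A minimal recovery sketch, assuming the underlying Delta table is named p_suggestedpricefornegotiation (inferred from the file path in the error; adjust the name and database to your environment) and that the missing files really were removed manually:

# Run in a Databricks notebook (PySpark); `spark` is the session Databricks provides.

# 1) Drop any cached metadata so Spark re-reads the current transaction log.
spark.sql("REFRESH TABLE p_suggestedpricefornegotiation")

# 2) Preview which log entries point at files that no longer exist on storage.
spark.sql("FSCK REPAIR TABLE p_suggestedpricefornegotiation DRY RUN").show(truncate=False)

# 3) Remove those dangling entries from the Delta log; the data behind them is
#    already gone, so subsequent reads such as df2.show() stop failing on them.
spark.sql("FSCK REPAIR TABLE p_suggestedpricefornegotiation")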

0 Answers:

No answers