Spark exception on Windows: java.io.IOException: fail to rename shuffle file

Date: 2018-02-20 20:04:20

Tags: apache-spark apache-spark-sql spark-dataframe spark-streaming

I ran into a problem while trying to run some Spark Streaming code. I am trying to read data from a Kafka topic and push the processed data to Elasticsearch. I am running this code from Eclipse on Windows, with Kafka, Spark, ZooKeeper, and Elasticsearch all configured. I get the following error:

    18/02/20 14:52:11 ERROR Executor: Exception in task 0.0 in stage 6.0 (TID 5)
    java.io.IOException: fail to rename file C:\Users\shash\AppData\Local\Temp\blockmgr-cb45497b-7f85-4158-815b-852edecbb2c5\0f\shuffle_1_0_0.index.ca3b55d2-6c26-4798-a17b-21a42f099126 to C:\Users\shash\AppData\Local\Temp\blockmgr-cb45497b-7f85-4158-815b-852edecbb2c5\0f\shuffle_1_0_0.index
        at org.apache.spark.shuffle.IndexShuffleBlockResolver.writeIndexFileAndCommit(IndexShuffleBlockResolver.scala:178)
        at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:72)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
        at org.apache.spark.scheduler.Task.run(Task.scala:85)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    18/02/20 14:52:11 WARN TaskSetManager: Lost task 0.0 in stage 6.0 (TID 5, localhost): java.io.IOException: fail to rename file C:\Users\shash\AppData\Local\Temp\blockmgr-cb45497b-7f85-4158-815b-852edecbb2c5\0f\shuffle_1_0_0.index.ca3b55d2-6c26-4798-a17b-21a42f099126 to C:\Users\shash\AppData\Local\Temp\blockmgr-cb45497b-7f85-4158-815b-852edecbb2c5\0f\shuffle_1_0_0.index
        at org.apache.spark.shuffle.IndexShuffleBlockResolver.writeIndexFileAndCommit(IndexShuffleBlockResolver.scala:178)
        at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:72)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
        at org.apache.spark.scheduler.Task.run(Task.scala:85)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Can someone give me some input on how to resolve this issue?

1 answer:

Answer 0 (score: 3)

I solved this by setting spark.local.dir to a different path on which you have rename permissions.

SparkConf conf = new SparkConf().set("spark.local.dir", "another path");

I'm new to Spark; I hope this works for you.
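For reference, spark.local.dir can also be set outside the application code, either in spark-defaults.conf or per job via the --conf flag of spark-submit. A sketch (the D:/spark-temp path and the app/jar names are placeholders; use any directory the Spark process can create and rename files in):

```
# spark-defaults.conf -- placeholder path, substitute your own writable directory
spark.local.dir    D:/spark-temp

# or per job on the command line (class and jar names are hypothetical)
spark-submit --conf spark.local.dir=D:/spark-temp --class com.example.StreamingApp app.jar
```

Setting it outside the code is handy because spark.local.dir configured in SparkConf only takes effect in modes where the driver controls the executors' scratch space (e.g. local mode); in cluster deployments the cluster manager's settings may override it.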