Spark LDA - local disk on the workers

Time: 2015-07-13 08:44:56

Tags: scala apache-spark lda

When I run the LDA model from a Scala script with a large number of iterations, it terminates with a "No space left on device" error.

I check the amount of free space with the following script:

import scala.sys.process._  // needed for the .!! shell-out syntax

// Sample (up to) 101 tasks so every worker node is likely hit at least once.
val perNodeSpaceInGB = sc.parallelize(0 to 100).map { _ =>
  val hostname = ("hostname".!!).trim
  // Field 9 of the whitespace-split `df` output is the "Available" column, in KB.
  val spaceInGB = ("df /local_disk".!!).split(" +")(9).toInt / 1024 / 1024
  //System.gc()
  (hostname, spaceInGB)
}.collect.distinct
println(f"There are ${perNodeSpaceInGB.size} nodes in this cluster. Per node free space (in GB):\n--------------------------------------")
perNodeSpaceInGB.foreach { case (a, b) => println(f"$a\t\t$b%2.2f") }
val totalSpaceInGB = perNodeSpaceInGB.map(_._2).sum
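
The commented-out System.gc() above points at one suspicion: Spark's ContextCleaner deletes shuffle and broadcast files on the workers only after the corresponding driver-side objects have been garbage-collected, so during a long iterative job the files can accumulate faster than the JVM gets around to collecting. A minimal sketch of that workaround, forcing a periodic driver-side GC (the five-minute period is an arbitrary choice, not a guaranteed fix):

import java.util.concurrent.{Executors, TimeUnit}

// Periodically force a GC on the driver so ContextCleaner can notice
// unreferenced RDDs/shuffles and delete their files on the workers.
val gcScheduler = Executors.newSingleThreadScheduledExecutor()
gcScheduler.scheduleAtFixedRate(new Runnable {
  override def run(): Unit = System.gc()
}, 5, 5, TimeUnit.MINUTES)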

I can watch the amount of free space shrink steadily until it hits zero and the job dies. It looks as if some temporary files are not being deleted in time. Checkpointing is set to every 10 iterations.
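
For reference, the checkpointing is configured roughly as below (a sketch assuming the MLlib LDA API; the checkpoint path and the other parameters are hypothetical). Note that MLlib ignores setCheckpointInterval unless a checkpoint directory has been set on the SparkContext, in which case the full shuffle lineage stays on the workers' local disks:

import org.apache.spark.mllib.clustering.LDA

sc.setCheckpointDir("/checkpoints")  // hypothetical path; should be a fault-tolerant store such as HDFS

val lda = new LDA()
  .setK(100)                  // hypothetical topic count
  .setMaxIterations(500)      // many iterations is where the disk fills up
  .setCheckpointInterval(10)  // the "every 10 iterations" mentioned above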

Any hints?

Error:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 885.0 failed 4 times, most recent failure: Lost task 2.3 in stage 885.0 (TID 586, 10.0.239.157): java.io.IOException: No space left on device

0 answers:

No answers yet