Spark memory leak when using MatrixFactorizationModel.recommendProductsForUsers(num: Int)

Time: 2016-11-09 07:02:58

Tags: apache-spark memory-leaks matrix-factorization

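I am building user recommendations based on collaborative filtering, using org.apache.spark.mllib.recommendation._. When I run the job on roughly 7 GB of data, it fails with a Spark memory leak. On small datasets, however, I do get results.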

ALS model configuration:

  • rank 50
  • iterations 15
  • lambda 0.25
  • alpha 300
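
For reference, these settings might map onto the MLlib API roughly as in the sketch below. It assumes implicit-feedback training through ALS.trainImplicit, since alpha only applies to the implicit variant; the object name AlsTraining, the helper trainModel, and the ratings input are hypothetical and not taken from the original job.

```scala
import org.apache.spark.mllib.recommendation.{ALS, MatrixFactorizationModel, Rating}
import org.apache.spark.rdd.RDD

object AlsTraining {
  // Hypothetical helper: `ratings` stands in for the roughly 7 GB input,
  // already parsed into Rating(user, product, rating) objects.
  def trainModel(ratings: RDD[Rating]): MatrixFactorizationModel = {
    val rank = 50        // number of latent factors
    val iterations = 15  // number of ALS iterations
    val lambda = 0.25    // regularization parameter
    val alpha = 300.0    // confidence scaling, used only for implicit feedback

    // Assumption: the job trains on implicit feedback; with explicit ratings
    // ALS.train would be used instead and alpha would not apply.
    ALS.trainImplicit(ratings, rank, iterations, lambda, alpha)
  }
}
```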

Spark cluster details:

  • nodes 9
  • memory per node 64 GB
  • cores per node 4

Run configuration:

--num-executors 16 --executor-cores 2 --executor-memory 32G
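
The arithmetic behind this request can be checked with a small sketch. It assumes YARN adds a per-executor overhead of max(384 MB, 10% of executor memory), which is the documented default for spark.yarn.executor.memoryOverhead in Spark 1.6 and 2.x, and that YARN may allocate all 64 GB of each node. Neither value is stated in the question, so the result is an upper bound, and the object name ExecutorFit is hypothetical.

```scala
object ExecutorFit {
  // Values taken from the question.
  val nodes = 9
  val nodeMemoryMb = 64 * 1024
  val requestedExecutors = 16
  val executorMemoryMb = 32 * 1024

  // Assumption: default overhead of max(384 MB, 10% of executor memory).
  val overheadMb: Int = math.max(384, (executorMemoryMb * 0.10).toInt)

  // Size of one YARN container, before rounding to the scheduler's increment.
  val containerMb: Int = executorMemoryMb + overheadMb

  // Assumption: all 64 GB of a node are available to YARN (usually it is less).
  val executorsPerNode: Int = nodeMemoryMb / containerMb
  val maxExecutors: Int = executorsPerNode * nodes

  def main(args: Array[String]): Unit = {
    println(s"container size: $containerMb MB")
    println(s"executors per node: $executorsPerNode")
    println(s"max executors: $maxExecutors of $requestedExecutors requested")
  }
}
```

Under these assumptions only one 32G executor fits on a 64 GB node, so at most 9 of the 16 requested executors could be granted.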

Code:

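import org.apache.spark.rdd.RDD
import org.apache.spark.sql.Row

def generateTopKProductRecommendations(topK: Int = 20): RDD[Row] = {
  // Top-K products for every user; keep only the user id and the product ids.
  model.recommendProductsForUsers(topK)
    .map { r => Row(r._1, r._2.map(x => x.product).toSeq) }
}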

I get the same failure even when I reduce the code to this for debugging:

model.recommendProductsForUsers(topK)
  .map { r => Row(r._1) }

YARN log:

16/11/08 06:33:48 ERROR executor.Executor: Managed memory leak detected; size = 67108864 bytes, TID = 8796
16/11/08 06:33:48 ERROR executor.Executor: Managed memory leak detected; size = 67108864 bytes, TID = 8791
16/11/08 06:33:48 ERROR storage.ShuffleBlockFetcherIterator: Failed to get block(s) from xxx.xxx.xxx.xxx:42340
java.io.IOException: Failed to connect to xxx.xxx.xxx.xxx:42340
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:193)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:156)......

16/11/08 06:33:48 ERROR storage.ShuffleBlockFetcherIterator: Failed to get block(s) from xxx.xxx.xxx.xxx:42340
java.io.IOException: Failed to connect to xxx.xxx.xxx.xxx:42340
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:193)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:156)
    at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:88)

0 answers:

No answers