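I have a MatrixFactorizationModel object. If I recommend products to a single user right after building the model with ALS.train(...), it takes 300 ms (for my data and hardware). But if I save the model to disk and load it back, the same recommendation takes about 2000 ms. Spark also logs these warnings: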
15/07/17 11:05:47 WARN MatrixFactorizationModel: User factor does not have a partitioner. Prediction on individual records could be slow.
15/07/17 11:05:47 WARN MatrixFactorizationModel: User factor is not cached. Prediction could be slow.
15/07/17 11:05:47 WARN MatrixFactorizationModel: Product factor does not have a partitioner. Prediction on individual records could be slow.
15/07/17 11:05:47 WARN MatrixFactorizationModel: Product factor is not cached. Prediction could be slow.
How can I create/set a partitioner and cache the user and product factors after loading the model? The following approach did not help:
model.userFeatures().cache();
model.productFeatures().cache();
I also tried repartitioning these RDDs and creating a new model from the repartitioned versions, but that did not help either.
Answer 0 (score: 2)
In Scala you don't need the parentheses: userFeatures is an RDD of (Int, Array[Double]) and takes no arguments.
This should help:
model.userFeatures.cache
model.productFeatures.cache
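If caching alone does not bring the latency back down, one thing worth trying is to rebuild the model from factor RDDs that carry an explicit partitioner and are already materialized in the cache. The sketch below is only an illustration under a few assumptions: loadForServing is a hypothetical helper name (not part of Spark), the partition count taken from sc.defaultParallelism is a guess you would tune, and it relies on the MatrixFactorizationModel constructor being public, as it is in the Spark 1.x MLlib API. Note that repartition() does not attach a partitioner to the resulting RDD while partitionBy() does, which may be why the repartitioning attempt in the question had no effect, and that cache() is lazy, so nothing is stored until an action runs.

```scala
import org.apache.spark.{HashPartitioner, SparkContext}
import org.apache.spark.mllib.recommendation.MatrixFactorizationModel

// Hypothetical helper (not part of Spark): load a saved model and rebuild it
// from factor RDDs that have a partitioner and are cached.
def loadForServing(sc: SparkContext, path: String): MatrixFactorizationModel = {
  val loaded = MatrixFactorizationModel.load(sc, path)

  // Assumption: defaultParallelism is a reasonable partition count; tune it for your data.
  val partitioner = new HashPartitioner(sc.defaultParallelism)

  // partitionBy attaches the partitioner to each pair RDD;
  // cache() marks the result for in-memory storage.
  val userFeatures = loaded.userFeatures.partitionBy(partitioner).cache()
  val productFeatures = loaded.productFeatures.partitionBy(partitioner).cache()

  // cache() is lazy, so run an action to materialize both RDDs now
  // instead of paying for it on the first recommendation.
  userFeatures.count()
  productFeatures.count()

  // Build a new model on top of the partitioned, cached factors.
  new MatrixFactorizationModel(loaded.rank, userFeatures, productFeatures)
}
```

You would then call something like val model = loadForServing(sc, path) and use model.recommendProducts(userId, 10) as before. The warnings from the initial load will most likely still appear once, since they are logged when the loaded model is constructed, but the rebuilt model should not trigger them again.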