I'm trying to generate recommendations with Apache Mahout, using MongoDB as the data source through MongoDBDataModel. My code is as follows:
import java.net.UnknownHostException;
import java.util.List;
import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.mongodb.MongoDBDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.ThresholdUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.UserBasedRecommender;
import org.apache.mahout.cf.taste.similarity.ItemSimilarity;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;
import com.mongodb.MongoException;
public class usingMongo {
    public static void main(String[] args) throws UnknownHostException, MongoException,
            TasteException {
        final long startTime = System.nanoTime();
        MongoDBDataModel model = new MongoDBDataModel("AdamsLaptop", 27017,
                "test", "ratings100k", false, false, null);
        System.out.println("connected to mongo ");
        UserSimilarity UserSim = new PearsonCorrelationSimilarity(model);
        UserNeighborhood neighborhood = new ThresholdUserNeighborhood(0.5, UserSim, model);
        UserBasedRecommender UserRecommender = new GenericUserBasedRecommender(model, neighborhood, UserSim);
        List<RecommendedItem> UserRecommendations = UserRecommender.recommend(1, 3);
        for (RecommendedItem recommendation : UserRecommendations) {
            System.out.println("You may like movie " + recommendation.getItemID() + " as a user similar to you also rated it " + recommendation.getValue() + " USER");
        }
        ItemSimilarity ItemSim = new PearsonCorrelationSimilarity(model); // or LogLikelihoodSimilarity(model)
        GenericItemBasedRecommender ItemRecommender = new GenericItemBasedRecommender(model, ItemSim);
        List<RecommendedItem> ItemRecommendations = ItemRecommender.recommend(1, 3);
        for (RecommendedItem recommendation : ItemRecommendations) {
            System.out.println("You may like movie " + recommendation.getItemID() + " as a user similar to you also rated it " + recommendation.getValue() + " ITEM");
        }
        final long duration = System.nanoTime() - startTime;
        System.out.println(duration);
    }
}
I can't see where I'm going wrong. After many changes and a lot of trial and error, the error message stays the same:
Exception in thread "main" java.lang.NullPointerException
at org.apache.mahout.cf.taste.impl.model.mongodb.MongoDBDataModel.getID(MongoDBDataModel.java:743)
at org.apache.mahout.cf.taste.impl.model.mongodb.MongoDBDataModel.buildModel(MongoDBDataModel.java:570)
at org.apache.mahout.cf.taste.impl.model.mongodb.MongoDBDataModel.<init>(MongoDBDataModel.java:245)
at recommender.usingMongo.main(usingMongo.java:24)
Any suggestions? Here is a sample of my data in MongoDB:
{ "_id" : ObjectId("56ddf61f5960960c333f3dcb"),"userId" : 1, "movieId" : 292, "rating" : 4, "timestamp" : 847116936 }
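Answer 0 (score: 0)
I successfully integrated MongoDB data with Mahout.
The structure of the data in MongoDB depends on the type of similarity algorithm you use. For example,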
UserSimilarity
MongoDBDataModel datamodel = new MongoDBDataModel("127.0.0.1", 27017, "testing", "rating", true, true, null);
where user_id and item_id are integer values, preference is a float value, and created_at is a timestamp.
SVDRecommender
user_id and item_id are MongoDB objects, preference is a float value, and created_at is a timestamp.
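The obvious first check is whether the MongoDB server is running. Judging by the exception, it is, so I think the problem lies in your data structure.
Use user_id instead of userId, item_id instead of itemId, and preference instead of rating. I don't know whether this will make a difference. I followed one of the tutorials online, but I can't find it at the moment.
It worked when I had more than 10,000 users with 1,000 items, but it was too slow.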
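The renaming suggested above can be sketched in plain Java. This is only an illustration under assumptions: the document is modelled as a Map rather than a real MongoDB object, the class and method names (RenameRatingFields, toMahoutFieldNames) are made up for this example, and mapping movieId to item_id and converting the rating to a float are guesses based on the sample document in the question. The timestamp field is left unchanged.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RenameRatingFields {
    // Hypothetical helper: returns a copy of the document with
    // userId/movieId/rating renamed to user_id/item_id/preference.
    // All other fields are copied unchanged.
    public static Map<String, Object> toMahoutFieldNames(Map<String, Object> doc) {
        Map<String, Object> renamed = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            switch (e.getKey()) {
                case "userId":
                    renamed.put("user_id", e.getValue());
                    break;
                case "movieId":
                    renamed.put("item_id", e.getValue());
                    break;
                case "rating":
                    // Assumption: the preference should be stored as a float.
                    renamed.put("preference", ((Number) e.getValue()).floatValue());
                    break;
                default:
                    renamed.put(e.getKey(), e.getValue());
            }
        }
        return renamed;
    }

    public static void main(String[] args) {
        // Same values as the sample document in the question (without _id).
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("userId", 1);
        doc.put("movieId", 292);
        doc.put("rating", 4);
        doc.put("timestamp", 847116936);
        // prints {user_id=1, item_id=292, preference=4.0, timestamp=847116936}
        System.out.println(toMahoutFieldNames(doc));
    }
}
```

In practice the same renaming would be applied to the collection itself, for example once during import, so that the stored field names match what the data model reads.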
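Answer 1 (score: 0)
I think the problem is that Mahout assumes default names for the MongoDB fields that hold the user ID, item ID, and preference: user_id, item_id, and preference. So the solution may be either to use another MongoDBDataModel constructor, one that lets you pass the names of these fields in your MongoDB collection as parameters, or to redesign the collection schema.
I hope that makes sense.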
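To illustrate why a field-name mismatch could surface as a NullPointerException, here is a minimal sketch. It is a stand-in, not Mahout's code: the class FieldNameMismatch and the method describe are hypothetical, and the document is modelled as a plain Map. It only shows that looking up a key the document does not contain yields null, and that using that null fails, while passing the actual field names works.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FieldNameMismatch {
    // Hypothetical reader, NOT Mahout's implementation: reads three fields by name.
    // Map.get returns null for a missing key, so toString() throws NullPointerException.
    public static String describe(Map<String, Object> doc,
                                  String userField, String itemField, String prefField) {
        String user = doc.get(userField).toString();
        String item = doc.get(itemField).toString();
        String pref = doc.get(prefField).toString();
        return user + "," + item + "," + pref;
    }

    // Same values as the sample document in the question (without _id).
    public static Map<String, Object> sampleDoc() {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("userId", 1);
        doc.put("movieId", 292);
        doc.put("rating", 4);
        doc.put("timestamp", 847116936);
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = sampleDoc();
        try {
            // The default names described in this answer do not exist in the sample document.
            describe(doc, "user_id", "item_id", "preference");
        } catch (NullPointerException e) {
            System.out.println("default field names: NullPointerException");
        }
        // Passing the names the collection actually uses succeeds.
        // prints "custom field names: 1,292,4"
        System.out.println("custom field names: " + describe(doc, "userId", "movieId", "rating"));
    }
}
```

The constructor mentioned above plays the role of the second call here. Its exact parameter list should be checked against the Mahout Javadoc for the version in use.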