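I am trying to run the Mahout framework and use the Tanimoto coefficient on a set of items. It runs, but it returns a value of 1.0 for every recommended item. The code is as follows: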
import java.io.File;
import java.util.List;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.TanimotoCoefficientSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;
public class TanimotoExample {
    public static void main(String[] args) throws Exception {
        DataModel model = new FileDataModel(new File("stack.csv")); // load the data needed for the computation
        UserSimilarity similarity = new TanimotoCoefficientSimilarity(model); // user-user similarity via the Tanimoto coefficient
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model); // the 2 users most similar to the given user
        Recommender recommender = new GenericUserBasedRecommender(model, neighborhood, similarity); // create the recommendation engine
        List<RecommendedItem> recommendations = recommender.recommend(3, 5); // up to 5 items for the user with ID 3
        for (RecommendedItem recommendation : recommendations) {
            System.out.println(recommendation);
        }
    }
}
The output is as follows:
[main] INFO org.apache.mahout.cf.taste.impl.model.file.FileDataModel - Creating FileDataModel for file stack.csv
[main] INFO org.apache.mahout.cf.taste.impl.model.file.FileDataModel - Reading file info...
[main] INFO org.apache.mahout.cf.taste.impl.model.file.FileDataModel - Read lines: 696
RecommendedItem[item:589, value:1.0]
RecommendedItem[item:380, value:1.0]
RecommendedItem[item:2916, value:1.0]
RecommendedItem[item:3107, value:1.0]
RecommendedItem[item:2028, value:1.0]
Part of my data file is as follows:
1 3408
1 595
1 2398
1 2918
1 2791
1 2687
1 3105
.
.
.
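As far as I know, the Tanimoto coefficient normally lies between 0 and 1.0, but here only 1.0 is shown, which I did not think was possible. Does anyone have an idea how to solve this? Is there a threshold I can change?
Any help is greatly appreciated.
Thanks in advance.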
Answer 0 (score: 1)
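The Tanimoto coefficient, also known as the Jaccard coefficient, ignores preference values entirely. It only considers whether a user has expressed a preference for an item (in other words, simply liked it), and nothing more. How is it computed? The value is the number of items for which both users have expressed a preference, divided by the number of items for which at least one of the two users has expressed a preference.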
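To make the calculation concrete, here is a minimal sketch in plain Java that does not use Mahout at all. The class name TanimotoSketch and the item IDs for the second user are made up for illustration; the first user's IDs are taken from the data sample in the question.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class TanimotoSketch {
    // Tanimoto (Jaccard) coefficient: |A intersect B| / |A union B| over sets of item IDs.
    public static double tanimoto(Set<Long> a, Set<Long> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return Double.NaN; // undefined when neither user has any items
        }
        Set<Long> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    public static void main(String[] args) {
        // Items of user 1 from the question's data file (first four rows).
        Set<Long> userA = new HashSet<>(Arrays.asList(3408L, 595L, 2398L, 2918L));
        // Hypothetical second user, invented for this example.
        Set<Long> userB = new HashSet<>(Arrays.asList(595L, 2918L, 2791L, 2687L));
        // 2 shared items out of 6 distinct items, so roughly 0.33.
        System.out.println(tanimoto(userA, userB));
    }
}
```

Two users get 1.0 only when they have expressed a preference for exactly the same set of items, and 0.0 when they share none.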
Read more about the Jaccard coefficient here: http://en.wikipedia.org/wiki/Jaccard_index
For details on Mahout's implementation, see the reference docs.
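As for why every recommendation shows 1.0: the value printed in RecommendedItem looks like an estimated preference, not the similarity itself. The sketch below is my own illustration, not Mahout's code, and it assumes the recommender estimates a preference as a similarity-weighted average of the neighbours' preference values. Since the data file has no preference column, every stored preference is effectively 1.0, and a weighted average of values that are all 1.0 comes out as 1.0 whatever the similarities are. The class name and the numbers are hypothetical.

```java
public class WeightedEstimateSketch {
    // Similarity-weighted average of the neighbours' preference values.
    public static double estimate(double[] similarities, double[] preferences) {
        double weightedSum = 0.0;
        double totalWeight = 0.0;
        for (int i = 0; i < similarities.length; i++) {
            weightedSum += similarities[i] * preferences[i];
            totalWeight += similarities[i];
        }
        return totalWeight == 0.0 ? Double.NaN : weightedSum / totalWeight;
    }

    public static void main(String[] args) {
        double[] similarities = {0.2, 0.7}; // hypothetical Tanimoto similarities of two neighbours
        double[] booleanPrefs = {1.0, 1.0}; // no preference column: every preference counts as 1.0
        System.out.println(estimate(similarities, booleanPrefs)); // prints 1.0
    }
}
```

If that assumption holds, changing a threshold will not help. Mahout also ships a GenericBooleanPrefUserBasedRecommender aimed at data without preference values, which may be worth trying in place of GenericUserBasedRecommender.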