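I need to incorporate timestamps (from the MovieLens dataset) into the Slope One algorithm to make it more accurate.
I think I should modify MemoryDiffStorage.java in mahout-core / ... cf / taste / impl / recommender / slopeone, but I don't know how to add the timestamps.
I did find the definition Long getPreferenceTime(long userID, long itemID) in DataModel.java,
but I still don't understand how to use it.
I should mention that I am new to both Java and Mahout. Please explain in detail! Thanks~ :)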
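To make clear what I am trying to compute, here is a minimal standalone sketch of the weight. The class name TimeWeightSketch and the helper timeWeight are hypothetical names of my own, not part of Mahout. The two constants are the smallest and largest timestamps I use in my modified code, and I am assuming that a newer rating should count more than an older one.

```java
public class TimeWeightSketch {

    // Hypothetical helper (not a Mahout API): maps a timestamp to a weight in [0, 1],
    // where the oldest rating gets 0 and the newest rating gets 1.
    static double timeWeight(long timestamp, long minTimestamp, long maxTimestamp) {
        if (maxTimestamp <= minTimestamp) {
            return 1.0; // degenerate range: fall back to no weighting
        }
        // Cast to double before dividing; long / long would truncate to 0 or 1.
        double w = (double) (timestamp - minTimestamp) / (maxTimestamp - minTimestamp);
        // Clamp in case a timestamp falls outside the assumed range.
        return Math.max(0.0, Math.min(1.0, w));
    }

    public static void main(String[] args) {
        long min = 975042787L;   // assumed oldest timestamp in my data
        long max = 1046388675L;  // assumed newest timestamp in my data
        long mid = min + (max - min) / 2;
        System.out.println(timeWeight(min, min, max));
        System.out.println(timeWeight(mid, min, max));
        System.out.println(timeWeight(max, min, max));
    }
}
```

Note that a weight of 0 wipes out the oldest ratings entirely, so I may need a floor on the weight; I am not sure which is right.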
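Modified code added below :) The original code is in MemoryDiffStorage.java; I only inserted a few statements. I am not sure whether I inserted them in the wrong place, but I think this is where it reads the preference values.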
private long processOneUser(long averageCount, long userID) throws TasteException {
    log.debug("Processing prefs for user {}", userID);
    // Save off prefs for the life of this loop iteration
    PreferenceArray userPreferences = dataModel.getPreferencesFromUser(userID);
    int length = userPreferences.length();
    for (int i = 0; i < length - 1; i++) {
        float prefAValue = userPreferences.getValue(i);
        long itemIDA = userPreferences.getItemID(i);
        // Inserted: scale the preference by a timestamp weight normalized to [0, 1]
        long timestamp = dataModel.getPreferenceTime(userID, itemIDA);
        long timestampMax = 1046388675L;
        long timestampMin = 975042787L;
        // Divide as double; long / long would truncate the weight to 0 or 1
        double t = (double) (timestamp - timestampMin) / (timestampMax - timestampMin);
        prefAValue = (float) (prefAValue * t);
        FastByIDMap<RunningAverage> aMap = averageDiffs.get(itemIDA);
        if (aMap == null) {
            aMap = new FastByIDMap<RunningAverage>();
            averageDiffs.put(itemIDA, aMap);
        }
        for (int j = i + 1; j < length; j++) {
            // This is a performance-critical block
            long itemIDB = userPreferences.getItemID(j);
            RunningAverage average = aMap.get(itemIDB);
            if (average == null && averageCount < maxEntries) {
                average = buildRunningAverage();
                aMap.put(itemIDB, average);
                averageCount++;
            }
            if (average != null) {
                average.addDatum(userPreferences.getValue(j) - prefAValue);
            }
        }