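Understanding alternating least squares for collaborative filtering

Time: 2016-04-04 14:27:56

Tags: python recommendation-engine least-squares collaborative-filtering

I've been playing around with recommendation engines for the past few days and came across this very nice tutorial, which demonstrates using alternating least squares in a collaborative filter: http://bugra.github.io/work/notes/2014-04-19/alternating-least-squares-method-for-collaborative-filtering/

I managed to follow along until the last step. This is the part where the author writes the code that prints the recommendations. The snippet is as follows: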

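import numpy as np

# W, Q, Q_hat, movie_titles and m (the number of rows of Q, i.e. users)
# are globals defined earlier in the tutorial.
def print_recommendations(W=W, Q=Q, Q_hat=Q_hat, movie_titles=movie_titles):
    # Rescale the predicted ratings in place so they span 0 to 5.
    Q_hat -= np.min(Q_hat)
    Q_hat *= float(5) / np.max(Q_hat)
    # Subtracting 5 * W keeps already-rated movies from winning the argmax.
    movie_ids = np.argmax(Q_hat - 5 * W, axis=1)
    for jj, movie_id in zip(range(m), movie_ids):
        print('User {} liked {}\n'.format(jj + 1, ', '.join([movie_titles[ii] for ii, qq in enumerate(Q[jj]) if qq > 3])))
        print('\n User {} recommended movie is {} - with predicted rating: {}'.format(jj + 1, movie_titles[movie_id], Q_hat[jj, movie_id]))
        print('\n' + 100 * '-' + '\n')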

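In this snippet, W is the weight matrix and Q is the ratings matrix (the code compares its entries against 3). W is used to formalize the notion of confidence as measured by the ratings. Hence:

W = 1 if user u rated item i

W = 0 if user u did not rate item i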

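Q_hat is the new matrix obtained after running the ALS algorithm for the specified number of iterations.

I can't understand why the author implemented these two steps in particular: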

Q_hat -= np.min(Q_hat)
Q_hat *= float(5) / np.max(Q_hat)

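Could someone guide me and help me understand this? I'd really appreciate it.

Thanks

Edit: here is the gist link to the original function: https://gist.github.com/arjun180/71124392b0b70f7b96a8826b59400b99

1 answer:

Answer 0: (score: 1)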

This is a normalization of the predicted ratings.

Q_hat -= np.min(Q_hat)

Here the author is subtracting the smallest value in the predicted ratings matrix from every predicted value.

This guarantees that the lowest predicted rating is exactly 0 and that no prediction is negative.

Q_hat *= float(5) / np.max(Q_hat)

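Here the author is rescaling the shifted predictions so that they range from 0 up to 5: dividing by the new maximum maps the values into [0, 1], and multiplying by 5 stretches them onto the 0 to 5 rating scale.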
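
To see the two steps at work, here is a minimal sketch on a made-up 2x2 matrix. The helper name `rescale_to_range`, its `upper` parameter, and the numbers are assumptions chosen for illustration; they are not part of the tutorial or its data.

```python
import numpy as np

def rescale_to_range(Q_hat, upper=5.0):
    """Return a copy of Q_hat shifted and scaled to lie in [0, upper].

    Hypothetical helper for illustration; it mirrors the two lines
    from the question but does not modify its input.
    """
    shifted = Q_hat - np.min(Q_hat)              # smallest entry becomes 0
    return shifted * (upper / np.max(shifted))   # largest entry becomes upper

# Made-up predictions: ALS output is not confined to the rating scale,
# so values below 0 or above 5 can occur.
Q_hat = np.array([[-0.5, 1.0],
                  [ 2.5, 5.5]])

scaled = rescale_to_range(Q_hat)
print(scaled)  # entries are approximately 0, 1.25, 2.5 and 5
```

Note that the original function uses the in-place operators `-=` and `*=`, so it changes the `Q_hat` array that is passed in, whereas this sketch returns a new array. Either way, if every prediction were identical the maximum after shifting would be 0 and the division would fail, so the trick assumes the predictions are not all equal.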