Matrix factorization based recommendation using Tensorflow

Time: 2017-11-07 21:12:40

Tags: python tensorflow deep-learning recommendation-engine matrix-factorization

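I am new to TensorFlow and am exploring recommendation systems built with it. I have looked at a few sample projects on GitHub, and most of them look much like the following: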

https://github.com/songgc/TF-recomm/blob/master/svd_train_val.py

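My question is: how do I pick the top recommendations for a user U1 with the code above?

If there is any sample code or approach for this, please share. Thanks.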

1 Answer:

Answer 0: (score: 0)

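This is a little tricky. When svd returns, the session has already been closed, so the variables lose their trained values (you still keep the graph). There are a few options: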

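  1. Save the model to a file and restore it later;
  2. Don't put the session in a with tf.Session() as sess: .... block; return the session instead;
  3. Do the per-user processing inside the with ... block.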

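Option 3 is the worst choice, because training a model and using it should be separate steps. The best approach is to save the model and its weights somewhere and restore the session later. Even after restoring it, though, you still need to know how to use the session object to get recommendations. To demonstrate that part, I will solve the problem with option 3 here, assuming you know how to restore a session.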

    def svd(train, test):
        samples_per_batch = len(train) // BATCH_SIZE
    
        iter_train = dataio.ShuffleIterator([train["user"],
                                         train["item"],
                                         train["rate"]],
                                        batch_size=BATCH_SIZE)
    
        iter_test = dataio.OneEpochIterator([test["user"],
                                         test["item"],
                                         test["rate"]],
                                        batch_size=-1)
    
        user_batch = tf.placeholder(tf.int32, shape=[None], name="id_user")
        item_batch = tf.placeholder(tf.int32, shape=[None], name="id_item")
        rate_batch = tf.placeholder(tf.float32, shape=[None])
    
        infer, regularizer = ops.inference_svd(user_batch, item_batch, user_num=USER_NUM, item_num=ITEM_NUM, dim=DIM,
                                           device=DEVICE)
        global_step = tf.contrib.framework.get_or_create_global_step()
        _, train_op = ops.optimization(infer, regularizer, rate_batch, learning_rate=0.001, reg=0.05, device=DEVICE)
    
        init_op = tf.global_variables_initializer()
        with tf.Session() as sess:
            sess.run(init_op)
            summary_writer = tf.summary.FileWriter(logdir="/tmp/svd/log", graph=sess.graph)
            print("{} {} {} {}".format("epoch", "train_error", "val_error", "elapsed_time"))
            errors = deque(maxlen=samples_per_batch)
            start = time.time()
            for i in range(EPOCH_MAX * samples_per_batch):
                users, items, rates = next(iter_train)
                _, pred_batch = sess.run([train_op, infer], feed_dict={user_batch: users, item_batch: items, rate_batch: rates})
                pred_batch = clip(pred_batch)
                errors.append(np.power(pred_batch - rates, 2))
                if i % samples_per_batch == 0:
                    train_err = np.sqrt(np.mean(errors))
                    test_err2 = np.array([])
                    for users, items, rates in iter_test:
                        pred_batch = sess.run(infer, feed_dict={user_batch: users,item_batch: items})
                        pred_batch = clip(pred_batch)
                        test_err2 = np.append(test_err2, np.power(pred_batch - rates, 2))
                    end = time.time()
                    test_err = np.sqrt(np.mean(test_err2))
                    print("{:3d} {:f} {:f} {:f}(s)".format(i // samples_per_batch, train_err, test_err, end - start))
                    train_err_summary = make_scalar_summary("training_error", train_err)
                    test_err_summary = make_scalar_summary("test_error", test_err)
                    summary_writer.add_summary(train_err_summary, i)
                    summary_writer.add_summary(test_err_summary, i)
                    start = end
    
            # Score every item in the set for user #1. The single user id is
            # broadcast against all ITEM_NUM item ids.
            userNumber = 1
            user_prediction = sess.run(infer, feed_dict={user_batch: np.array([userNumber]), item_batch: np.arange(ITEM_NUM)})
            # The index number is the same as the item number. argsort orders from
            # lowest (least recommended) to largest.
            index_rating_order = np.argsort(user_prediction)
    
            print("Top ten recommended items for user {} are".format(userNumber))
            print(index_rating_order[-10:][::-1])  # take the last ten, then reverse so the best comes first
    
            # If you want to include the score:
            items_to_choose = index_rating_order[-10:][::-1]
            for item, score in zip(items_to_choose, user_prediction[items_to_choose]):
                print("{}:  {}".format(item, score))
    

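The only change I made starts at the first comment line. To repeat: the best practice is to train in this function but run the predictions separately.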
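
To make "predict separately" concrete, here is a minimal sketch that uses plain NumPy rather than a restored TensorFlow session. It assumes you have already pulled the trained parameters out of the session as arrays (for example with sess.run on the variables), and that the model has the usual biased matrix-factorization form: score = global bias + user bias + item bias + dot(user vector, item vector). That form, the function names save_factors and top_n_for_user, and the .npz file layout are my own assumptions for illustration; they do not come from the linked repository.

```python
import os
import tempfile

import numpy as np


def save_factors(path, user_emb, item_emb, user_bias, item_bias, global_bias):
    """Store the learned parameters in one .npz file (hypothetical layout)."""
    np.savez(path, user_emb=user_emb, item_emb=item_emb,
             user_bias=user_bias, item_bias=item_bias,
             global_bias=global_bias)


def top_n_for_user(path, user_id, n=10):
    """Load the saved parameters and return the n best (item, score) pairs."""
    with np.load(path) as params:
        # Assumed model: global bias + user bias + item bias + dot product.
        scores = (params["global_bias"]
                  + params["user_bias"][user_id]
                  + params["item_bias"]
                  + params["item_emb"] @ params["user_emb"][user_id])
    n = min(n, scores.shape[0])
    # argsort is ascending, so reverse it to put the highest score first.
    order = np.argsort(scores)[::-1][:n]
    return [(int(i), float(scores[i])) for i in order]


if __name__ == "__main__":
    # Random stand-ins for trained parameters; the sizes are made up.
    rng = np.random.default_rng(0)
    n_users, n_items, dim = 5, 20, 4
    path = os.path.join(tempfile.mkdtemp(), "svd_factors.npz")
    save_factors(path,
                 user_emb=rng.normal(size=(n_users, dim)),
                 item_emb=rng.normal(size=(n_items, dim)),
                 user_bias=rng.normal(size=n_users),
                 item_bias=rng.normal(size=n_items),
                 global_bias=3.5)
    for item, score in top_n_for_user(path, user_id=1, n=10):
        print("{}:  {}".format(item, score))
```

The training script would call save_factors once at the end, and a separate prediction script would only ever call top_n_for_user, so serving recommendations never needs the training graph or session.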