For an assignment, I need to create a movie recommendation system that uses the provided loss function:
sum(i=1,M) sum(j=1,M) indicator[i≠j] (viᵀ·vj − Xi,j)²
This means the dot product of two movie embeddings vi and vj should be close to Xi,j, where Xi,j is the total number of users who liked both movie i and movie j. The indicator function zeroes out the entries where i == j.
The deliverable is the weight matrix from the hidden layer. Its dimensions should be 9724×300: 9724 unique movie IDs and 300 neurons. The 300 is an arbitrary choice, influenced by the use of 300 neurons in Google's word2vec.
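As a quick sanity check of what Xi,j represents, here is a hedged toy example (the `likes` matrix is made up for illustration): with a binary users × movies "like" matrix, the matrix product of its transpose with itself counts, for each movie pair, how many users liked both movies.

```python
import torch

# Hypothetical toy data: a binary users x movies "like" matrix
# (3 users, 4 movies). 1 means the user liked the movie.
likes = torch.tensor([
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 1, 1],
])

# X[i, j] = number of users who liked both movie i and movie j
X = likes.T @ likes
print(X[0, 3].item())  # movies 0 and 3 are co-liked by users 0 and 1 -> 2
```

Note that X is symmetric by construction, and its diagonal simply counts how many users liked each single movie (which is why the loss ignores the i == j entries).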
What I have:
Where I'm stuck:
Answer 0 (score: 1):
Before reading further, please note that seeking and receiving direct help with assignments on StackOverflow may be against your school's rules and have consequences for you as a student!
With that said, here is how I would model this problem:
import torch
U = 300 # number of users
M = 30 # number of movies
D = 4 # dimension of embedding vectors
source = torch.randint(0, 2, (U, M)) # users' ratings
X = source.transpose(0, 1) @ source # your `preprocessed_data`
# initial values for your embedding. This is what your algorithm needs to learn
v = torch.randn(M, D, requires_grad=True)
X = X.to(torch.float32) # necessary to be in line with `v`
# this is the `(viT vj − Xi,j )**2` part
loss_elementwise = (v @ v.transpose(0, 1) - X).pow(2)
# now we need to get rid of the diagonal. Notice that we can equally
# well get rid of the diagonal and the whole upper triangular part,
# as well, since both V @ V.T and source.T @ source are symmetric, so
# the upper triangular part contains just
# a mirror reflection of the lower triangular part.
# This means that we actually implement a bit different summation:
# sum(i=1,M) sum(j=1,i-1) stuff(i, j)
# instead of
# sum(i=1,M) sum(j=1,M) indicator[i̸=j] stuff(i, j)
# and get exactly half the original value
masked = torch.tril(loss_elementwise, -1)
# finally we sum it up, multiplying by 2 to make up
# for the "lost" upper triangular part
loss = 2 * masked.sum()
Now all that remains to implement is the optimization loop, which will use the gradient of loss to optimize the values of v.
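One possible sketch of such a loop, reusing the toy sizes from above. The choice of Adam as the optimizer, the learning rate, and the step count are all my own assumptions here, not part of the assignment; plain SGD with a carefully tuned learning rate would work as well.

```python
import torch

torch.manual_seed(0)
U, M, D = 300, 30, 4                       # same toy sizes as above
source = torch.randint(0, 2, (U, M))       # users' ratings
X = (source.T @ source).to(torch.float32)  # co-like counts
v = torch.randn(M, D, requires_grad=True)  # embeddings to learn

# Adam and lr=0.05 are arbitrary choices for this sketch
optimizer = torch.optim.Adam([v], lr=0.05)

losses = []
for step in range(1000):
    optimizer.zero_grad()
    # same loss as above: off-diagonal squared error, doubled to
    # account for the dropped upper triangular part
    loss = 2 * torch.tril((v @ v.T - X).pow(2), -1).sum()
    losses.append(loss.item())
    loss.backward()
    optimizer.step()
```

After the loop, `v` holds the learned embeddings; in the real assignment it would be the 9724×300 weight matrix, with the loss tracked in `losses` decreasing over the iterations.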