Question

I have a dictionary like this:

{(0, 1, 2, 3, 0, 0): 0, (19, 49, 0, 0, 0, 0): 12, (85, 1, 87, 0, 0, 0): 22, (78, 79, 80, 81, 0, 0): 20, (0, 17, 18, 19, 0, 0): 8, (24, 25, 26, 27, 0, 0): 6, (62, 63, 64, 65, 0, 0): 16}

How can I convert this into a coo_matrix? I tried the following, but I get Error: int object is not subscriptable

data, row, col = [], [], []
for k, v in diction.items():
    r = int(k[0][1:])
    c = int(k[1][1:])
    data.append(v)
    row.append(r-1)
    col.append(c-1)
# Create the COO-matrix
coo = coo_matrix((data, (row, col)))

I need to do this because the LightFM.fit method only accepts COO matrices as arguments.

Expected output (COO matrix):

(0, 1, 2, 3, 0, 0)      0
(19, 49, 0, 0, 0, 0)    12
(85, 1, 87, 0, 0, 0)    22

Answer 1

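As others have pointed out in the comments, coo_matrix() expects the coordinates to be 2-dimensional: rows and columns. The data array stores the actual value located at each corresponding coordinate. This is also reflected in the LightFM.fit() documentation.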

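In case the concept is unclear, I'll try an explanation different from the one given in the documentation: the three inputs data, row and col must all have the same length, and each must be 1-dimensional.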

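The coordinates are usually referred to by the indices i and j, the row index and the column index respectively, since they denote the i-th row and the j-th column (à la matrix_row[i] and matrix_column[j]).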

Borrowing the example from the coo_matrix() documentation:

import numpy as np
from scipy.sparse import coo_matrix

row  = np.array([0, 3, 1, 0])
col  = np.array([0, 3, 1, 2])
data = np.array([4, 5, 7, 9])

for value, i, j in zip(data, row, col):
    print("In the {}'th row, on the {}'th column, insert the value {}"
          .format(i, j, value))
print("All other values are 0, because it's sparse.")

coo_matrix((data, (row, col)), shape=(4, 4)).toarray()

Output:

In the 0'th row, on the 0'th column, insert the value 4
In the 3'th row, on the 3'th column, insert the value 5
In the 1'th row, on the 1'th column, insert the value 7
In the 0'th row, on the 2'th column, insert the value 9
All other values are 0, because it's sparse.

array([
   [4, 0, 9, 0],
   [0, 7, 0, 0],
   [0, 0, 0, 0],
   [0, 0, 0, 5]
])
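As a quick sanity check of the "same length, one-dimensional" rule, here is a minimal sketch that rebuilds the matrix from the example and inspects its row, col and data attributes. The variable names m and dense are only illustrative.

```python
import numpy as np
from scipy.sparse import coo_matrix

row = np.array([0, 3, 1, 0])
col = np.array([0, 3, 1, 2])
data = np.array([4, 5, 7, 9])

m = coo_matrix((data, (row, col)), shape=(4, 4))

# The matrix keeps the three inputs as parallel 1-D arrays.
print(m.row.ndim, m.col.ndim, m.data.ndim)
print(len(m.row), len(m.col), len(m.data))

# Each triple (value, i, j) lands at position [i, j] of the dense array.
dense = m.toarray()
for value, i, j in zip(m.data, m.row, m.col):
    print(dense[i, j] == value)
```

Every stored value has exactly one row index and one column index, which is why a 6-tuple key cannot be passed as a coordinate directly.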

A note on your code:

The Error: int object is not subscriptable most likely comes from the lines where you subscript the entries of k, which is your key; for example, your first k will be (0, 1, 2, 3, 0, 0).

When you do r = int(k[0][1:]), you are trying to evaluate 0[1:] (because the zeroth entry of k is 0). The same goes for c = int(k[1][1:]): k[1] is 1, so k[1][1:] is trying to evaluate 1[1:].
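The failure can be reproduced in isolation with the first key from the question. This is just a small illustration; the name message is made up for the example.

```python
k = (0, 1, 2, 3, 0, 0)

try:
    # k[0] is the integer 0, and integers cannot be sliced.
    r = int(k[0][1:])
except TypeError as exc:
    message = str(exc)
    print(message)
```

Indexing the tuple itself (k[0]) is fine; it is the second subscript, applied to the resulting int, that raises the TypeError.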

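Moreover, calling int() on a sequence won't work either. If what you intended was to convert each element of a list, use the astype() method of a NumPy array (numpy.ndarray.astype()). For example, np.array([1.2, 3, 4.4]).astype(int) gives array([1, 3, 4]).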
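Putting this together, here is one possible sketch of the conversion. It rests on an assumption that the question does not confirm: that the first two entries of each key (k[0] and k[1]) are meant to be the row and column index, as the original loop suggests. If the 6-tuples encode something else, the mapping to (row, col) has to be defined differently.

```python
import numpy as np
from scipy.sparse import coo_matrix

diction = {
    (0, 1, 2, 3, 0, 0): 0,
    (19, 49, 0, 0, 0, 0): 12,
    (85, 1, 87, 0, 0, 0): 22,
    (78, 79, 80, 81, 0, 0): 20,
    (0, 17, 18, 19, 0, 0): 8,
    (24, 25, 26, 27, 0, 0): 6,
    (62, 63, 64, 65, 0, 0): 16,
}

# ASSUMPTION: k[0] is the row index and k[1] is the column index;
# the remaining entries of each key are ignored in this sketch.
row = np.array([k[0] for k in diction], dtype=int)
col = np.array([k[1] for k in diction], dtype=int)
data = np.array(list(diction.values()))

# Without an explicit shape, coo_matrix infers it from the largest indices.
coo = coo_matrix((data, (row, col)))
print(coo.shape)
```

Note that this drops the slicing ([1:]) and the r-1 / c-1 offsets from the original loop, since the keys here already hold plain integers; whether the indices should be shifted is another thing only the asker can decide.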

Converting a dictionary to a coo_matrix
