Question

I have 3 sparse matrices:

In [39]:

mat1


Out[39]:
(1, 878049)
<1x878049 sparse matrix of type '<type 'numpy.int64'>'
    with 878048 stored elements in Compressed Sparse Row format>

In [37]:

mat2


Out[37]:
(1, 878049)
<1x878049 sparse matrix of type '<type 'numpy.int64'>'
    with 744315 stored elements in Compressed Sparse Row format>

In [35]:

mat3



Out[35]:
(1, 878049)
<1x878049 sparse matrix of type '<type 'numpy.int64'>'
    with 788618 stored elements in Compressed Sparse Row format>

From the documentation, I read that it is possible to hstack, vstack, and concatenate this type of matrix. So I tried to hstack them:

import numpy as np

matrix1 = np.hstack([[address_feature, dayweek_feature]]).T
matrix2 = np.vstack([[matrix1, pddis_feature]]).T


X = matrix2

However, the dimensions do not match:

In [41]:

X_combined_features.shape

Out[41]:

(2, 1)

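Note that I am stacking the matrices this way because I want to use them with a scikit-learn classification algorithm. So how should I hstack several different sparse matrices?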

Answer 1

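Use the sparse version of vstack (scipy.sparse also provides hstack). As a general rule, you need to use the sparse functions and methods, not the numpy ones with similar names. Sparse matrices are not subclasses of numpy ndarray.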
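A minimal sketch of the sparse approach, using small stand-in matrices rather than the asker's data. The names mat1, mat2 and mat3 follow the question; the values and the 1 x 4 shape are made up for illustration.

```python
import numpy as np
from scipy import sparse

# Small stand-ins for the 1 x 878049 matrices in the question (values are illustrative).
mat1 = sparse.csr_matrix(np.array([[1, 0, 2, 3]]))
mat2 = sparse.csr_matrix(np.array([[0, 4, 0, 5]]))
mat3 = sparse.csr_matrix(np.array([[6, 0, 0, 7]]))

# scipy.sparse.vstack stacks rows: three 1 x 4 matrices become one 3 x 4 matrix.
stacked_rows = sparse.vstack([mat1, mat2, mat3], format="csr")

# scipy.sparse.hstack joins columns: three 1 x 4 matrices become one 1 x 12 matrix.
stacked_cols = sparse.hstack([mat1, mat2, mat3], format="csr")

# scikit-learn usually expects samples in rows, so each 1 x n feature can be
# transposed to n x 1 and joined column-wise into an n x 3 feature matrix.
X = sparse.hstack([mat1.T, mat2.T, mat3.T], format="csr")

print(stacked_rows.shape, stacked_cols.shape, X.shape)
```

Which of the three layouts is wanted depends on whether each matrix holds one feature across all samples or one sample across all features; the question does not say, so the last line is only one plausible reading.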

However, your 3 matrices do not look sparse. They are 1x878049. One has 878048 nonzero elements, which means it has just one 0 element.
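One way to check this is to compare the number of stored elements with the total number of entries. The sketch below uses a tiny made-up row with a single zero to mirror the situation on a small scale; the variable names are illustrative.

```python
import numpy as np
from scipy import sparse

# Illustrative stand-in: a 1 x 8 row with a single zero entry.
mat = sparse.csr_matrix(np.array([[3, 1, 4, 1, 5, 0, 2, 6]]))

n_rows, n_cols = mat.shape
total = n_rows * n_cols

# nnz is the number of stored elements; density is the stored fraction.
density = mat.nnz / total

print(f"stored: {mat.nnz} of {total}, density = {density:.3f}")
```

A density close to 1 suggests the sparse format is saving little or no memory compared with a dense array.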

So you could just as well turn them into dense arrays (with .toarray() or .A) and use np.hstack or np.vstack.

Don't use the double brackets. All concatenation functions take a simple list or tuple of arrays. The list can have more than two arrays.

np.hstack([address_feature.A, dayweek_feature.A])
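To see what the double brackets change, the sketch below compares both spellings on small dense arrays. The arrays a, b and c are made-up stand-ins for the dense versions of the feature matrices.

```python
import numpy as np

# Illustrative dense 1 x 4 arrays standing in for address_feature.A and friends.
a = np.array([[1, 0, 2, 3]])
b = np.array([[0, 4, 0, 5]])
c = np.array([[6, 0, 0, 7]])

# A simple list: the three arrays are joined along the column axis.
flat = np.hstack([a, b, c])

# Double brackets: the outer list holds a single item, so NumPy first packs
# a, b and c into one 3-D array instead of joining them side by side.
nested = np.hstack([[a, b, c]])

print(flat.shape)
print(nested.shape)
```

With sparse inputs the nested form behaves differently again, because NumPy treats each sparse matrix as a single opaque object; that is consistent with the (2, 1) shape reported in the question.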
