Question

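I'm trying to build, quickly and efficiently, a matrix out of many 1xN matrices, to be used later as features for scikit-learn training. One of the many things I've tried so far is: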

np.matrix(list(func(text) for text in data_test.data))

This creates a matrix of matrices, like this:

matrix([[ <1x188796 sparse matrix of type '<type 'numpy.float64'>'
    with 10921 stored elements in Compressed Sparse Row format>,
         <1x188796 sparse matrix of type '<type 'numpy.float64'>'
    with 17651 stored elements in Compressed Sparse Row format>,
         <1x188796 sparse matrix of type '<type 'numpy.float64'>'
    with 28180 stored elements in Compressed Sparse Row format>,...

Obviously, that's not really what I want. How can I turn it into a proper sparse matrix, like this:

<76002x108800 sparse matrix of type '<type 'numpy.float64'>'
with 807960 stored elements in Compressed Sparse Row format>

Answer 1

How about scipy.sparse.vstack?

http://docs.scipy.org/doc/scipy-dev/reference/generated/scipy.sparse.vstack.html
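A minimal sketch of the vstack approach, assuming each call to func returns a 1xN CSR matrix with the same N. The func below is a hypothetical stand-in (a toy character-count featurizer with 8 columns), not the asker's function, and texts stands in for data_test.data:

```python
import numpy as np
import scipy.sparse as sp

N_FEATURES = 8  # assumed column count, chosen only for this illustration


def func(text):
    # Hypothetical stand-in for the question's func: returns a 1xN CSR row.
    row = np.zeros((1, N_FEATURES))
    for ch in text:
        row[0, ord(ch) % N_FEATURES] += 1.0
    return sp.csr_matrix(row)


texts = ["abc", "hello world", ""]  # stands in for data_test.data

# Collect the 1xN rows in a plain list, then stack them into one sparse matrix.
rows = [func(t) for t in texts]
X = sp.vstack(rows, format="csr")

print(X.shape)
```

Passing the list of rows to vstack, rather than wrapping it in np.matrix, should give a single sparse matrix with one row per input instead of a dense matrix whose elements are sparse objects.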

If that's too slow, take the fast path from here: https://github.com/scipy/scipy/blob/master/scipy/sparse/construct.py#L396 (in future Scipy versions, vstack itself will be fast in this case.)
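For illustration only, here is one way a fast path for CSR inputs could look: concatenate the raw data, indices and indptr arrays directly instead of going through a generic block-stacking routine. This is a sketch under the assumption that all inputs are CSR-convertible and share the same number of columns; stack_csr_rows is a hypothetical helper name, and this is not claimed to be the code at the linked line.

```python
import numpy as np
import scipy.sparse as sp


def stack_csr_rows(rows):
    """Stack sparse matrices with equal column counts by joining their CSR arrays."""
    rows = [r.tocsr() for r in rows]
    n_cols = rows[0].shape[1]
    if any(r.shape[1] != n_cols for r in rows):
        raise ValueError("all blocks must have the same number of columns")

    data = np.concatenate([r.data for r in rows])
    indices = np.concatenate([r.indices for r in rows])

    # Each block's row pointers are shifted by the number of stored
    # elements in all the blocks that come before it.
    indptr_parts = [np.array([0], dtype=np.int64)]
    offset = 0
    for r in rows:
        indptr_parts.append(r.indptr[1:].astype(np.int64) + offset)
        offset += r.nnz
    indptr = np.concatenate(indptr_parts)

    n_rows = sum(r.shape[0] for r in rows)
    return sp.csr_matrix((data, indices, indptr), shape=(n_rows, n_cols))


rows = [
    sp.csr_matrix(np.array([[1.0, 0.0, 2.0, 0.0]])),
    sp.csr_matrix((1, 4)),  # an all-zero row
    sp.csr_matrix(np.array([[0.0, 3.0, 0.0, 4.0]])),
]
fast = stack_csr_rows(rows)
print(fast.shape)
```

Whether this is actually faster than vstack depends on the Scipy version in use, so it is worth timing both on real data before adopting it.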
