scipy.sparse.hstack(([1],[2])) -> "ValueError: blocks must be 2-D". Why?

Asked: 2015-08-09 03:21:55

Tags: python scipy sparse-matrix

scipy.sparse.hstack((1, [2])) and scipy.sparse.hstack((1, 2)) work fine, but scipy.sparse.hstack(([1], [2])) does not. Why is that?

Here is what happens on my system:

C:\Anaconda>python
Python 2.7.10 |Anaconda 2.3.0 (64-bit)| (default, May 28 2015, 16:44:52) [MSC v.1500 64 bit (AMD64)] on win32
>>> import scipy.sparse
>>> scipy.sparse.hstack((1, [2]))
<1x2 sparse matrix of type '<type 'numpy.int32'>'
        with 2 stored elements in COOrdinate format>
>>> scipy.sparse.hstack((1, 2))
<1x2 sparse matrix of type '<type 'numpy.int32'>'
        with 2 stored elements in COOrdinate format>
>>> scipy.sparse.hstack(([1], [2]))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Anaconda\lib\site-packages\scipy\sparse\construct.py", line 456, in hstack
    return bmat([blocks], format=format, dtype=dtype)
  File "C:\Anaconda\lib\site-packages\scipy\sparse\construct.py", line 539, in bmat
    raise ValueError('blocks must be 2-D')
ValueError: blocks must be 2-D
>>> scipy.version.full_version
'0.16.0'
>>>

2 Answers:

Answer 0 (score: 7)

In the first case, scipy.sparse.hstack((1, [2])), the number 1 is interpreted as a scalar value and [2] is interpreted as a dense matrix, so when the two are combined the data types are coerced so that both are treated as scalars, and scipy.sparse.hstack can combine them normally.

Here are some more tests showing that this holds for multiple values:

In [31]: scipy.sparse.hstack((1,2,[3],[4]))
Out[31]:
<1x4 sparse matrix of type '<type 'numpy.int64'>'
        with 4 stored elements in COOrdinate format>

In [32]: scipy.sparse.hstack((1,2,[3],[4],5,6))
Out[32]:
<1x6 sparse matrix of type '<type 'numpy.int64'>'
        with 6 stored elements in COOrdinate format>

In [33]: scipy.sparse.hstack((1,[2],[3],[4],5,[6],7))
Out[33]:
<1x7 sparse matrix of type '<type 'numpy.int64'>'
        with 7 stored elements in COOrdinate format>

As you can see, this seems to work as long as at least one scalar is present in hstack.

However, in the second case, scipy.sparse.hstack(([1],[2])), neither argument is a scalar any more: both are dense matrices, and scipy.sparse.hstack cannot be used with purely dense matrices.

To reproduce:

In [34]: scipy.sparse.hstack(([1],[2]))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-45-cd79952b2e14> in <module>()
----> 1 scipy.sparse.hstack(([1],[2]))

/usr/local/lib/python2.7/site-packages/scipy/sparse/construct.pyc in hstack(blocks, format, dtype)
    451
    452     """
--> 453     return bmat([blocks], format=format, dtype=dtype)
    454
    455

/usr/local/lib/python2.7/site-packages/scipy/sparse/construct.pyc in bmat(blocks, format, dtype)
    531
    532     if blocks.ndim != 2:
--> 533         raise ValueError('blocks must be 2-D')
    534
    535     M,N = blocks.shape

ValueError: blocks must be 2-D

See this post for more information: Scipy error with sparse hstack

Therefore, if you want to do this successfully with two matrices, you have to make them sparse first and then combine them:

In [36]: A = scipy.sparse.coo_matrix([1])

In [37]: B = scipy.sparse.coo_matrix([2])

In [38]: C = scipy.sparse.hstack([A, B])

In [39]: C
Out[39]:
<1x2 sparse matrix of type '<type 'numpy.int64'>'
        with 2 stored elements in COOrdinate format>

Interestingly, if you try what you did with the dense version of hstack, numpy.hstack, it is perfectly acceptable.

.... go figure for the sparse matrix representation.
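The dense-versus-sparse contrast described above can be sketched as follows. This is a minimal illustration, not the answerer's original session; it assumes NumPy and SciPy are installed, and the variable names are chosen here for illustration.

```python
import numpy as np
import scipy.sparse as sp

# Dense case: numpy.hstack accepts plain 1-D lists directly.
dense = np.hstack(([1], [2]))
print(dense)  # [1 2]

# Sparse case: build explicit 2-D sparse matrices first, then stack them.
A = sp.coo_matrix([[1]])
B = sp.coo_matrix([[2]])
C = sp.hstack([A, B])
print(C.shape)      # (1, 2)
print(C.toarray())  # [[1 2]]
```

Passing blocks that are already sparse is the input scipy.sparse.hstack is written for, so this path does not depend on how lists or scalars happen to be coerced.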

Answer 1 (score: 3)

The relevant code details are:

def hstack(blocks, ...):
    return bmat([blocks], ...)

def bmat(blocks, ...):
    blocks = np.asarray(blocks, dtype='object')
    if blocks.ndim != 2:
        raise ValueError('blocks must be 2-D')
    # (continues)

So, trying your alternatives (remember the extra []):

In [392]: np.asarray([(1,2)],dtype=object)
Out[392]: array([[1, 2]], dtype=object)

In [393]: np.asarray([(1,[2])],dtype=object)
Out[393]: array([[1, [2]]], dtype=object)

In [394]: np.asarray([([1],[2])],dtype=object)
Out[394]: 
array([[[1],
        [2]]], dtype=object)

In [395]: _.shape
Out[395]: (1, 2, 1)

The last case (your problem case) fails because the result is 3-D.

With 2 sparse matrices (the expected input):

In [402]: np.asarray([[a,a]], dtype=object) 
Out[402]: 
array([[ <1x1 sparse matrix of type '<class 'numpy.int32'>'
    with 1 stored elements in COOrdinate format>,
        <1x1 sparse matrix of type '<class 'numpy.int32'>'
    with 1 stored elements in COOrdinate format>]], dtype=object)

In [403]: _.shape
Out[403]: (1, 2)
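The shape checks above can be collected into one small sketch. The helper name wrapped_ndim is hypothetical (it is not part of SciPy); it only mirrors the np.asarray([blocks], dtype=object) step quoted earlier, and assumes NumPy and SciPy are installed.

```python
import numpy as np
import scipy.sparse as sp


def wrapped_ndim(blocks):
    """Hypothetical helper: ndim of the object array built from [blocks],
    i.e. the value that the 'blocks must be 2-D' check compares against 2."""
    return np.asarray([blocks], dtype=object).ndim


a = sp.coo_matrix([[1]])

cases = {
    "(1, 2)": (1, 2),
    "(1, [2])": (1, [2]),
    "([1], [2])": ([1], [2]),
    "(a, a)": (a, a),
}
for label, blocks in cases.items():
    # Only inputs whose wrapped array is exactly 2-D get past the check.
    print(label, "->", wrapped_ndim(blocks))
```

Only the ([1], [2]) case yields a 3-D array, because both elements are equal-length lists and NumPy folds them into an extra axis.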

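hstack takes advantage of bmat by turning a list of matrices into a nested (2-D) list of matrices. bmat is meant to combine a 2-D array of sparse matrices into one larger matrix. Skipping the step of first making those sparse matrices may or may not work. The code and documentation make no promises.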
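One way to avoid relying on that unspecified behavior is to convert every block explicitly before stacking. The wrapper below is a hedged sketch: hstack_as_sparse is a hypothetical name, not a SciPy function, and treating scalars and 1-D lists as single rows (via np.atleast_2d) is an assumption made here.

```python
import numpy as np
import scipy.sparse as sp


def hstack_as_sparse(blocks, format=None, dtype=None):
    """Hypothetical wrapper: make every block an explicit 2-D sparse matrix
    before handing the list to scipy.sparse.hstack."""
    converted = []
    for block in blocks:
        if sp.issparse(block):
            converted.append(block)
        else:
            # Assumption: scalars and 1-D lists are meant as a single row.
            converted.append(sp.coo_matrix(np.atleast_2d(block)))
    return sp.hstack(converted, format=format, dtype=dtype)


result = hstack_as_sparse(([1], [2]))
print(result.toarray())  # [[1 2]]

mixed = hstack_as_sparse((1, [2, 3], sp.coo_matrix([[4]])))
print(mixed.toarray())  # [[1 2 3 4]]
```

With this wrapper the problem case from the question goes through, since scipy.sparse.hstack only ever sees sparse matrices.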