Numpy: Cartesian product of x and y arrays into a single array of 2D points

Date: 2012-06-21 18:30:19

Tags: python numpy cartesian-product

I have two numpy arrays that define the x and y axes of a grid. For example:

x = numpy.array([1,2,3])
y = numpy.array([4,5])

I'd like to generate the Cartesian product of these arrays to produce:

array([[1,4],[2,4],[3,4],[1,5],[2,5],[3,5]])

I need to do this many times in a loop, so it has to be reasonably efficient. I'm assuming that converting the arrays to Python lists, using itertools.product, and converting back to a numpy array is not the most efficient approach.

13 Answers:

Answer 0 (score: 124)

A canonical cartesian_product (almost)

There are many ways to approach this problem, with different properties. Some are faster than others, and some are more general-purpose. After a lot of testing and tweaking, I've found that the following function, which computes an n-dimensional cartesian_product, is faster than most alternatives for many inputs. For a pair of approaches that are slightly more complicated, but even a bit faster in many cases, see the answer by Paul Panzer.

As that answer demonstrates, this is no longer the fastest implementation of the Cartesian product in numpy that I'm aware of. However, I think its simplicity will continue to make it a useful benchmark for future improvements:

def cartesian_product(*arrays):
    la = len(arrays)
    dtype = numpy.result_type(*arrays)
    arr = numpy.empty([len(a) for a in arrays] + [la], dtype=dtype)
    for i, a in enumerate(numpy.ix_(*arrays)):
        arr[...,i] = a
    return arr.reshape(-1, la)
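
Applied to the arrays from the question, it produces the same pairs, though in a slightly different row order (this usage example is my own addition, for illustration only):

import numpy

x = numpy.array([1,2,3])
y = numpy.array([4,5])

print(cartesian_product(x, y))
# [[1 4]
#  [1 5]
#  [2 4]
#  [2 5]
#  [3 4]
#  [3 5]]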

It's worth mentioning that this function uses ix_ in an unusual way; while the documented use of ix_ is to generate indices into an array, it just so happens that arrays with the same shape can be used for broadcasted assignment. Many thanks to mgilson, who inspired me to try using ix_ this way, and to unutbu, who provided some extremely helpful feedback on this answer, including the suggestion to use numpy.result_type.
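
As a small illustration of why that works (my addition, not part of the original answer): ix_ returns open-mesh index arrays whose shapes broadcast against one another, so assigning them into slices of the output array fills in every combination at once.

import numpy

x = numpy.array([1, 2, 3])
y = numpy.array([4, 5])

# ix_ gives x shape (3, 1) and y shape (1, 2); together they broadcast to (3, 2),
# which is exactly the grid of combinations that cartesian_product fills in.
xx, yy = numpy.ix_(x, y)
print(xx.shape, yy.shape)   # (3, 1) (1, 2)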

Notable alternatives

It is sometimes faster to write contiguous blocks of memory in Fortran order. That's the basis of this alternative, cartesian_product_transpose, which proved faster than cartesian_product on some hardware (see below). However, Paul Panzer's answer, which uses the same principle, is faster still. I include this version here for interested readers anyway:

def cartesian_product_transpose(*arrays):
    broadcastable = numpy.ix_(*arrays)
    broadcasted = numpy.broadcast_arrays(*broadcastable)
    rows, cols = numpy.prod(broadcasted[0].shape), len(broadcasted)
    dtype = numpy.result_type(*arrays)

    out = numpy.empty(rows * cols, dtype=dtype)
    start, end = 0, rows
    for a in broadcasted:
        out[start:end] = a.reshape(-1)
        start, end = end, end + rows
    return out.reshape(cols, rows).T

After coming to understand Panzer's approach, I wrote a new version that's almost as fast as his, and almost as simple as cartesian_product:

def cartesian_product_simple_transpose(arrays):
    la = len(arrays)
    dtype = numpy.result_type(*arrays)
    arr = numpy.empty([la] + [len(a) for a in arrays], dtype=dtype)
    for i, a in enumerate(numpy.ix_(*arrays)):
        arr[i, ...] = a
    return arr.reshape(la, -1).T

This seems to have some constant-time overhead that makes it run slower than Panzer's for small inputs. But for larger inputs, in all the tests I ran, it performs just as well as his fastest implementation (cartesian_product_transpose_pp).
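
As a quick sanity check (my addition, not part of the original answer), the three variants above agree exactly on the same inputs:

import numpy

x = numpy.arange(3)
y = numpy.arange(4, 6)
z = numpy.arange(7, 9)

a = cartesian_product(x, y, z)
b = cartesian_product_transpose(x, y, z)
c = cartesian_product_simple_transpose([x, y, z])

# All three give the same 12 x 3 array of combinations, in the same row order.
assert numpy.array_equal(a, b)
assert numpy.array_equal(a, c)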

In the sections below, I include some tests of other alternatives. These are now somewhat out of date, but rather than duplicate effort, I've decided to leave them here out of historical interest. For up-to-date tests, see Panzer's answer, as well as Nico Schlömer's.

Tests of the alternatives

Here is a set of tests showing the performance boost that some of these functions provide relative to a number of alternatives. All tests shown here were performed on a quad-core machine running Mac OS 10.12.5, Python 3.6.1, and numpy 1.12.1. Variations in hardware and software are known to produce different results, so YMMV. Run these tests yourself to be sure!

Definitions:

import numpy
import itertools
from functools import reduce

### Two-dimensional products ###

def repeat_product(x, y):
    return numpy.transpose([numpy.tile(x, len(y)), 
                            numpy.repeat(y, len(x))])

def dstack_product(x, y):
    return numpy.dstack(numpy.meshgrid(x, y)).reshape(-1, 2)

### Generalized N-dimensional products ###

def cartesian_product(*arrays):
    la = len(arrays)
    dtype = numpy.result_type(*arrays)
    arr = numpy.empty([len(a) for a in arrays] + [la], dtype=dtype)
    for i, a in enumerate(numpy.ix_(*arrays)):
        arr[...,i] = a
    return arr.reshape(-1, la)

def cartesian_product_transpose(*arrays):
    broadcastable = numpy.ix_(*arrays)
    broadcasted = numpy.broadcast_arrays(*broadcastable)
    rows, cols = numpy.prod(broadcasted[0].shape), len(broadcasted)
    dtype = numpy.result_type(*arrays)

    out = numpy.empty(rows * cols, dtype=dtype)
    start, end = 0, rows
    for a in broadcasted:
        out[start:end] = a.reshape(-1)
        start, end = end, end + rows
    return out.reshape(cols, rows).T

# from https://stackoverflow.com/a/1235363/577088

def cartesian_product_recursive(*arrays, out=None):
    arrays = [numpy.asarray(x) for x in arrays]
    dtype = arrays[0].dtype

    n = numpy.prod([x.size for x in arrays])
    if out is None:
        out = numpy.zeros([n, len(arrays)], dtype=dtype)

    m = n // arrays[0].size
    out[:,0] = numpy.repeat(arrays[0], m)
    if arrays[1:]:
        cartesian_product_recursive(arrays[1:], out=out[0:m,1:])
        for j in range(1, arrays[0].size):
            out[j*m:(j+1)*m,1:] = out[0:m,1:]
    return out

def cartesian_product_itertools(*arrays):
    return numpy.array(list(itertools.product(*arrays)))

### Test code ###

name_func = [('repeat_product',                                                 
              repeat_product),                                                  
             ('dstack_product',                                                 
              dstack_product),                                                  
             ('cartesian_product',                                              
              cartesian_product),                                               
             ('cartesian_product_transpose',                                    
              cartesian_product_transpose),                                     
             ('cartesian_product_recursive',                           
              cartesian_product_recursive),                            
             ('cartesian_product_itertools',                                    
              cartesian_product_itertools)]

def test(in_arrays, test_funcs):
    global func
    global arrays
    arrays = in_arrays
    for name, func in test_funcs:
        print('{}:'.format(name))
        %timeit func(*arrays)

def test_all(*in_arrays):
    test(in_arrays, name_func)

# `cartesian_product_recursive` throws an 
# unexpected error when used on more than
# two input arrays, so for now I've removed
# it from these tests.

def test_cartesian(*in_arrays):
    test(in_arrays, name_func[2:4] + name_func[-1:])

x10 = [numpy.arange(10)]
x50 = [numpy.arange(50)]
x100 = [numpy.arange(100)]
x500 = [numpy.arange(500)]
x1000 = [numpy.arange(1000)]

Test results:

In [2]: test_all(*(x100 * 2))
repeat_product:
67.5 µs ± 633 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
dstack_product:
67.7 µs ± 1.09 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
cartesian_product:
33.4 µs ± 558 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
cartesian_product_transpose:
67.7 µs ± 932 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
cartesian_product_recursive:
215 µs ± 6.01 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
cartesian_product_itertools:
3.65 ms ± 38.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [3]: test_all(*(x500 * 2))
repeat_product:
1.31 ms ± 9.28 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
dstack_product:
1.27 ms ± 7.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
cartesian_product:
375 µs ± 4.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
cartesian_product_transpose:
488 µs ± 8.88 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
cartesian_product_recursive:
2.21 ms ± 38.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
cartesian_product_itertools:
105 ms ± 1.17 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [4]: test_all(*(x1000 * 2))
repeat_product:
10.2 ms ± 132 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
dstack_product:
12 ms ± 120 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
cartesian_product:
4.75 ms ± 57.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
cartesian_product_transpose:
7.76 ms ± 52.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
cartesian_product_recursive:
13 ms ± 209 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
cartesian_product_itertools:
422 ms ± 7.77 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In all these cases, cartesian_product as defined at the beginning of this answer is the fastest.

For those functions that accept an arbitrary number of input arrays, it's worth checking performance when len(arrays) > 2 as well. (Until I can determine why cartesian_product_recursive throws an error in this case, I've removed it from these tests.)

In [5]: test_cartesian(*(x100 * 3))
cartesian_product:
8.8 ms ± 138 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
cartesian_product_transpose:
7.87 ms ± 91.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
cartesian_product_itertools:
518 ms ± 5.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [6]: test_cartesian(*(x50 * 4))
cartesian_product:
169 ms ± 5.1 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
cartesian_product_transpose:
184 ms ± 4.32 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
cartesian_product_itertools:
3.69 s ± 73.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [7]: test_cartesian(*(x10 * 6))
cartesian_product:
26.5 ms ± 449 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
cartesian_product_transpose:
16 ms ± 133 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
cartesian_product_itertools:
728 ms ± 16 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [8]: test_cartesian(*(x10 * 7))
cartesian_product:
650 ms ± 8.14 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
cartesian_product_transpose:
518 ms ± 7.09 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
cartesian_product_itertools:
8.13 s ± 122 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

As these tests show, cartesian_product remains competitive until the number of input arrays rises above (roughly) four. After that point, cartesian_product_transpose does gain a slight edge.

It's worth reiterating that users with other hardware and operating systems may see different results. For example, unutbu reports seeing the following results for these tests using Ubuntu 14.04, Python 3.4.3, and numpy 1.14.0.dev0+b7050a9:

>>> %timeit cartesian_product_transpose(x500, y500) 
1000 loops, best of 3: 682 µs per loop
>>> %timeit cartesian_product(x500, y500)
1000 loops, best of 3: 1.55 ms per loop

Below, I go into more detail about earlier tests I ran along these lines. The relative performance of these approaches has changed over time, across different hardware and different versions of Python and numpy. While it isn't immediately useful for people using up-to-date versions of numpy, it illustrates how things have changed since the first version of this answer.

A simple alternative: meshgrid + dstack

The currently accepted answer uses tile and repeat to broadcast two arrays together. But the meshgrid function does practically the same thing. Here's the output of tile and repeat before being passed to transpose:

In [1]: import numpy
In [2]: x = numpy.array([1,2,3])
   ...: y = numpy.array([4,5])
   ...: 

In [3]: [numpy.tile(x, len(y)), numpy.repeat(y, len(x))]
Out[3]: [array([1, 2, 3, 1, 2, 3]), array([4, 4, 4, 5, 5, 5])]

And here's the output of meshgrid:

In [4]: numpy.meshgrid(x, y)
Out[4]: 
[array([[1, 2, 3],
        [1, 2, 3]]), array([[4, 4, 4],
        [5, 5, 5]])]

As you can see, it's practically identical. We need only reshape the result to get exactly the same thing.

In [5]: xt, xr = numpy.meshgrid(x, y)
   ...: [xt.ravel(), xr.ravel()]
Out[5]: [array([1, 2, 3, 1, 2, 3]), array([4, 4, 4, 5, 5, 5])]

Rather than reshaping at this point, though, we can pass the output of meshgrid to dstack and reshape afterwards, which saves some work:

In [6]: numpy.dstack(numpy.meshgrid(x, y)).reshape(-1, 2)
Out[6]: 
array([[1, 4],
       [2, 4],
       [3, 4],
       [1, 5],
       [2, 5],
       [3, 5]])

Contrary to a claim made in a comment elsewhere, I've seen no evidence that different inputs will produce differently shaped outputs, and as the above demonstrates, these approaches do very similar things, so it would be quite strange if they did. Please let me know if you find a counterexample.

Testing meshgrid + dstack vs. repeat + transpose

The relative performance of these two approaches has changed over time. On an earlier version of Python (2.7), the meshgrid + dstack approach was noticeably faster for small inputs. (Note that these tests are from an old version of this answer.) Definitions:

>>> def repeat_product(x, y):
...     return numpy.transpose([numpy.tile(x, len(y)), 
...                             numpy.repeat(y, len(x))])
...
>>> def dstack_product(x, y):
...     return numpy.dstack(numpy.meshgrid(x, y)).reshape(-1, 2)
...

For moderately sized inputs, I saw a significant speedup. But then I retried these tests with more recent versions of Python (3.6.1) and numpy (1.12.1) on a newer machine, and the two approaches are now nearly identical.

Old test

>>> x, y = numpy.arange(500), numpy.arange(500)
>>> %timeit repeat_product(x, y)
10 loops, best of 3: 62 ms per loop
>>> %timeit dstack_product(x, y)
100 loops, best of 3: 12.2 ms per loop

New test

In [7]: x, y = numpy.arange(500), numpy.arange(500)
In [8]: %timeit repeat_product(x, y)
1.32 ms ± 24.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [9]: %timeit dstack_product(x, y)
1.26 ms ± 8.47 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

As always, YMMV, but this suggests that in recent versions of Python and numpy the two are interchangeable.

Generalized product functions

In general, we might expect built-in functions to be faster for small inputs, while for large inputs a purpose-built function might win. Furthermore, for a generalized n-dimensional product, tile and repeat won't help, because they have no clear higher-dimensional analogues. So it's worth investigating the behavior of purpose-built functions as well.

Most of the relevant tests appear at the beginning of this answer, but here are a few of the tests performed on earlier versions of Python and numpy, for comparison.

The cartesian function defined in the answer linked in the code above (https://stackoverflow.com/a/1235363) was intended for larger inputs. (It's the same as the function called cartesian_product_recursive above.) To compare cartesian against dstack_product, we use just two dimensions.

Here again, the old test showed a significant difference, while the new test shows almost none.

Old test

>>> x, y = numpy.arange(1000), numpy.arange(1000)
>>> %timeit cartesian([x, y])
10 loops, best of 3: 25.4 ms per loop
>>> %timeit dstack_product(x, y)
10 loops, best of 3: 66.6 ms per loop

New test

In [10]: x, y = numpy.arange(1000), numpy.arange(1000)
In [11]: %timeit cartesian([x, y])
12.1 ms ± 199 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [12]: %timeit dstack_product(x, y)
12.7 ms ± 334 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

As before, dstack_product still beats cartesian at smaller scales.

New test (redundant old test not shown)

In [13]: x, y = numpy.arange(100), numpy.arange(100)
In [14]: %timeit cartesian([x, y])
215 µs ± 4.75 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [15]: %timeit dstack_product(x, y)
65.7 µs ± 1.15 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

I think these distinctions are interesting and worth recording, but in the end they are academic. As the tests at the beginning of this answer showed, all of these versions are almost always slower than cartesian_product, defined at the very start of this answer, which is itself a bit slower than the fastest implementations among the other answers to this question.

Answer 1 (score: 69)

>>> numpy.transpose([numpy.tile(x, len(y)), numpy.repeat(y, len(x))])
array([[1, 4],
       [2, 4],
       [3, 4],
       [1, 5],
       [2, 5],
       [3, 5]])

For a general solution that computes the Cartesian product of N arrays, see Using numpy to build an array of all combinations of two arrays.

Answer 2 (score: 35)

You can just do a normal list comprehension in Python:

x = numpy.array([1,2,3])
y = numpy.array([4,5])
[[x0, y0] for x0 in x for y0 in y]

which should give you

[[1, 4], [1, 5], [2, 4], [2, 5], [3, 4], [3, 5]]
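
If a numpy array is needed rather than a list of lists, the comprehension can simply be wrapped in numpy.array (a small addition of mine, not part of the original answer):

numpy.array([[x0, y0] for x0 in x for y0 in y])
# array([[1, 4],
#        [1, 5],
#        [2, 4],
#        [2, 5],
#        [3, 4],
#        [3, 5]])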

Answer 3 (score: 22)

I was interested in this as well and did a little performance comparison, perhaps laid out somewhat more clearly than in @senderle's answer.

For two arrays (the classic case):

(benchmark plot)

For four arrays:

(benchmark plot)

(Note that the arrays here are only a few dozen entries long.)

Code to reproduce the plots:

from functools import reduce
import itertools
import numpy
import perfplot


def dstack_product(arrays):
    return numpy.dstack(
        numpy.meshgrid(*arrays, indexing='ij')
        ).reshape(-1, len(arrays))


# Generalized N-dimensional products
def cartesian_product(arrays):
    la = len(arrays)
    dtype = numpy.find_common_type([a.dtype for a in arrays], [])
    arr = numpy.empty([len(a) for a in arrays] + [la], dtype=dtype)
    for i, a in enumerate(numpy.ix_(*arrays)):
        arr[..., i] = a
    return arr.reshape(-1, la)


def cartesian_product_transpose(arrays):
    broadcastable = numpy.ix_(*arrays)
    broadcasted = numpy.broadcast_arrays(*broadcastable)
    rows, cols = reduce(numpy.multiply, broadcasted[0].shape), len(broadcasted)
    dtype = numpy.find_common_type([a.dtype for a in arrays], [])

    out = numpy.empty(rows * cols, dtype=dtype)
    start, end = 0, rows
    for a in broadcasted:
        out[start:end] = a.reshape(-1)
        start, end = end, end + rows
    return out.reshape(cols, rows).T


# from https://stackoverflow.com/a/1235363/577088
def cartesian_product_recursive(arrays, out=None):
    arrays = [numpy.asarray(x) for x in arrays]
    dtype = arrays[0].dtype

    n = numpy.prod([x.size for x in arrays])
    if out is None:
        out = numpy.zeros([n, len(arrays)], dtype=dtype)

    m = n // arrays[0].size
    out[:, 0] = numpy.repeat(arrays[0], m)
    if arrays[1:]:
        cartesian_product_recursive(arrays[1:], out=out[0:m, 1:])
        for j in range(1, arrays[0].size):
            out[j*m:(j+1)*m, 1:] = out[0:m, 1:]
    return out


def cartesian_product_itertools(arrays):
    return numpy.array(list(itertools.product(*arrays)))


perfplot.show(
    setup=lambda n: 4*(numpy.arange(n, dtype=float),),
    n_range=[2**k for k in range(6)],
    kernels=[
        dstack_product,
        cartesian_product,
        cartesian_product_transpose,
        cartesian_product_recursive,
        cartesian_product_itertools
        ],
    logx=True,
    logy=True,
    xlabel='len(a), len(b)',
    equality_check=None
    )

Answer 4 (score: 13)

Building on @senderle's exemplary groundwork, I've come up with two versions, one for C and one for Fortran layout, that are often a bit faster.

  • cartesian_product_transpose_pp is, unlike @senderle's cartesian_product_transpose (which uses a completely different strategy), a version of cartesian_product that uses the more favorable transposed memory layout, plus some very minor optimizations.
  • cartesian_product_pp sticks with the original memory layout. What makes it fast is its use of contiguous copying. Contiguous copies turn out to be so much faster that copying a full block of memory, even though only part of it contains valid data, beats copying only the valid bits.

Some perfplots. I made separate ones for C and Fortran layouts, because these are different tasks IMO.

Names ending in 'pp' are my approaches.

1) Many tiny factors (2 elements each)

(benchmark plots: C layout, Fortran layout)

2) Many small factors (4 elements each)

(benchmark plots: C layout, Fortran layout)

3) Three factors of equal length

(benchmark plots: C layout, Fortran layout)

4) Two factors of equal length

(benchmark plots: C layout, Fortran layout)

Code (each plot requires a separate run, because I couldn't figure out how to reset; it also needs to be edited/commented in and out appropriately):

import numpy
import numpy as np
from functools import reduce
import itertools
import timeit
import perfplot

def dstack_product(arrays):
    return numpy.dstack(
        numpy.meshgrid(*arrays, indexing='ij')
        ).reshape(-1, len(arrays))

def cartesian_product_transpose_pp(arrays):
    la = len(arrays)
    dtype = numpy.result_type(*arrays)
    arr = numpy.empty((la, *map(len, arrays)), dtype=dtype)
    idx = slice(None), *itertools.repeat(None, la)
    for i, a in enumerate(arrays):
        arr[i, ...] = a[idx[:la-i]]
    return arr.reshape(la, -1).T

def cartesian_product(arrays):
    la = len(arrays)
    dtype = numpy.result_type(*arrays)
    arr = numpy.empty([len(a) for a in arrays] + [la], dtype=dtype)
    for i, a in enumerate(numpy.ix_(*arrays)):
        arr[...,i] = a
    return arr.reshape(-1, la)

def cartesian_product_transpose(arrays):
    broadcastable = numpy.ix_(*arrays)
    broadcasted = numpy.broadcast_arrays(*broadcastable)
    rows, cols = numpy.prod(broadcasted[0].shape), len(broadcasted)
    dtype = numpy.result_type(*arrays)

    out = numpy.empty(rows * cols, dtype=dtype)
    start, end = 0, rows
    for a in broadcasted:
        out[start:end] = a.reshape(-1)
        start, end = end, end + rows
    return out.reshape(cols, rows).T

from itertools import accumulate, repeat, chain

def cartesian_product_pp(arrays, out=None):
    la = len(arrays)
    L = *map(len, arrays), la
    dtype = numpy.result_type(*arrays)
    arr = numpy.empty(L, dtype=dtype)
    arrs = *accumulate(chain((arr,), repeat(0, la-1)), np.ndarray.__getitem__),
    idx = slice(None), *itertools.repeat(None, la-1)
    for i in range(la-1, 0, -1):
        arrs[i][..., i] = arrays[i][idx[:la-i]]
        arrs[i-1][1:] = arrs[i]
    arr[..., 0] = arrays[0][idx]
    return arr.reshape(-1, la)

def cartesian_product_itertools(arrays):
    return numpy.array(list(itertools.product(*arrays)))


# from https://stackoverflow.com/a/1235363/577088
def cartesian_product_recursive(arrays, out=None):
    arrays = [numpy.asarray(x) for x in arrays]
    dtype = arrays[0].dtype

    n = numpy.prod([x.size for x in arrays])
    if out is None:
        out = numpy.zeros([n, len(arrays)], dtype=dtype)

    m = n // arrays[0].size
    out[:, 0] = numpy.repeat(arrays[0], m)
    if arrays[1:]:
        cartesian_product_recursive(arrays[1:], out=out[0:m, 1:])
        for j in range(1, arrays[0].size):
            out[j*m:(j+1)*m, 1:] = out[0:m, 1:]
    return out

### Test code ###
if False:
  perfplot.save('cp_4el_high.png',
    setup=lambda n: n*(numpy.arange(4, dtype=float),),
                n_range=list(range(6, 11)),
    kernels=[
        dstack_product,
        cartesian_product_recursive,
        cartesian_product,
#        cartesian_product_transpose,
        cartesian_product_pp,
#        cartesian_product_transpose_pp,
        ],
    logx=False,
    logy=True,
    xlabel='#factors',
    equality_check=None
    )
else:
  perfplot.save('cp_2f_T.png',
    setup=lambda n: 2*(numpy.arange(n, dtype=float),),
    n_range=[2**k for k in range(5, 11)],
    kernels=[
#        dstack_product,
#        cartesian_product_recursive,
#        cartesian_product,
        cartesian_product_transpose,
#        cartesian_product_pp,
        cartesian_product_transpose_pp,
        ],
    logx=True,
    logy=True,
    xlabel='length of each factor',
    equality_check=None
    )

Answer 5 (score: 9)

As of October 2017, numpy has a generic np.stack function that takes an axis parameter. Using it, we can have a "generalized Cartesian product" using the "dstack and meshgrid" technique:

import numpy as np
def cartesian_product(*arrays):
    ndim = len(arrays)
    return np.stack(np.meshgrid(*arrays), axis=-1).reshape(-1, ndim)

A note on the axis=-1 parameter: it refers to the last (innermost) axis in the result, and is equivalent to using axis=ndim.

One other comment: since Cartesian products blow up very quickly, unless we need to realize the array in memory for some reason, when the product is very large we may prefer to use itertools.product and consume the values on the fly.
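
For instance, a minimal sketch of that streaming alternative (my illustration, not part of the original answer):

import itertools

import numpy as np

x = np.arange(1000)
y = np.arange(1000)

# Iterate over the pairs lazily instead of materializing a 1,000,000 x 2 array.
for xi, yi in itertools.product(x, y):
    pass  # do something with the pair (xi, yi)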

Answer 6 (score: 8)

I used @kennytm's answer for a while, but when trying to do the same thing in TensorFlow, I found that TensorFlow has no equivalent of numpy.repeat(). After a little experimentation, I think I found a more general solution for arbitrary vectors of points.

For numpy:

import numpy as np

def cartesian_product(*args: np.ndarray) -> np.ndarray:
    """
    Produce the cartesian product of arbitrary length vectors.

    Parameters
    ----------
    np.ndarray args
        vector of points of interest in each dimension

    Returns
    -------
    np.ndarray
        the cartesian product of size [m x n] wherein:
            m = prod([len(a) for a in args])
            n = len(args)
    """
    for i, a in enumerate(args):
        assert a.ndim == 1, "arg {:d} is not rank 1".format(i)
    return np.concatenate([np.reshape(xi, [-1, 1]) for xi in np.meshgrid(*args)], axis=1)
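
And for TensorFlow, an analogous sketch (my reconstruction rather than the original answer's code; it assumes tf.meshgrid, tf.reshape, and tf.concat behave like their numpy counterparts):

import tensorflow as tf

def cartesian_product_tf(*args: tf.Tensor) -> tf.Tensor:
    # Same idea as the numpy version: flatten each meshgrid output into a
    # column and concatenate the columns side by side.
    return tf.concat([tf.reshape(t, [-1, 1]) for t in tf.meshgrid(*args)], axis=1)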

Answer 7 (score: 5)

The scikit-learn package has a fast implementation of exactly this:

from sklearn.utils.extmath import cartesian
product = cartesian((x,y))

Note that the convention of this implementation differs from what you want if you care about the order of the output. For your exact ordering, you can do:

product = cartesian((y,x))[:, ::-1]

Answer 8 (score: 4)

More generally, if you have two 2d numpy arrays a and b, and you want to concatenate every row of a with every row of b (a Cartesian product of rows, a bit like a join in a database), you can use this kind of approach:
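
A minimal sketch of one way to do this (my own illustration; row_cartesian_product is a hypothetical name, not from the original answer), echoing the tile/repeat idea from Answer 1 but applied to whole rows:

import numpy as np

def row_cartesian_product(a, b):
    # Repeat each row of `a` len(b) times, tile the rows of `b` len(a) times,
    # then concatenate side by side, like a database cross join.
    return np.hstack([np.repeat(a, len(b), axis=0), np.tile(b, (len(a), 1))])

a = np.array([[1, 2], [3, 4]])
b = np.array([[10, 20], [30, 40]])
print(row_cartesian_product(a, b))
# [[ 1  2 10 20]
#  [ 1  2 30 40]
#  [ 3  4 10 20]
#  [ 3  4 30 40]]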


Answer 9 (score: 3)

You can get very fast execution by combining a generator expression with the map function:

import numpy as np
import datetime
a = np.arange(1000)
b = np.arange(200)

start = datetime.datetime.now()

foo = (item for sublist in [list(map(lambda x: (x,i),a)) for i in b] for item in sublist)

print (list(foo))

print ('execution time: {} s'.format((datetime.datetime.now() - start).total_seconds()))

Output (this actually prints the entire result list):

[(0, 0), (1, 0), ...,(998, 199), (999, 199)]
execution time: 1.253567 s

Or using a double generator expression:

a = np.arange(1000)
b = np.arange(200)

start = datetime.datetime.now()

foo = ((x,y) for x in a for y in b)

print (list(foo))

print ('execution time: {} s'.format((datetime.datetime.now() - start).total_seconds()))

Output (prints the entire list):

[(0, 0), (1, 0), ...,(998, 199), (999, 199)]
execution time: 1.187415 s

Keep in mind that most of the computation time goes into the print command. The generator computations are otherwise quite efficient. Without printing, the computation times are:

execution time: 0.079208 s

for the generator expression + map function, and:

execution time: 0.007093 s

for the double generator expression.

If what you actually want is to compute the actual product of each coordinate pair, the fastest way is to express it as a numpy matrix product:

a = np.arange(1000)
b = np.arange(200)

start = datetime.datetime.now()

foo = np.dot(np.asmatrix([[i,0] for i in a]), np.asmatrix([[i,0] for i in b]).T)

print (foo)

print ('execution time: {} s'.format((datetime.datetime.now() - start).total_seconds()))

Output:

 [[     0      0      0 ...,      0      0      0]
 [     0      1      2 ...,    197    198    199]
 [     0      2      4 ...,    394    396    398]
 ..., 
 [     0    997   1994 ..., 196409 197406 198403]
 [     0    998   1996 ..., 196606 197604 198602]
 [     0    999   1998 ..., 196803 197802 198801]]
execution time: 0.003869 s

And without printing (in this case it doesn't save much, since only a tiny piece of the matrix is actually printed):

execution time: 0.003083 s

Answer 10 (score: 1)

I'm a little late to the party, but I ran into a tricky variant of this problem. Suppose I want the Cartesian product of several arrays, but that Cartesian product ends up being much larger than the computer's memory (while the computations done with that product are fast, or at least parallelizable).

The obvious solution is to divide the Cartesian product into chunks and process those chunks one after another (in a "streaming" fashion). You can do that easily with itertools.product, but it's horribly slow. Also, none of the solutions proposed here (as fast as they are) give us this possibility. The solution I propose uses Numba, and is slightly faster than the "canonical" cartesian_product mentioned here. It's rather long, because I tried to optimize it wherever I could.

import numba as nb
import numpy as np
from typing import List


@nb.njit(nb.types.Tuple((nb.int32[:, :],
                         nb.int32[:]))(nb.int32[:],
                                       nb.int32[:],
                                       nb.int64, nb.int64))
def cproduct(sizes: np.ndarray, current_tuple: np.ndarray, start_idx: int, end_idx: int):
    """Generates ids tuples from start_id to end_id"""
    assert len(sizes) >= 2
    assert start_idx < end_idx

    tuples = np.zeros((end_idx - start_idx, len(sizes)), dtype=np.int32)
    tuple_idx = 0
    # stores the current combination
    current_tuple = current_tuple.copy()
    while tuple_idx < end_idx - start_idx:
        tuples[tuple_idx] = current_tuple
        current_tuple[0] += 1
        # using a condition here instead of including this in the inner loop
        # to gain a bit of speed: this is going to be tested each iteration,
        # and starting a loop to have it end right away is a bit silly
        if current_tuple[0] == sizes[0]:
            # the reset to 0 and subsequent increment amount to carrying
            # the number to the higher "power"
            current_tuple[0] = 0
            current_tuple[1] += 1
            for i in range(1, len(sizes) - 1):
                if current_tuple[i] == sizes[i]:
                    # same as before, but in a loop, since this is going
                    # to get called less often
                    current_tuple[i + 1] += 1
                    current_tuple[i] = 0
                else:
                    break
        tuple_idx += 1
    return tuples, current_tuple


def chunked_cartesian_product_ids(sizes: List[int], chunk_size: int):
    """Just generates chunks of the cartesian product of the ids of each
    input arrays (thus, we just need their sizes here, not the actual arrays)"""
    prod = np.prod(sizes)

    # putting the largest number at the front to more efficiently make use
    # of the cproduct numba function
    sizes = np.array(sizes, dtype=np.int32)
    sorted_idx = np.argsort(sizes)[::-1]
    sizes = sizes[sorted_idx]
    if chunk_size > prod:
        chunk_bounds = (np.array([0, prod])).astype(np.int64)
    else:
        num_chunks = np.maximum(np.ceil(prod / chunk_size), 2).astype(np.int32)
        chunk_bounds = (np.arange(num_chunks + 1) * chunk_size).astype(np.int64)
        chunk_bounds[-1] = prod
    current_tuple = np.zeros(len(sizes), dtype=np.int32)
    for start_idx, end_idx in zip(chunk_bounds[:-1], chunk_bounds[1:]):
        tuples, current_tuple = cproduct(sizes, current_tuple, start_idx, end_idx)
        # re-arrange columns to match the original order of the sizes list
        # before yielding
        yield tuples[:, np.argsort(sorted_idx)]


def chunked_cartesian_product(*arrays, chunk_size=2 ** 25):
    """Returns chunks of the full cartesian product, with arrays of shape
    (chunk_size, n_arrays). The last chunk will obviously have the size of the
    remainder"""
    array_lengths = [len(array) for array in arrays]
    for array_ids_chunk in chunked_cartesian_product_ids(array_lengths, chunk_size):
        slices_lists = [arrays[i][array_ids_chunk[:, i]] for i in range(len(arrays))]
        yield np.vstack(slices_lists).swapaxes(0,1)


def cartesian_product(*arrays):
    """Actual cartesian product, not chunked, still fast"""
    total_prod = np.prod([len(array) for array in arrays])
    return next(chunked_cartesian_product(*arrays, chunk_size=total_prod))


a = np.arange(0, 3)
b = np.arange(8, 10)
c = np.arange(13, 16)
for cartesian_tuples in chunked_cartesian_product(*[a, b, c], chunk_size=5):
    print(cartesian_tuples)

This outputs our Cartesian product in chunks of five 3-tuples:

[[ 0  8 13]
 [ 0  8 14]
 [ 0  8 15]
 [ 1  8 13]
 [ 1  8 14]]
[[ 1  8 15]
 [ 2  8 13]
 [ 2  8 14]
 [ 2  8 15]
 [ 0  9 13]]
[[ 0  9 14]
 [ 0  9 15]
 [ 1  9 13]
 [ 1  9 14]
 [ 1  9 15]]
[[ 2  9 13]
 [ 2  9 14]
 [ 2  9 15]]

If you want to understand what is being done here, the intuition behind the njitted function is to enumerate each "number" in a strange numeral base whose digits' radices are the sizes of the input arrays (instead of a single radix as in regular binary, decimal, or hexadecimal bases).
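
As a small illustration of that mixed-radix idea (my sketch, not part of the original answer), a flat index maps to an index tuple the same way a number maps to its digits:

def unrank(k, sizes):
    # Decode flat index k into an index tuple, least significant "digit" first,
    # using the array sizes as the mixed radices.
    digits = []
    for s in sizes:
        digits.append(k % s)
        k //= s
    return tuple(digits)

print([unrank(k, (3, 2, 3)) for k in range(6)])
# [(0, 0, 0), (1, 0, 0), (2, 0, 0), (0, 1, 0), (1, 1, 0), (2, 1, 0)]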

Obviously, this solution is only interesting for large products. For small ones, the overhead may be a bit costly.

Note: since numba is still under heavy development, I'm using numba 0.50 with Python 3.6 to run this.

Answer 11 (score: 0)

This can also be done easily using the itertools.product method:

from itertools import product
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5])
cart_prod = np.array(list(product(*[x, y])),dtype='int32')

Result:

array([[1, 4],
       [1, 5],
       [2, 4],
       [2, 5],
       [3, 4],
       [3, 5]], dtype=int32)

Execution time: 0.000155 s

Answer 12 (score: 0)

In the specific case where you need to perform a simple operation, such as addition, on each pair, you can introduce an extra dimension and let broadcasting do the work:

>>> a, b = np.array([1,2,3]), np.array([10,20,30])
>>> a[None,:] + b[:,None]
array([[11, 12, 13],
       [21, 22, 23],
       [31, 32, 33]])

I'm not sure whether there is any similar way to actually get the pairs themselves.