Question

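For a 1D NumPy array, I want to get combinations that do not repeat the same element within a combination. Order matters, so [a,b] and [b,a] are two distinct combinations. Since we don't want repeats, [a,a] and [b,b] are not valid combinations. For simplicity, let's keep each combination to two elements. Thus, the output would be a 2D NumPy array with 2 columns.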

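The desired result is essentially the same as the itertools.product output, except that we need to mask out the combinations that are repeated. So, we can solve it for a sample case, like so -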

In [510]: import numpy as np

In [511]: a = np.array([4,2,9,1,3])

In [512]: from itertools import product

In [513]: np.array(list(product(a,repeat=2)))[~np.eye(len(a),dtype=bool).ravel()]
Out[513]: 
array([[4, 2],
       [4, 9],
       [4, 1],
       [4, 3],
       [2, 4],
       [2, 9],
       [2, 1],
       [2, 3],
       [9, 4],
       [9, 2],
       [9, 1],
       [9, 3],
       [1, 4],
       [1, 2],
       [1, 9],
       [1, 3],
       [3, 4],
       [3, 2],
       [3, 9],
       [3, 1]])

However, creating that huge array and then masking it, and hence throwing away some of the elements, doesn't look very efficient to me.
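To make the masking idea concrete without itertools, here is a minimal sketch that builds the same pairs from index positions. The name `pairs_by_masking` is made up for illustration. It still materializes an n-by-n boolean mask, so it carries the same kind of overhead described above.

```python
import numpy as np

def pairs_by_masking(a):
    # Hypothetical helper: ordered pairs (a[i], a[j]) for all i != j.
    n = len(a)
    # Off-diagonal positions of an n-by-n grid, returned in row-major order.
    i, j = np.nonzero(~np.eye(n, dtype=bool))
    return np.column_stack((a[i], a[j]))

a = np.array([4, 2, 9, 1, 3])
pairs = pairs_by_masking(a)
print(pairs.shape)  # (20, 2)
```

Of the n*n candidate positions, n are discarded, so the waste is in the temporary mask and index arrays rather than in the final result.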

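That got me thinking about whether numpy.ndarray.strides could be leveraged here. I have one solution with that idea in mind, which I will post as an answer, but I would love to see other efficient ones.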

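In terms of use cases - we come across these cases with adjacency matrices, among others, and I thought it would be good to solve such a problem. To plug into other problems more easily and efficiently, it would be nice if the final output were not a view of some intermediate array.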
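As a small illustration of the view-versus-copy distinction, the sketch below checks whether an array shares memory with its source. The array names are arbitrary and the 3-by-3 window shape is only an example.

```python
import numpy as np

a = np.arange(5)
s = a.strides[0]

# A strided window over `a`: no data is copied, so it aliases `a`.
window = np.lib.stride_tricks.as_strided(a, shape=(3, 3), strides=(s, s))

# Assigning into a freshly allocated array produces independent data.
owned = np.empty((3, 3), dtype=a.dtype)
owned[:] = window

print(np.shares_memory(window, a))  # True
print(np.shares_memory(owned, a))   # False
print(owned.base is None)           # True
```

An output that owns its data can be modified or passed on without side effects on the intermediate buffer.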

Answer 1

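It seems np.lib.stride_tricks.as_strided could be used to maximize the efficiency of views, delaying the copy until the final stage, where we assign into an initialized array. The implementation is in two steps, with some work needed for the second column (as shown in the sample case in the question), which we are calling one-cold (a fancy name denoting that one element per sequence is missing, i.e. cold, in each interval of len(input_array) - 1) -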

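def onecold(a):
    n = len(a)
    strided = np.lib.stride_tricks.as_strided
    # Append all but the last element so every window can wrap around the end
    b = np.concatenate((a,a[:-1]))
    # Take the stride from b (always contiguous) so non-contiguous inputs work too
    s = b.strides[0]
    # Row i is the length-n window of b starting at offset i+1 (a view, no copy)
    return strided(b[1:], shape=(n-1,n), strides=(s,s))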

To showcase onecold with the sample case -

In [563]: a
Out[563]: array([4, 2, 9, 1, 3])

In [564]: onecold(a).reshape(len(a),-1)
Out[564]: 
array([[2, 9, 1, 3],
       [4, 9, 1, 3],
       [4, 2, 1, 3],
       [4, 2, 9, 3],
       [4, 2, 9, 1]])
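One way to convince yourself that the strided windows really are "the array with one element left out" is to compare them with a straightforward loop. In the sketch below, `onecold_naive` is a made-up reference helper. `onecold` is repeated so the snippet runs on its own, with the stride read from the concatenated array `b` as a defensive assumption.

```python
import numpy as np

def onecold(a):
    n = len(a)
    b = np.concatenate((a, a[:-1]))
    s = b.strides[0]  # stride of the buffer that is actually being viewed
    return np.lib.stride_tricks.as_strided(b[1:], shape=(n - 1, n), strides=(s, s))

def onecold_naive(a):
    # Hypothetical reference: row i is `a` with element i removed.
    return np.array([np.delete(a, i) for i in range(len(a))])

a = np.array([4, 2, 9, 1, 3])
fast = onecold(a).reshape(len(a), -1)
slow = onecold_naive(a)
print(np.array_equal(fast, slow))  # True
```

Note that the (n-1, n) view only lines up with the naive rows after it is reshaped to (n, n-1), which is why the reshape appears in the demonstration.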

To solve the original problem, we will use it like so -

def combinations_without_repeat(a):
    n = len(a)
    out = np.empty((n,n-1,2),dtype=a.dtype)
    out[:,:,0] = np.broadcast_to(a[:,None], (n, n-1))
    out.shape = (n-1,n,2)
    out[:,:,1] = onecold(a)
    out.shape = (-1,2)
    return out

Sample run -

In [574]: a
Out[574]: array([4, 2, 9, 1, 3])

In [575]: combinations_without_repeat(a)
Out[575]: 
array([[4, 2],
       [4, 9],
       [4, 1],
       [4, 3],
       [2, 4],
       [2, 9],
       [2, 1],
       [2, 3],
       [9, 4],
       [9, 2],
       [9, 1],
       [9, 3],
       [1, 4],
       [1, 2],
       [1, 9],
       [1, 3],
       [3, 4],
       [3, 2],
       [3, 9],
       [3, 1]])
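A quick sanity check, sketched here under the assumption that itertools.permutations with r=2 is an acceptable reference for the expected ordering: compare the strided result with it on a few random inputs. The helper `matches_permutations` is hypothetical, and both functions are repeated so the snippet is self-contained.

```python
import numpy as np
from itertools import permutations

def onecold(a):
    n = len(a)
    b = np.concatenate((a, a[:-1]))
    s = b.strides[0]  # stride of the buffer that is actually being viewed
    return np.lib.stride_tricks.as_strided(b[1:], shape=(n - 1, n), strides=(s, s))

def combinations_without_repeat(a):
    n = len(a)
    out = np.empty((n, n - 1, 2), dtype=a.dtype)
    out[:, :, 0] = np.broadcast_to(a[:, None], (n, n - 1))
    out.shape = (n - 1, n, 2)
    out[:, :, 1] = onecold(a)
    out.shape = (-1, 2)
    return out

def matches_permutations(a):
    # Hypothetical check: same rows, in the same order, as permutations(a, 2).
    expected = np.array(list(permutations(a, 2)))
    return np.array_equal(combinations_without_repeat(a), expected)

rng = np.random.RandomState(0)
checks = [matches_permutations(rng.randint(0, 9, n)) for n in (2, 3, 5, 50)]
print(all(checks))
```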

It seems quite efficient for a 1000-element array of ints -

In [578]: a = np.random.randint(0,9,(1000))

In [579]: %timeit combinations_without_repeat(a)
100 loops, best of 3: 2.35 ms per loop

Would love to see others!

Answer 2

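"It would be essentially the same as the itertools.product output, except that we need to mask out the combinations that are repeated." Actually, what you want is itertools.permutations: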

In [7]: import numpy as np

In [8]: from itertools import permutations

In [9]: a = np.array([4,2,9,1,3])

In [10]: list(permutations(a, 2))
Out[10]: 
[(4, 2),
 (4, 9),
 (4, 1),
 (4, 3),
 (2, 4),
 (2, 9),
 (2, 1),
 (2, 3),
 (9, 4),
 (9, 2),
 (9, 1),
 (9, 3),
 (1, 4),
 (1, 2),
 (1, 9),
 (1, 3),
 (3, 4),
 (3, 2),
 (3, 9),
 (3, 1)]
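If a NumPy array is needed rather than a list of tuples, one option is to feed the flattened permutations into np.fromiter. This is only a sketch; `permutations_array` is a made-up name. It also shows that permutations distinguishes elements by position, not by value, so duplicate values in the input still produce pairs such as (1, 1).

```python
import numpy as np
from itertools import chain, permutations

def permutations_array(a):
    # Hypothetical helper: ordered pairs of distinct positions as an (n*(n-1), 2) array.
    n = len(a)
    flat = chain.from_iterable(permutations(a, 2))
    return np.fromiter(flat, dtype=a.dtype, count=n * (n - 1) * 2).reshape(-1, 2)

a = np.array([4, 2, 9, 1, 3])
arr = permutations_array(a)
print(arr.shape)  # (20, 2)

# Elements are treated as distinct by position, so equal values can pair up.
dup = permutations_array(np.array([1, 1, 2]))
print(dup.tolist())
```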

Answer 3

Benchmarking

Posting the performance numbers/figures for the approaches proposed so far in this wiki-post.

System configuration:

NumPy version         : 1.13.3
Python version        : 2.7.12 (GCC 5.4.0)

Operating System: Ubuntu 16.04
RAM: 16GB
CPU Model: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz (# Cores=4, # Threads=8)

The benchmarking setup would be:

import numpy as np
import perfplot
from itertools import permutations

# https://stackoverflow.com/a/48234170/ @Divakar
def onecold(a):
    n = len(a)
    strided = np.lib.stride_tricks.as_strided
    b = np.concatenate((a,a[:-1]))
    s = b.strides[0]  # stride of b (contiguous), safe for non-contiguous inputs
    return strided(b[1:], shape=(n-1,n), strides=(s,s))

# https://stackoverflow.com/a/48234170/ @Divakar
def combinations_without_repeat(a):
    n = len(a)
    out = np.empty((n,n-1,2),dtype=a.dtype)
    out[:,:,0] = np.broadcast_to(a[:,None], (n, n-1))
    out.shape = (n-1,n,2)
    out[:,:,1] = onecold(a)
    out.shape = (-1,2)
    return out

# https://stackoverflow.com/a/48234349/ @Warren Weckesser
def itertools_permutations(a):
    return np.array(list(permutations(a, 2)))

perfplot.show(
        setup=lambda n: np.random.rand(n),
        n_range=[10,20,50,100,200,500,1000], # dataset sizes
        kernels=[combinations_without_repeat, itertools_permutations],
        logx=True,
        logy=True,
        )
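For readers without perfplot installed, a rough comparison can be sketched with the standard-library timeit module. The helper `best_time`, the chosen sizes, and the repeat counts below are arbitrary assumptions, not part of the original benchmark, and absolute timings will vary by machine.

```python
import timeit
import numpy as np
from itertools import permutations

def onecold(a):
    n = len(a)
    b = np.concatenate((a, a[:-1]))
    s = b.strides[0]  # stride of the buffer that is actually being viewed
    return np.lib.stride_tricks.as_strided(b[1:], shape=(n - 1, n), strides=(s, s))

def combinations_without_repeat(a):
    n = len(a)
    out = np.empty((n, n - 1, 2), dtype=a.dtype)
    out[:, :, 0] = np.broadcast_to(a[:, None], (n, n - 1))
    out.shape = (n - 1, n, 2)
    out[:, :, 1] = onecold(a)
    out.shape = (-1, 2)
    return out

def itertools_permutations(a):
    return np.array(list(permutations(a, 2)))

def best_time(func, arg, repeat=3, number=3):
    # Hypothetical helper: best-of-`repeat` average seconds per call.
    timer = timeit.Timer(lambda: func(arg))
    return min(timer.repeat(repeat=repeat, number=number)) / number

sizes = (10, 100, 300)  # illustrative dataset sizes
timings = {}
for n in sizes:
    data = np.random.rand(n)
    timings[n] = (best_time(combinations_without_repeat, data),
                  best_time(itertools_permutations, data))
    print(n, timings[n])
```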

Performance figures: [image not preserved]
