Question

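I have three NumPy arrays of integers, each with the same number of columns and an arbitrary number of rows. I am interested in all cases where a row of the first array plus a row of the second gives a row of the third ([3, 1, 4] + [1, 5, 9] = [4, 6, 13]).

Here is some pseudocode: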

for i, j in rows(array1), rows(array2):
    if i + j is in rows(array3):
        somehow store the rows this occurred at (e.g. (1,2,5) if 1st row of
        array1 + 2nd row of array2 give 5th row of array3)

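I will need to run this on very large matrices, so I have two questions:

(1) I can write the above using nested loops, but is there a quicker way, perhaps list comprehensions or itertools?

(2) What is the fastest and most memory-efficient way to store the triples? Later I will need to create a heatmap using two of the indices as coordinates and the first one as the corresponding value, e.g. point (2,5) has value 1 in the pseudocode example.

I would be very grateful for any tips. I know this sounds quite simple, but it needs to run fast and I have very little experience with optimization.

Edit: my ugly code, as requested in the comments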

import numpy as np

#random arrays
A = np.array([[-1,0],[0,-1],[4,1], [-1,2]])
B = np.array([[1,2],[0,3],[3,1]])
C = np.array([[0,2],[2,3]])

#triples stored as numbers with 2 coordinates in a otherwise-zero matrix
output_matrix = np.zeros((B.shape[0], C.shape[0]), dtype = int)
for i in range(A.shape[0]):
    for j in range(B.shape[0]):
        for k in range(C.shape[0]):
            if np.array_equal((A[i,] + B[j,]), C[k,]):
                output_matrix[j, k] = i+1

print(output_matrix)

Answer 1

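We can leverage broadcasting to perform all of the summations and comparisons in a vectorized manner, then use np.where on the result to get the indices of the matching rows, and finally index into the output array and assign: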

output_matrix = np.zeros((B.shape[0], C.shape[0]), dtype = int)

mask = ((A[:,None,None,:] + B[None,:,None,:]) == C).all(-1)

I,J,K = np.where(mask)
output_matrix[J,K] = I+1
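For reference, here is a self-contained sketch of the same broadcasting idea on the sample arrays from the question, with the intermediate shapes spelled out in comments. The second half is an assumption on my part rather than part of the answer above: the boolean intermediate holds |A| * |B| * |C| * k elements, which may not fit in memory for very large inputs, so a hypothetical helper `match_rows_chunked` (with a hypothetical `chunk` parameter) processes A a block of rows at a time.

```python
import numpy as np

A = np.array([[-1, 0], [0, -1], [4, 1], [-1, 2]])
B = np.array([[1, 2], [0, 3], [3, 1]])
C = np.array([[0, 2], [2, 3]])

# (4,1,1,2) + (1,3,1,2) -> (4,3,1,2); comparing with C (2,2) broadcasts to (4,3,2,2).
# .all(-1) requires every column to match, leaving a (4,3,2) mask over (i, j, k).
mask = ((A[:, None, None, :] + B[None, :, None, :]) == C).all(-1)
I, J, K = np.where(mask)

output_matrix = np.zeros((B.shape[0], C.shape[0]), dtype=int)
output_matrix[J, K] = I + 1
print(output_matrix)


def match_rows_chunked(A, B, C, chunk=1024):
    """Hypothetical helper: same matching, `chunk` rows of A at a time.

    Assumes A has at least one row.
    """
    I_parts, J_parts, K_parts = [], [], []
    for start in range(0, A.shape[0], chunk):
        block = A[start:start + chunk]
        m = ((block[:, None, None, :] + B[None, :, None, :]) == C).all(-1)
        i, j, k = np.where(m)
        I_parts.append(i + start)  # shift back to row numbers in the full A
        J_parts.append(j)
        K_parts.append(k)
    return (np.concatenate(I_parts), np.concatenate(J_parts),
            np.concatenate(K_parts))
```

As in the loop version from the question, a cell (j, k) of `output_matrix` can hold only one value of i, so the index arrays returned by np.where are the safer thing to keep if several rows of A can match the same pair.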

Answer 2

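(1) Improvements

You can use a set for the rows of the third matrix, since a + b = c must hold exactly. This already replaces one nested loop with a constant-time lookup. I will show you an example of how to do this below, but first we should introduce some notation.

For a set-based approach to work, we need a hashable type. Lists will therefore not work, but a tuple will: it is an ordered, immutable structure. There is, however, a problem: tuple addition is defined as concatenation, that is,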

(0, 1) + (1, 0) = (0, 1, 1, 0).

This will not do for our use case: we need element-wise addition. We therefore subclass the built-in tuple as follows,

class AdditionTuple(tuple):

    def __add__(self, other):
        """
        Element-wise addition.
        """
        if len(self) != len(other):
            raise ValueError("Undefined behaviour!")

        return AdditionTuple(self[idx] + other[idx]
                             for idx in range(len(self)))

where we override the default behaviour of __add__. Now that we have a data type suited to our problem, let's prepare the data.
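As a quick check of the behaviour described above, the following sketch repeats the class definition so that it runs on its own, then compares plain tuple addition with `AdditionTuple` addition and confirms that the result is still hashable and compares equal to a plain tuple, which is what the dictionary lookup later relies on.

```python
class AdditionTuple(tuple):
    def __add__(self, other):
        """Element-wise addition."""
        if len(self) != len(other):
            raise ValueError("Undefined behaviour!")
        return AdditionTuple(self[idx] + other[idx] for idx in range(len(self)))


plain = (0, 1) + (1, 0)                                  # concatenation
summed = AdditionTuple((0, 1)) + AdditionTuple((1, 0))   # element-wise

print(plain)    # (0, 1, 1, 0)
print(summed)   # (1, 1)

# The subclass inherits tuple's __eq__ and __hash__, so it matches plain-tuple keys.
lookup = {(1, 1): "found"}
print(summed in lookup)  # True
```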

You give us,

A = [[-1, 0], [0, -1], [4, 1], [-1, 2]]
B = [[1, 2], [0, 3], [3, 1]]
C = [[0, 2], [2, 3]]

to work with. I say,

from types import SimpleNamespace

A = [AdditionTuple(item) for item in A]
B = [AdditionTuple(item) for item in B]
C = {tuple(item): SimpleNamespace(idx=idx, values=[])
     for idx, item in enumerate(C)}

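That is, we convert A and B to our new data type, and turn C into a dictionary, which supports (amortised) O(1) lookups.

We can now do the following, eliminating one loop altogether,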

from itertools import product

for a, b in product(enumerate(A), enumerate(B)):
    idx_a, a_i = a
    idx_b, b_j = b

    if a_i + b_j in C:  # a_i + b_j == c_k, identically
        C[a_i + b_j].values.append((idx_a, idx_b))

Then,

>>> print(C)
{(2, 3): namespace(idx=1, values=[(3, 2)]), (0, 2): namespace(idx=0, values=[(0, 0), (1, 1)])}

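For each value in C, you get the index of that value (as idx), and a list of tuples (idx_a, idx_b) such that the elements of A and B at those indices sum to the value of C at idx.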
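To connect this back to the matrix in the question, here is a self-contained sketch that collects (i, j, k) triples and then fills `output_matrix`. Purely to keep the example short, it uses plain tuples and `zip` for the element-wise sum instead of the `AdditionTuple` class; the names `c_index` and `triples` are my own. Like the dictionary above, it assumes the rows of C are distinct, since a repeated row would overwrite an earlier key.

```python
from itertools import product

import numpy as np

A = [[-1, 0], [0, -1], [4, 1], [-1, 2]]
B = [[1, 2], [0, 3], [3, 1]]
C = [[0, 2], [2, 3]]

# Map each row of C to its index; assumes the rows of C are distinct.
c_index = {tuple(row): k for k, row in enumerate(C)}

# Collect (i, j, k) triples with A[i] + B[j] == C[k].
triples = []
for (i, a), (j, b) in product(enumerate(A), enumerate(B)):
    total = tuple(x + y for x, y in zip(a, b))
    k = c_index.get(total)
    if k is not None:
        triples.append((i, j, k))

print(triples)  # [(0, 0, 0), (1, 1, 0), (3, 2, 1)]

# Same layout as the question: rows index B, columns index C, value is i + 1.
output_matrix = np.zeros((len(B), len(C)), dtype=int)
for i, j, k in triples:
    output_matrix[j, k] = i + 1
print(output_matrix)
```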

Let us briefly analyse the complexity of this algorithm. Redefining the lists A, B, and C as above is linear in the length of the lists. Iterating over A and B is of course in O(|A| * |B|), and the nested condition computes the element-wise addition of the tuples: this is linear in the length of the tuples themselves, which we shall denote k. The whole algorithm then runs in O(k * |A| * |B|).

This is a substantial improvement over your current O(k * |A| * |B| * |C|) algorithm.

(2) Matrix plotting

Use a dok_matrix, a sparse SciPy matrix representation. You can then use any heatmap-plotting library you like on the matrix, e.g. Seaborn's heatmap.
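A minimal sketch of what this could look like, assuming the (i, j, k) triples for the sample data have already been computed and that the layout from the question is wanted (rows index B, columns index C, value is the 1-based row of A). The variable names are illustrative, and the conversion with `toarray()` before plotting is a cautious assumption about what the plotting library accepts.

```python
import numpy as np
from scipy.sparse import dok_matrix

# (i, j, k) triples with A[i] + B[j] == C[k], for the sample data in the question.
triples = [(0, 0, 0), (1, 1, 0), (3, 2, 1)]
n_rows_B, n_rows_C = 3, 2

# Only the non-zero cells are stored, which helps when matches are rare.
heat = dok_matrix((n_rows_B, n_rows_C), dtype=np.int64)
for i, j, k in triples:
    heat[j, k] = i + 1  # store the 1-based row of A, as in the question

print(heat.toarray())

# Plotting, assuming seaborn and matplotlib are installed:
# import seaborn as sns
# sns.heatmap(heat.toarray(), annot=True)
```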
