Question

I want to find the largest submatrix of a matrix that contains only negative numbers. For example:

In

[[1,  -9, -2,   8,  6,  1],    
 [8,  -1,-11,  -7,  6,  4],    
 [10, 12, -1,  -9, -12, 14],    
 [8, 10, -3,  -5,  17,  8],    
 [6,  4, 10, -13, -16, 19]]

the largest submatrix containing only negative numbers is

[[-11, -7],
 [-1, -9],
 [-3,-5]]

(top-left coordinate: 1,2; bottom-right coordinate: 3,3).

What is the most efficient way to do this?

Answer 1

A brute-force solution. It works, but may be considered too slow for larger matrices:

mOrig = [[1,  -9, -2,   8,  6,  1],
    [8,  -1,-11,  -7,  6,  4],
    [10, 12, -1,  -9, -12, 14],
    [8, 10, -3,  -5,  17,  8],
    [6,  4, 10, -13, -16, 19]]

# reduce the problem
# now we have a matrix that contains only 0 and 1
# at the place where there was a negative number
# there is now a 1 and at the places where a positive
# number had been there is now a 0. With x < 0, zeros are
# treated as non-negative; if you want zeros to count as
# negative, change the x < 0 to x <= 0.
m = [[1 if x < 0 else 0 for x in z] for z in mOrig]

# now we have the problem to find the biggest submatrix
# consisting only 1s.

# first a function that checks if a submatrix only contains 1s
def containsOnly1s(m, x1, y1, x2, y2):
    for i in range(x1, x2):
        for j in range(y1, y2):
            if m[i][j] == 0:
                return False
    return True

def calculateSize(x1, y1, x2, y2):
    return (x2 - x1) * (y2 - y1)

best = (-1, -1, -1, -1, -1)
for x1 in range(len(m)):
    for y1 in range(len(m[0])):
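        # x2 and y2 are exclusive end indices, so they must run up to
        # len(m) and len(m[0]) for the last row and column to be reachable
        for x2 in range(x1 + 1, len(m) + 1):
            for y2 in range(y1 + 1, len(m[0]) + 1):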
                if containsOnly1s(m, x1, y1, x2, y2):
                    sizeOfSolution = calculateSize(x1, y1, x2, y2)
                    if best[4] < sizeOfSolution:
                        best = (x1, y1, x2, y2, sizeOfSolution)

for x in range(best[0], best[2]):
    print("\t".join([str(mOrig[x][y]) for y in range(best[1], best[3])]))

This prints

-11 -7
-1  -9
-3  -5

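If "largest submatrix" is meant in some other sense, the only function that needs to change is the one that computes the size of a submatrix: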

def calculateSize(x1, y1, x2, y2):
    return (x2 - x1) * (y2 - y1)

As written, it measures size as the area (rows times columns).
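
As an illustration of swapping the metric, here is a minimal sketch of an alternative size function. The name calculateSizeSquare is hypothetical and not part of the answer above; it assumes the same convention, where x2 and y2 are exclusive end indices. It scores a rectangle by its shorter side, so the search would favour the largest square-like block instead of the largest area.

```python
def calculateSizeSquare(x1, y1, x2, y2):
    # Hypothetical alternative metric: length of the shorter side.
    # x2 and y2 are exclusive end indices, as in calculateSize above.
    return min(x2 - x1, y2 - y1)

# A block of 3 rows by 2 columns scores 2 under this metric
print(calculateSizeSquare(1, 2, 4, 4))
```

To try it, call calculateSizeSquare in place of calculateSize inside the search loop.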

Edit 1: a first speedup

best = (-1, -1, -1, -1, -1)
for x1 in range(len(m)):
    for y1 in range(len(m[0])):
        if m[x1][y1] == 1: # The starting point must contain a 1
            for x2 in range(x1 + 1, len(m) + 1): # actually can start with x1 + 1 here; end indices are exclusive
                for y2 in range(y1 + 1, len(m[0]) + 1):
                    if containsOnly1s(m, x1, y1, x2, y2):
                        sizeOfSolution = calculateSize(x1, y1, x2, y2)
                        if best[4] < sizeOfSolution:
                            best = (x1, y1, x2, y2, sizeOfSolution)
                    else:
                        # There is at least one 0 in the matrix, so every greater
                        # matrix will also contain this 0
                        break
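
Each call to containsOnly1s rescans the whole candidate rectangle. One common way to make that check constant-time is a 2D prefix-sum (summed-area) table. This is a swapped-in technique, not something from the answer above, and the names buildPrefix and containsOnly1sFast are hypothetical. A minimal sketch, assuming x2 and y2 are exclusive end indices as in the loops above:

```python
def buildPrefix(m):
    # p[i][j] holds the number of 1s in rows 0..i-1, columns 0..j-1 of m
    rows, cols = len(m), len(m[0])
    p = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            p[i + 1][j + 1] = m[i][j] + p[i][j + 1] + p[i + 1][j] - p[i][j]
    return p

def containsOnly1sFast(p, x1, y1, x2, y2):
    # Count the 1s in rows x1..x2-1, columns y1..y2-1 with four lookups
    ones = p[x2][y2] - p[x1][y2] - p[x2][y1] + p[x1][y1]
    return ones == (x2 - x1) * (y2 - y1)

# 0/1 matrix derived from the example in the question
m = [[0, 1, 1, 0, 0, 0],
     [0, 1, 1, 1, 0, 0],
     [0, 0, 1, 1, 1, 0],
     [0, 0, 1, 1, 0, 0],
     [0, 0, 0, 1, 1, 0]]
p = buildPrefix(m)
print(containsOnly1sFast(p, 1, 2, 4, 4))  # rows 1-3, columns 2-3: True
print(containsOnly1sFast(p, 0, 0, 2, 2))  # rows 0-1, columns 0-1: False
```

The table costs one pass over the matrix to build; the four nested loops remain, so this only removes the inner rescan.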

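Edit 2

Well, after converting the matrix into a matrix of 0s and 1s (as I do with the line m = [[1 if x < 0 else 0 for x in z] for z in mOrig]), the problem is the same as what the literature calls the maximal rectangle problem. So I did some googling for known algorithms for this kind of problem and found this article, which describes a fast algorithm for it: http://www.drdobbs.com/database/the-maximal-rectangle-problem/184410529. To summarize its main point: the algorithm exploits the structure of the problem. It uses a stack to remember the profile of the rectangles that are currently open, so that the width can be recomputed when a narrower rectangle is reused as a wider one is closed.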
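
The stack idea can be sketched as follows. This is not taken from the linked article; it is the common row-by-row "histogram plus stack" formulation of the maximal rectangle problem, and the name largestRectangle is hypothetical. It assumes the 0/1 matrix m built as above and returns inclusive corner coordinates, matching the convention used in the question.

```python
def largestRectangle(m):
    # m is a 0/1 matrix. Returns (area, top, left, bottom, right) with
    # inclusive corners, or (0, -1, -1, -1, -1) if m contains no 1s.
    if not m or not m[0]:
        return (0, -1, -1, -1, -1)
    cols = len(m[0])
    heights = [0] * cols
    best = (0, -1, -1, -1, -1)
    for row, line in enumerate(m):
        # heights[c] = number of consecutive 1s in column c ending at this row
        for c in range(cols):
            heights[c] = heights[c] + 1 if line[c] else 0
        stack = []  # open rectangles as (start column, height), heights increasing
        for c in range(cols + 1):
            h = heights[c] if c < cols else 0  # a final 0 closes everything
            start = c
            while stack and stack[-1][1] >= h:
                # Close the taller rectangle; the shorter one inherits its start
                start, height = stack.pop()
                area = height * (c - start)
                if area > best[0]:
                    best = (area, row - height + 1, start, row, c - 1)
            if h > 0:
                stack.append((start, h))
    return best

mOrig = [[1,  -9, -2,   8,  6,  1],
         [8,  -1, -11, -7,  6,  4],
         [10, 12, -1,  -9, -12, 14],
         [8,  10, -3,  -5,  17,  8],
         [6,  4,  10, -13, -16, 19]]
m = [[1 if x < 0 else 0 for x in z] for z in mOrig]
print(largestRectangle(m))  # (6, 1, 2, 3, 3)
```

Each column is pushed and popped at most once per row, so the work is proportional to the number of cells, compared with the nested loops of the brute-force version.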

Answer 2

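Here is my very fast solution using OpenCV convolution. It uses float32 because that is much faster than integer types. On my 2 cores, a 1000 x 1000 matrix takes 135 ms. There is still room for further code optimization.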

import cv2
import numpy as np

data = """1 -9 -2 8 6 1
8 -1 -11 -7 6 4
10 12 -1 -9 -12 14
8 10 -3 -5 17 8
6 4 10 -13 -16 19"""

# matrix = np.random.randint(-128, 128, (1000, 1000), dtype=np.int32)
matrix = np.int32([line.split() for line in data.splitlines()])

def find_max_kernel(matrix, border=cv2.BORDER_ISOLATED):
    max_area = 0
    mask = np.float32(matrix < 0)
    ones = np.ones_like(mask)
    conv_x = np.zeros_like(mask)
    conv_y = np.zeros_like(mask)
    max_h, max_w = mask.shape
    for h in range(1, max_h + 1):
        cv2.filter2D(mask, -1, ones[:h, None, 0], conv_y, (0, 0), 0, border)
        for w in range(1, max_w + 1):
            area = h * w
            if area > max_area:
                cv2.filter2D(conv_y, -1, ones[None, 0, :w], conv_x, (0, 0), 0, border)
                if conv_x.max() == area:
                    max_area, shape = area, (h, w)
                else:
                    if w == 1:
                        max_h = h - 1
                    if h == 1:
                        max_w = w - 1
                    break
        if h >= max_h:
            break
    cv2.filter2D(mask, -1, np.ones(shape, np.float32), conv_x, (0, 0), 0, border)
    p1 = np.array(np.unravel_index(conv_x.argmax(), conv_x.shape))
    p2 = p1 + shape - 1            
    return p1, p2

print(*find_max_kernel(matrix), sep='\n')

Answer 3

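Below is a function that runs in under a second on a 5000x5000 matrix. It relies only on basic np functions.

It could be improved by returning only the first index instead of all indices. Several other optimizations are possible, but it is fast enough for many uses.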

from numpy import roll, where
def gidx(X):
    Wl = X & roll(X, 1, axis=1)
    T = X & Wl & roll(X, -1, axis=1)
    if T[1:-1, 1:-1].any():  # check the interior only (2D slice)
        N = T & roll(T, -1, axis=0) & roll(T, 1, axis=0)
        if N.any(): return gidx(N)
    W = Wl & roll(Wl, 1, axis=0)
    if W.any(): return where(W)
    return where(X)

#%% Example
import numpy as np
#np.random.seed(0)
M = 100
X = np.random.randn(M, M) - 2

X0 = (X < 0)
X0[[0, -1]], X0[:, [0, -1]] = False, False
jx, kx = gidx(X0)
