Question

Given a matrix, for example:

[[2 5 3 8 3]
 [1 4 6 8 4]
 [3 6 7 9 5]
 [1 3 6 4 2]
 [2 6 4 3 1]]

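...how can you find the largest submatrix (i.e. the one containing the most values) in which every row is sorted and every column is sorted?

In the example above, the solution is the submatrix (1,0)-(2,3):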

  1 4 6 8     
  3 6 7 9

Its size is 8.

Answer 1

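You could use recursion to get the largest area that fits below a given row segment, where that segment has itself already been verified to be a non-decreasing sequence of values. The area found is guaranteed to stay within the column range of the given row segment, but it may be narrower and may span several rows below the given row.

The returned area can then be extended one row upwards, keeping the width it already has. If the segment cannot be made wider, we have then found the largest area that can be formed by combining a subsequence of that segment (or the whole segment) with the rows below it.

By keeping the best of the results retrieved for all segments in all rows, we arrive at a solution.

To avoid repeating recursive calculations that were already done for exactly the same segment, memoisation can be used (dynamic programming).

Here is the suggested code: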

from collections import namedtuple
Area = namedtuple('Area', 'start_row_num start_col_num end_row_num end_col_num size')
EMPTY_AREA = Area(0,0,0,0,0)

def greatest_sub(matrix):
    memo = {}

    # Function that will be called recursively
    def greatest_downward_extension(row_num, start_col_num, end_col_num, depth=0):
        # Exit if the described segment has no width
        if end_col_num <= start_col_num:
            return EMPTY_AREA
        next_row_num = row_num + 1
        # Use memoisation:
        #   The segment's attributes form a tuple that serves as memoisation key
        segment_id = (row_num, start_col_num, end_col_num)
        if segment_id in memo:
            return memo[segment_id]
        # This segment without additional rows is currently the best we have:
        best = Area(row_num, start_col_num, next_row_num, end_col_num,
                    end_col_num - start_col_num)
        if next_row_num >= len(matrix):
            return best
        next_row = matrix[next_row_num]
        row = matrix[row_num]
        prev_val = -float('inf')
        for col_num in range(start_col_num, end_col_num + 1):
            # Detect interruption in increasing series, 
            #    either vertically (1) or horizontally (0)
            status = (1 if col_num >= end_col_num or next_row[col_num] < row[col_num]
                    else (0 if next_row[col_num] < prev_val
                    else -1))
            if status >= 0: # There is an interruption: stop segment
                # Find largest area below current row segment, within its column range
                result = greatest_downward_extension(next_row_num,
                                                     start_col_num, col_num)
                # Get column range of found area and add that range from the current row
                size = result.size + result.end_col_num - result.start_col_num
                if size > best.size:
                    best = Area(row_num, result.start_col_num, 
                                result.end_row_num, result.end_col_num, size)
                if col_num >= end_col_num:
                    break
                # When the interruption was vertical, the next segment can only start
                #    at the next column (status == 1)
                start_col_num = col_num + status
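            # Remember the value in the next row, as that is the row being
            #    checked for horizontal interruptions
            prev_val = next_row[col_num]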
        memo[segment_id] = best
        return best

    # For each row identify the segments with non-decreasing values
    best = EMPTY_AREA
    for row_num, row in enumerate(matrix):
        prev_val = -float('inf')
        start_col_num = 0
        for end_col_num in range(start_col_num, len(row) + 1):
            # When value decreases (or we reached the end of the row), 
            #   the segment ends here
            if end_col_num >= len(row) or row[end_col_num] < prev_val:
                # Find largest area below current row segment, within its column range
                result = greatest_downward_extension(row_num, start_col_num, end_col_num)
                if result.size > best.size:
                    best = result
                if end_col_num >= len(row):
                    break
                start_col_num = end_col_num
            prev_val = row[end_col_num]
    return best

# Sample call    
matrix = [
    [2, 5, 3, 8, 3],
    [1, 4, 6, 8, 4],
    [3, 6, 7, 9, 5],
    [1, 3, 6, 4, 2],
    [2, 6, 4, 3, 1]]

result = greatest_sub(matrix)
print(result)

The output for the sample data will be:

Area(start_row_num=1, start_col_num=0, end_row_num=3, end_col_num=4, size=8)
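
Because the recursion above keeps only the single largest area found below each segment, it can pass over a wider, shallower region, so an exhaustive cross-check on small inputs is a useful safeguard. The following is a minimal sketch of such a check, not part of the suggested code: the names `is_sorted_block` and `brute_force_greatest_sub` are made up for illustration, bounds are half-open like those of `Area`, and it assumes a non-empty rectangular matrix.

```python
from itertools import combinations

def is_sorted_block(matrix, top, left, bottom, right):
    """True if rows top..bottom-1 and columns left..right-1 are non-decreasing
    along every row and down every column."""
    for r in range(top, bottom):
        for c in range(left, right):
            if c + 1 < right and matrix[r][c] > matrix[r][c + 1]:
                return False
            if r + 1 < bottom and matrix[r][c] > matrix[r + 1][c]:
                return False
    return True

def brute_force_greatest_sub(matrix):
    """Try every rectangle; return (size, top, left, bottom, right)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    best = (0, 0, 0, 0, 0)
    for top, bottom in combinations(range(n_rows + 1), 2):
        for left, right in combinations(range(n_cols + 1), 2):
            size = (bottom - top) * (right - left)
            # Only validate rectangles that would beat the current best
            if size > best[0] and is_sorted_block(matrix, top, left, bottom, right):
                best = (size, top, left, bottom, right)
    return best

sample = [
    [2, 5, 3, 8, 3],
    [1, 4, 6, 8, 4],
    [3, 6, 7, 9, 5],
    [1, 3, 6, 4, 2],
    [2, 6, 4, 3, 1]]
print(brute_force_greatest_sub(sample))  # (8, 1, 0, 3, 4)
```

A handy case to compare against is `[[0, 0, 0, 0, 0], [1, 1, 1, 1, 1], [2, 2, 0, 0, 0], [3, 3, 0, 0, 0]]`, where the exhaustive search reports the top two full rows (size 10) even though a narrower 3-row block sits below the first row.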

Answer 2

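One approach, which it sounds like you have already tried, is brute-force recursion: check the entire matrix first, then smaller and smaller regions by area until you find one that works. The amount of work depends on the order: checking from the smallest regions to the largest means you have to check every combination no matter what, while checking from largest to smallest still ends up testing a very large number of cases.

Another approach is to create two matrices with the same dimensions as the original, where each slot represents the gap between two numbers, and the first slot in each row or column represents the gap before the first number in that row or column. Fill the first matrix with 1s and 0s to indicate whether a sorted matrix can be formed vertically (a 1 means the number below the gap is greater than the number above it), and fill the second matrix with 1s and 0s for the same condition horizontally. For each position, apply AND(a, b) (a binary operation where only 1 1 maps to 1) to build a single matrix that is essentially AND(matrix1, matrix2), and then find the largest rectangle of 1s in that matrix.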

Example matrix (kept small for simplicity and convenience):

[ 1 2 5 ]
[ 4 9 2 ]
[ 3 6 4 ]

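Vertical matrix: a 1 in position L means that the number in position L is greater than the number directly above L, or that L is at the top of its column (braces mark the first row, which always satisfies the vertical condition).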

{ 1 1 1 } 
[ 1 1 0 ]
[ 0 0 1 ]

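Horizontal matrix: a 1 in position L means that the number in position L is greater than the number directly to the left of L, or that L is at the front of its row (the leftmost position). Braces again mark the first column, which always satisfies the horizontal condition.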

{1} [ 1 1 ]
{1} [ 1 0 ]
{1} [ 1 0 ]

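Vertical AND horizontal (you can skip the vertical-only and horizontal-only steps and do this in one pass: for each cell, put in a 0 if the number is smaller than the number directly to its left or directly above it, otherwise put in a 1):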

[ 1 1 1 ]
[ 1 1 0 ]
[ 0 0 0 ]
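
The three matrices above can be reproduced with a few list comprehensions. This is only a minimal sketch: the function name `sortedness_flags` is made up for illustration, and it uses `>=` so that equal neighbours count as sorted, which is an assumption (the description says "greater than"; the example has no ties, so the result is the same either way).

```python
def sortedness_flags(matrix):
    """Return (vertical, horizontal, combined) 0/1 matrices for `matrix`."""
    rows, cols = len(matrix), len(matrix[0])
    # 1 if the cell is in the top row or is >= the cell directly above it
    vertical = [[int(r == 0 or matrix[r][c] >= matrix[r - 1][c])
                 for c in range(cols)] for r in range(rows)]
    # 1 if the cell is in the leftmost column or is >= the cell to its left
    horizontal = [[int(c == 0 or matrix[r][c] >= matrix[r][c - 1])
                   for c in range(cols)] for r in range(rows)]
    # Element-wise AND of the two flag matrices
    combined = [[v & h for v, h in zip(v_row, h_row)]
                for v_row, h_row in zip(vertical, horizontal)]
    return vertical, horizontal, combined

example = [[1, 2, 5],
           [4, 9, 2],
           [3, 6, 4]]
vertical, horizontal, combined = sortedness_flags(example)
for name, flags in (("vertical", vertical),
                    ("horizontal", horizontal),
                    ("combined", combined)):
    print(name, flags)
```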

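The largest rectangle of 1s in this matrix has the same indices as the largest sorted rectangle in the original. Finding the largest rectangle in a matrix of 1s and 0s is much easier.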
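
Here is a rough sketch of that rectangle search, with one subtlety handled explicitly: the top row of a candidate rectangle does not need to pass the vertical test, and its leftmost column does not need to pass the horizontal test, so scanning the combined matrix for all-ones rectangles alone can under-count. This version therefore tracks the two conditions separately for every pair of top and bottom rows. The name `largest_sorted_rectangle` is hypothetical, `<` comparisons treat equal neighbours as sorted (an assumption), and the running time is roughly rows² × columns.

```python
def largest_sorted_rectangle(matrix):
    """Return (size, top, left, bottom, right) with half-open bounds."""
    rows, cols = len(matrix), len(matrix[0])
    best = (0, 0, 0, 0, 0)
    for top in range(rows):
        # col_ok[c]: column c is non-decreasing from row `top` down to `bottom`
        col_ok = [True] * cols
        # gap_ok[c]: matrix[r][c-1] <= matrix[r][c] for every row top..bottom
        gap_ok = [True] * cols
        for bottom in range(top, rows):
            for c in range(cols):
                if bottom > top and matrix[bottom][c] < matrix[bottom - 1][c]:
                    col_ok[c] = False
                if c > 0 and matrix[bottom][c] < matrix[bottom][c - 1]:
                    gap_ok[c] = False
            height = bottom - top + 1
            left = None  # start column of the current run, if any
            for c in range(cols + 1):
                usable = c < cols and col_ok[c]
                # A run ends at an unusable column or at a broken gap
                if left is not None and (not usable or not gap_ok[c]):
                    size = height * (c - left)
                    if size > best[0]:
                        best = (size, top, left, bottom + 1, c)
                    left = None
                # A usable column starts a new run; its left gap is irrelevant
                if usable and left is None:
                    left = c
    return best

print(largest_sorted_rectangle([[1, 2, 5],
                                [4, 9, 2],
                                [3, 6, 4]]))  # (4, 0, 0, 2, 2)
```

For the small example this picks rows 0-1, columns 0-1, which is the same block that the 1s in the combined matrix point to.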

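Hope this helps! I know I haven't explained this very clearly, but the general idea should be useful. It is very similar to the idea you presented about comparing all i and i-1 numbers. Let me know if it would help for me to work through this for the sample matrix you provided.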
