Question

I came across the following code snippet:

output = numpy.zeros((max(img1.shape[0], img2.shape[0]), img1.shape[1] + img2.shape[1], img1.shape[2]), dtype=img1.dtype)
output[:img1.shape[0], :img1.shape[1],:] = img1
output[:img2.shape[0]:,img1.shape[1]:img1.shape[1]+img2.shape[1],:] = img2

I understand the first line:

output = numpy.zeros((max(img1.shape[0], img2.shape[0]), img1.shape[1] + img2.shape[1], img1.shape[2]), dtype=img1.dtype)

However, I can't work out what the following two lines mean:

output[:img1.shape[0], :img1.shape[1],:] = img1
output[:img2.shape[0]:,img1.shape[1]:img1.shape[1]+img2.shape[1],:] = img2

Any ideas?

Thanks for your help.

Answer 1

As I understand it, img1 and img2 are matrices holding all the pixels of two images.

Suppose:

img1 = x x x
       x x x
       x x x

img2 = o o
       o o
       o o
       o o

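output is a matrix whose height is the larger of the heights of img1 and img2, and whose width is the sum of the two widths. I don't know whether the depth matters here, but it takes the depth (z axis) of the first image. Right after it is created, output is: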

output = 0 0 0 0 0
         0 0 0 0 0
         0 0 0 0 0
         0 0 0 0 0
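
As a rough check of that shape calculation, here is a minimal sketch. The two arrays are made-up stand-ins for real images (3x3 and 4x2 pixels with 3 channels each), not data from the question:

```python
import numpy as np

# Hypothetical stand-ins for the two images, shaped (height, width, channels)
img1 = np.ones((3, 3, 3), dtype=np.uint8)      # 3 rows, 3 columns
img2 = np.full((4, 2, 3), 2, dtype=np.uint8)   # 4 rows, 2 columns

height = max(img1.shape[0], img2.shape[0])     # the taller image decides the height
width = img1.shape[1] + img2.shape[1]          # the widths add up
output = np.zeros((height, width, img1.shape[2]), dtype=img1.dtype)

print(output.shape)  # (4, 5, 3)
```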

The first step is to store img1 in output. It fills the indices from 0 up to img1.height on the y axis and from 0 up to img1.width on the x axis.

output[:img1.shape[0], :img1.shape[1],:] = img1

output = x x x 0 0
         x x x 0 0
         x x x 0 0
         0 0 0 0 0

Next, img2 is stored from 0 up to img2.height on the y axis and from img1.width up to img1.width + img2.width on the x axis.

So:

output[:img2.shape[0]:,img1.shape[1]:img1.shape[1]+img2.shape[1],:] = img2

output = x x x o o
         x x x o o
         x x x o o
         0 0 0 o o
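
The two steps can be reproduced with small 2D arrays. This is only a sketch: the values 1 and 2 stand in for the x and o pixels above, and the names h1, w1, h2, w2 are introduced here for readability:

```python
import numpy as np

img1 = np.full((3, 3), 1)   # plays the role of the "x" image
img2 = np.full((4, 2), 2)   # plays the role of the "o" image
h1, w1 = img1.shape
h2, w2 = img2.shape

output = np.zeros((max(h1, h2), w1 + w2), dtype=img1.dtype)
output[:h1, :w1] = img1          # rows 0..h1-1, columns 0..w1-1
output[:h2, w1:w1 + w2] = img2   # rows 0..h2-1, columns w1..w1+w2-1
print(output)
# [[1 1 1 2 2]
#  [1 1 1 2 2]
#  [1 1 1 2 2]
#  [0 0 0 2 2]]
```

Note that the extra colon in `:img2.shape[0]:` in the original code only leaves the slice step empty, so it selects the same rows as `:img2.shape[0]`.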

I think the same is done along the z axis as well, provided both images carry data on that axis.
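
To illustrate the z axis, here is a sketch with two made-up 3-channel arrays (the values 10 and 20 are arbitrary). The trailing `:` in each index selects every channel, so each channel ends up with the same side-by-side layout:

```python
import numpy as np

# Hypothetical 3-channel images, shaped (height, width, channels)
img1 = np.full((3, 3, 3), 10, dtype=np.uint8)
img2 = np.full((4, 2, 3), 20, dtype=np.uint8)

output = np.zeros((max(img1.shape[0], img2.shape[0]),
                   img1.shape[1] + img2.shape[1],
                   img1.shape[2]), dtype=img1.dtype)
output[:img1.shape[0], :img1.shape[1], :] = img1
output[:img2.shape[0], img1.shape[1]:img1.shape[1] + img2.shape[1], :] = img2

# Print each channel separately; all of them show the same layout
for c in range(output.shape[2]):
    print(output[:, :, c])
```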

Answer 2

It places the two images side by side. An example without the third dimension:

import numpy as np

img1 = np.array([[1, 1], [1, 1]])
img2 = np.array([[2, 2, 2], [2, 2, 2], [2, 2, 2]])
output = np.zeros((max(img1.shape[0], img2.shape[0]), img1.shape[1] + img2.shape[1]), dtype=img1.dtype)
output[:img1.shape[0], :img1.shape[1]] = img1
output[:img2.shape[0]:, img1.shape[1]:img1.shape[1] + img2.shape[1]] = img2
print(output)

Output:

[[1 1 2 2 2]
 [1 1 2 2 2]
 [0 0 2 2 2]]

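Since you are using numpy, I would suggest np.hstack or np.vstack. Note that np.hstack needs both images to have the same height, so the shorter one has to be padded first.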
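
A possible np.hstack version is sketched below. The helper pad_to_height is hypothetical (it is not part of numpy); it uses np.pad to append zero rows so that both arrays reach the same height before stacking:

```python
import numpy as np

def pad_to_height(img, height):
    """Hypothetical helper: append zero rows at the bottom until img has `height` rows."""
    extra = height - img.shape[0]
    # Pad only the first axis (rows); leave every other axis untouched
    pad_width = [(0, extra)] + [(0, 0)] * (img.ndim - 1)
    return np.pad(img, pad_width, mode="constant")

img1 = np.array([[1, 1], [1, 1]])
img2 = np.array([[2, 2, 2], [2, 2, 2], [2, 2, 2]])

height = max(img1.shape[0], img2.shape[0])
output = np.hstack([pad_to_height(img1, height), pad_to_height(img2, height)])
print(output)
# [[1 1 2 2 2]
#  [1 1 2 2 2]
#  [0 0 2 2 2]]
```

This gives the same result as the slice assignments; which form reads better is largely a matter of taste.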
