Question

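I have a text detector that outputs the polygon coordinates of the detected text.

I am using the loop below to display the detected text with bounding boxes: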

for i in range(0, num_box):
    pts = np.array(boxes[0][i],np.int32)
    pts = pts.reshape((-1,1,2))
    print(pts)
    print('\n')
    img2 = cv2.polylines(img,[pts],True,(0,255,0),2)
return img2

Each pts holds all the polygon coordinates for one detected text box:

pts = 

[[[509 457]]

 [[555 457]]

 [[555 475]]

 [[509 475]]]

I want to convert the region inside the bounding box described by pts to grayscale using:

gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

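But I am not sure how to supply the image argument in the call above, because I only want to convert the region described by pts to grayscale, not the entire image. I want the rest of the image to be white.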

Answer 1

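From my understanding, you want to convert the content of the bounding box to grayscale and set the rest of the image to white (background).

Here is my solution for that: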

import cv2
import numpy as np

# Some input image
image = cv2.imread('path/to/your/image.png')

# Some pts (int32, as cv2.polylines expects 32-bit integer coordinates)
pts = np.array([[60, 40], [340, 40], [340, 120], [60, 120]], dtype=np.int32)

# Get extreme x, y coordinates from box
# (assumes an axis-aligned box ordered top-left, top-right, bottom-right, bottom-left)
x1 = pts[0][0]
y1 = pts[0][1]
x2 = pts[1][0]
y2 = pts[2][1]

# Build output; initialize white background
image2 = 255 * np.ones(image.shape, np.uint8)
image2[y1:y2, x1:x2] = cv2.cvtColor(cv2.cvtColor(image[y1:y2, x1:x2], cv2.COLOR_BGR2GRAY), cv2.COLOR_GRAY2BGR)

# Show bounding box in original image
cv2.polylines(image, [pts], True, (0, 255, 0), 2)

cv2.imshow('image', image)
cv2.imshow('image2', image2)
cv2.waitKey(0)
cv2.destroyAllWindows()

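The main "trick" is to use OpenCV's cvtColor method twice, on the region of interest (ROI) of the image only: first to convert BGR to grayscale, then to convert grayscale back to BGR. The second conversion is needed because the output image has three channels, so a single-channel ROI cannot be assigned into it directly. Rectangular ROIs in Python OpenCV images can be accessed with proper NumPy array indexing and slicing. Most OpenCV functions (Python API) support operating on such ROIs only.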

Edit: If the final image is to be a plain grayscale image, the backward conversion can be omitted!

These are some outputs I generated with my "standard image":

Hope that helps!

Crop and convert polygon to grayscale
