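I am new to computer vision and am trying to learn. I have an image of letters and applied Otsu binarization to it, so that all of the actual content in the image is set to the same color (white, 255, in this case). Now I want to split the image into letters.
I want to traverse this image and turn each individual character (one sub-image per character) into a separate numpy array or a separate image, so that I can pass it to a model I built. Can you suggest how to achieve this, or is there an algorithm for it?
I thought of using loops, but that seems time-consuming.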
Answer 0 (score: 2)
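The following solution first splits the image into lines, then splits each line into words, and finally splits each word into individual characters.
Here is the full code: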
import cv2
import numpy as np

# Read the binarized image as grayscale (white text on a black background)
image = cv2.imread("stach.png", 0)
if image is None:
    raise FileNotFoundError("could not read stach.png")
cv2.imshow('orig', image)

# Dilate with a wide kernel so the characters on each line merge into one blob
kernel = np.ones((5, 100), np.uint8)
img_dilation = cv2.dilate(image, kernel, iterations=1)
cv2.imshow('dilated', img_dilation)
cv2.waitKey(0)

# Find contours; [-2:] works on OpenCV 3.x (3 return values) and 4.x (2 return values)
ctrs, hier = cv2.findContours(img_dilation.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]

# Sort line contours top to bottom
sorted_ctrs = sorted(ctrs, key=lambda ctr: cv2.boundingRect(ctr)[1])

for i, ctr in enumerate(sorted_ctrs):
    # Get bounding box of the line
    x, y, w, h = cv2.boundingRect(ctr)
    # Crop the line from the original image
    roi = image[y:y+h, x:x+w]
    cv2.imshow('segment no:' + str(i), roi)
    cv2.waitKey(0)

    # Upscale the line 4x, then threshold it
    im = cv2.resize(roi, None, fx=4, fy=4, interpolation=cv2.INTER_CUBIC)
    ret_1, thresh_1 = cv2.threshold(im, 127, 255, cv2.THRESH_BINARY_INV)
    cv2.imshow('Threshold_1', thresh_1)
    cv2.waitKey(0)
    # Invert in place so the text is white again before dilating
    cv2.bitwise_not(thresh_1, thresh_1)

    # A narrower kernel merges the characters of a word but keeps words apart
    kernel = np.ones((5, 30), np.uint8)
    words = cv2.dilate(thresh_1, kernel, iterations=1)
    cv2.imshow('words', words)
    cv2.waitKey(0)

    # Find word contours and sort them left to right
    ctrs_1, hier = cv2.findContours(words, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]
    sorted_ctrs_1 = sorted(ctrs_1, key=lambda ctr: cv2.boundingRect(ctr)[0])

    for j, ctr_1 in enumerate(sorted_ctrs_1):
        # Get bounding box of the word
        x_1, y_1, w_1, h_1 = cv2.boundingRect(ctr_1)
        # Crop the word from the thresholded line
        roi_1 = thresh_1[y_1:y_1+h_1, x_1:x_1+w_1]
        cv2.imshow('Line no: ' + str(i) + ' word no : ' + str(j), roi_1)
        cv2.waitKey(0)

        # A tall, 1-pixel-wide kernel dilates vertically only
        kernel = np.ones((10, 1), np.uint8)
        joined = cv2.dilate(roi_1, kernel, iterations=1)
        cv2.imshow('joined', joined)
        cv2.waitKey(0)

        # Find character contours and sort them left to right
        ctrs_2, hier = cv2.findContours(joined, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]
        sorted_ctrs_2 = sorted(ctrs_2, key=lambda ctr: cv2.boundingRect(ctr)[0])

        for k, ctr_2 in enumerate(sorted_ctrs_2):
            # Get bounding box of the character
            x_2, y_2, w_2, h_2 = cv2.boundingRect(ctr_2)
            # Crop the character from the word
            roi_2 = roi_1[y_2:y_2 + h_2, x_2:x_2 + w_2]
            cv2.imshow('Line no: ' + str(i) + ' word no : ' + str(j) + ' char no: ' + str(k), roi_2)
            cv2.waitKey(0)

cv2.destroyAllWindows()
Line segmentation is done first, using the following code:
kernel = np.ones((5,100), np.uint8)
img_dilation = cv2.dilate(image, kernel, iterations=1)
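The 5x100 kernel is used to separate the lines in the image. Because it is much wider than it is tall, the dilation smears the white pixels horizontally, so the characters on each text line merge into a single blob while the lines themselves stay apart.
After that, contours are extracted from the dilated image, and the bounding-box coordinates of each contour are applied to the original image. This crops out one image per text line.
Then, for each line, another kernel is applied to extract the words, using the same method that was used to extract the lines.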
kernel = np.ones((5, 30), np.uint8)
words = cv2.dilate(thresh_1, kernel, iterations=1)
Once the words are extracted, the characters in each word are extracted with the following loop:
for k, ctr_2 in enumerate(sorted_ctrs_2):
    # Get bounding box
    x_2, y_2, w_2, h_2 = cv2.boundingRect(ctr_2)
    # Getting ROI
    roi_2 = roi_1[y_2:y_2 + h_2, x_2:x_2 + w_2]
I hope the approach is clear. You can modify the full code to suit your needs.