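I am trying to implement an OCR application. I want to find a set of words and, for each word, the contours inside it (the individual characters). I can find the contour of each word, but I cannot get the contours inside each word to display correctly. My code so far: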
import cv2

MIN_CONTOUR_AREA = 100  # assumed value; the original post does not show it

imgInput = cv2.imread("inputImage.jpg")
# convert image to grayscale
imgGray = cv2.cvtColor(imgInput, cv2.COLOR_BGR2GRAY)
# invert black and white
newRet, binaryThreshold = cv2.threshold(imgGray, 127, 255, cv2.THRESH_BINARY_INV)
# dilation: merge the letters of each word into a single blob
rectkernel = cv2.getStructuringElement(cv2.MORPH_RECT, (15, 10))
rectdilation = cv2.dilate(binaryThreshold, rectkernel, iterations=1)
outputImage = imgInput.copy()
# note: OpenCV 2.x and 4.x return (contours, hierarchy); 3.x returns three values
npaContours, npaHierarchy = cv2.findContours(rectdilation.copy(),
                                             cv2.RETR_EXTERNAL,
                                             cv2.CHAIN_APPROX_SIMPLE)
for npaContour in npaContours:
    if cv2.contourArea(npaContour) > MIN_CONTOUR_AREA:
        [intX, intY, intW, intH] = cv2.boundingRect(npaContour)
        cv2.rectangle(outputImage,
                      (intX, intY),                # upper left corner
                      (intX + intW, intY + intH),  # lower right corner
                      (0, 0, 255),                 # red
                      2)                           # thickness
        # Get subimage of word and find contours of that word
        imgROI = binaryThreshold[intY:intY + intH, intX:intX + intW]
        subContours, subHierarchy = cv2.findContours(imgROI.copy(),
                                                     cv2.RETR_EXTERNAL,
                                                     cv2.CHAIN_APPROX_SIMPLE)
        # This part is not working as I am expecting
        for subContour in subContours:
            [pointX, pointY, width, height] = cv2.boundingRect(subContour)
            cv2.rectangle(outputImage,
                          (intX + pointX, intY + pointY),
                          (intX + width, intY + height),
                          (0, 255, 0),
                          2)
cv2.imshow("original", imgInput)
cv2.imshow("rectdilation", rectdilation)
cv2.imshow("threshold", binaryThreshold)
cv2.imshow("outputRect", outputImage)
cv2.waitKey(0)
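Answer 0 (score: 1)
Everything is correct except for one small mistake: the lower-right corner of the sub-contour rectangle is missing the pointX and pointY offsets. Change your second cv2.rectangle call (the one in the sub-contour loop) to: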
cv2.rectangle(outputImage,(intX+pointX, intY+pointY),(intX+pointX+width, intY+pointY+height), (0, 255, 0),2)
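The reason is that cv2.boundingRect on the ROI returns coordinates relative to the ROI's top-left corner, so both corners have to be shifted by (intX, intY), and the lower-right corner additionally by the rectangle's own origin. A minimal sketch of that arithmetic in plain Python follows; the helper name roi_rect_to_image_corners is hypothetical and not part of OpenCV, and the numbers are made up for illustration:

```python
def roi_rect_to_image_corners(roi_x, roi_y, rect):
    """Convert an ROI-relative (x, y, w, h) rect to full-image corner points.

    Hypothetical helper for illustration; not an OpenCV function.
    """
    x, y, w, h = rect
    top_left = (roi_x + x, roi_y + y)
    bottom_right = (roi_x + x + w, roi_y + y + h)
    return top_left, bottom_right


# Made-up example: the word box starts at (100, 50) in the full image and
# a character sits at (12, 3) inside it, 8 px wide and 14 px tall.
top_left, bottom_right = roi_rect_to_image_corners(100, 50, (12, 3, 8, 14))
print(top_left, bottom_right)  # (112, 53) (120, 67)
```

The two returned points could then be passed to cv2.rectangle as the corner arguments.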
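Not to be mean or petty, but this is a bug you could have solved yourself ;) Debugging this kind of code just means trying the first word, then its first sub-contour, saving the image, and checking the values of pointX, intX, width... Nothing too complicated, and it is something you will often need to do as a programmer.
Good luck!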
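As a rough sketch of that kind of check, assuming made-up values for one word and one sub-contour (none of these numbers come from the question's image), comparing the size of the rectangle that actually gets drawn with the size boundingRect reported exposes the problem right away:

```python
def drawn_size(top_left, bottom_right):
    """Width and height of the rectangle spanned by two corner points."""
    return (bottom_right[0] - top_left[0], bottom_right[1] - top_left[1])


# Made-up values for the first word and its first sub-contour.
intX, intY = 100, 50                          # word box origin in the full image
pointX, pointY, width, height = 12, 3, 8, 14  # sub-contour rect, relative to the ROI

top_left = (intX + pointX, intY + pointY)
buggy_bottom_right = (intX + width, intY + height)                    # as in the question
fixed_bottom_right = (intX + pointX + width, intY + pointY + height)  # as in the answer

buggy = drawn_size(top_left, buggy_bottom_right)
fixed = drawn_size(top_left, fixed_bottom_right)
print("expected:", (width, height))
print("buggy:   ", buggy)
print("fixed:   ", fixed)
```

With the question's formula the drawn size does not match (width, height), while the corrected formula reproduces it exactly.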