Question

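Which technique would you recommend for segmenting the characters in this image so they can be fed to a model trained on the MNIST dataset? Such models take one character at a time. The question is about the segmentation itself, regardless of how the image is transformed and binarized.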

Thanks!

Answer 1

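As a starting point, I would try the following:

1. Use Otsu thresholding.
2. Apply some morphological operations to get rid of noise and isolate each digit.
3. Run connected component labeling.
4. Feed each connected component to your classifier to recognize the digit; if the classification score is low, discard the component.
5. As a final validation, check that all the digits lie more or less on one line and are more or less evenly spaced.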
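
The final validation in the last step is not covered by the code in this answer. A minimal sketch of one way to do it, assuming each candidate is described by a `(left, top, width, height)` bounding box such as the ones `cv2.connectedComponentsWithStats` reports; the function names and the `y_tol` tolerance are hypothetical and would need tuning per image:

```python
import numpy as np

def filter_boxes_on_a_line(boxes, y_tol=0.5):
    """Keep boxes whose vertical centre is near the median centre, sorted left to right.

    boxes: iterable of (left, top, width, height).
    y_tol: allowed vertical offset as a fraction of the median box height (assumed value).
    """
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    if len(boxes) == 0:
        return boxes
    centers_y = boxes[:, 1] + boxes[:, 3] / 2.0
    median_y = np.median(centers_y)
    median_h = np.median(boxes[:, 3])
    # A box stays only if its centre is within y_tol * median height of the median centre
    keep = np.abs(centers_y - median_y) <= y_tol * median_h
    kept = boxes[keep]
    return kept[np.argsort(kept[:, 0])]

def horizontal_gaps(sorted_boxes):
    """Gap between the right edge of each box and the left edge of the next one."""
    return sorted_boxes[1:, 0] - (sorted_boxes[:-1, 0] + sorted_boxes[:-1, 2])

# Three boxes on one text line plus one outlier far below it
boxes = [(50, 10, 8, 12), (10, 11, 8, 12), (30, 9, 8, 12), (20, 60, 8, 12)]
kept = filter_boxes_on_a_line(boxes)
gaps = horizontal_gaps(kept)
```

Gaps that differ a lot from the median gap could then be flagged as suspicious, although how strict to be depends on the image.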

Here are the first 4 stages. You now need to add your recognition software to recognize the digits.

import cv2
import numpy as np
from matplotlib import pyplot as plt

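# Parameters
EPSSILON = 0.4   # allowed deviation from the median width/height ratio
MIN_AREA = 10    # components with a smaller pixel area are filtered out
BIG_AREA = 75    # components with a larger pixel area are filtered out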

# Read the image as grayscale
img = cv2.imread('i.jpg', cv2.IMREAD_GRAYSCALE)
if img is None:
    raise IOError("could not read 'i.jpg'")

# Otsu threshold (inverted, so darker pixels become the white foreground)
_, thI = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

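# Morphological closing
# Note: a (1,1) structuring element leaves the image unchanged; try e.g. (3,3) on noisier images
se = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(1,1))
thIMor = cv2.morphologyEx(thI,cv2.MORPH_CLOSE,se)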

# Connected component labeling
stats = cv2.connectedComponentsWithStats(thIMor,connectivity=8)

num_labels = stats[0]
labels = stats[1]
labelStats = stats[2]

# We expect the connected components of the digits to have a more or less constant width/height ratio,
# so we take the median ratio over all components, since the majority of them are digits
ratios = []
for label in range(1, num_labels):  # skip label 0, the background

    connectedCompoentWidth = labelStats[label,cv2.CC_STAT_WIDTH]
    connectedCompoentHeight = labelStats[label, cv2.CC_STAT_HEIGHT]

    ratios.append(float(connectedCompoentWidth)/float(connectedCompoentHeight))

# Find median ratio
medianRatio = np.median(np.asarray(ratios))

# Go over all the connected components again and filter out components that are far from the median ratio
filterdI = np.zeros_like(thIMor)
filterdI[labels!=0] = 255
for label in range(1, num_labels):  # label 0 is the background and is already 0 in filterdI

    # Ignore label 1, assumed to be the biggest component in this particular image
    if label == 1:
        filterdI[labels == label] = 0
        continue

    connectedCompoentWidth = labelStats[label,cv2.CC_STAT_WIDTH]
    connectedCompoentHeight = labelStats[label, cv2.CC_STAT_HEIGHT]

    ratio = float(connectedCompoentWidth)/float(connectedCompoentHeight)
    if ratio > medianRatio + EPSSILON or ratio < medianRatio - EPSSILON:
        filterdI[labels==label] = 0

    # Filter out components that are too small or too large
    if labelStats[label,cv2.CC_STAT_AREA] < MIN_AREA or labelStats[label,cv2.CC_STAT_AREA] > BIG_AREA:
        filterdI[labels == label] = 0

plt.imshow(filterdI)


# Now go over each of the remaining components and run the digit recognition
stats = cv2.connectedComponentsWithStats(filterdI,connectivity=8)
num_labels = stats[0]
labels = stats[1]
labelStats = stats[2]

for label in range(1, num_labels):  # skip label 0, the background

    # Crop the bounding box around the component
    left = labelStats[label,cv2.CC_STAT_LEFT]
    top = labelStats[label, cv2.CC_STAT_TOP]
    width = labelStats[label, cv2.CC_STAT_WIDTH]
    height = labelStats[label, cv2.CC_STAT_HEIGHT]
    candidateDigit = labels[top:top+height,left:left+width]


    # plt.figure(label)
    # plt.imshow(candidateDigit)
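
Before a crop goes to an MNIST-style classifier it usually has to be brought into the 28x28 layout the model was trained on. The sketch below is one possible way to do that, not part of the original answer: it scales the component so its longer side is 20 pixels and centres it by bounding box on a black 28x28 canvas (MNIST itself centres by centre of mass, so this is an approximation). The function name `to_mnist_input` is hypothetical, and the resize uses plain NumPy indexing so the snippet runs on its own; `cv2.resize` would be the more natural choice inside the script above. It assumes a non-empty crop.

```python
import numpy as np

def to_mnist_input(component_mask, size=28, inner=20):
    """Place a cropped binary component on a size x size float canvas, MNIST style."""
    mask = (np.asarray(component_mask) > 0).astype(np.float32)
    h, w = mask.shape
    # Scale so the longer side becomes `inner` pixels, keeping the aspect ratio
    scale = inner / float(max(h, w))
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    # Nearest-neighbour resize through index arrays
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    resized = mask[np.ix_(rows, cols)]
    # Centre the resized digit (by bounding box) on a black canvas
    canvas = np.zeros((size, size), dtype=np.float32)
    top = (size - new_h) // 2
    left = (size - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas

# Example with a synthetic 30x12 component that fills its bounding box
digit = to_mnist_input(np.ones((30, 12), dtype=np.uint8))
```

Inside the loop above, the call could look like `to_mnist_input(labels[top:top+height, left:left+width] == label)`, which also masks out pixels of neighbouring components that fall inside the same bounding box.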

Answer 2

I'd like to add to Amitay's answer.

For 2: I would use thinning as the morphological operation (see thinning algorithm in opencv)
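
To illustrate what thinning does, here is a small NumPy sketch of the Zhang-Suen scheme, one well-known thinning algorithm. It is an assumption that this matches what the linked page describes, and the function name is hypothetical. If the OpenCV contrib modules are installed, `cv2.ximgproc.thinning` should offer a ready-made alternative.

```python
import numpy as np

def zhang_suen_thinning(binary):
    """Thin a binary image (non-zero = foreground) down to thin strokes."""
    img = (np.asarray(binary) > 0).astype(np.uint8)
    # Pad with background so every pixel has eight neighbours
    img = np.pad(img, 1, mode='constant')
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            # Neighbours P2..P9, clockwise, starting with the pixel above
            p2 = img[:-2, 1:-1]; p3 = img[:-2, 2:]; p4 = img[1:-1, 2:]; p5 = img[2:, 2:]
            p6 = img[2:, 1:-1]; p7 = img[2:, :-2]; p8 = img[1:-1, :-2]; p9 = img[:-2, :-2]
            ring = [p2, p3, p4, p5, p6, p7, p8, p9]
            # b: number of foreground neighbours; a: number of 0 -> 1 transitions around the ring
            b = sum(n.astype(np.int32) for n in ring)
            a = sum(((ring[i] == 0) & (ring[(i + 1) % 8] == 1)).astype(np.int32)
                    for i in range(8))
            if step == 0:
                cond = (p2 * p4 * p6 == 0) & (p4 * p6 * p8 == 0)
            else:
                cond = (p2 * p4 * p8 == 0) & (p2 * p6 * p8 == 0)
            remove = (img[1:-1, 1:-1] == 1) & (b >= 2) & (b <= 6) & (a == 1) & cond
            if remove.any():
                # All flagged pixels are deleted at once, after the whole pass was evaluated
                img[1:-1, 1:-1][remove] = 0
                changed = True
    return img[1:-1, 1:-1] * 255

# A thick horizontal bar shrinks to a thin line
bar = np.zeros((11, 30), dtype=np.uint8)
bar[3:8, 5:25] = 255
thin = zhang_suen_thinning(bar)
```

Note that thinning changes stroke width, so a classifier trained on MNIST's thicker strokes may need the thinned result dilated again, or thinning may be better used only to separate touching digits.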

For 3: OpenCV 3.0 already has a function for this, called cv::connectedComponents.

Hope this helps.

Python: Image segmentation as preprocessing for classification
