Numpy PIL Python: crop image on whitespace or crop text with histogram thresholds

Time: 2014-07-10 23:09:08

Tags: python numpy matplotlib python-imaging-library

How would I find the bounding box or window for the whitespace region surrounding the numbers in the image below?

Original image:

[image]

Height: 762 pixels, width: 1014 pixels

Goal:

Something like {x-bound:[x-upper,x-lower], y-bound:[y-upper,y-lower]}, so that I can crop to the text and feed it into tesseract or some other OCR.
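For the simplest reading of this goal (one box around everything that is not white), a minimal sketch might look like the following. The function name `content_bounds`, the threshold of 250, and the synthetic test array are assumptions for illustration and are not part of the original post; the bounds are half-open so they can be used directly as slice indices.

```python
import numpy as np

def content_bounds(gray, white_threshold=250):
    """Bounding box of all pixels darker than white_threshold.

    gray is a 2-D array of intensities (0 = black, 255 = white).
    Returns a dict shaped like the one sketched in the question,
    with half-open [start, stop] bounds, or None for a blank image.
    """
    ink = gray < white_threshold              # True where there is "ink"
    rows = np.flatnonzero(ink.any(axis=1))    # row indices containing ink
    cols = np.flatnonzero(ink.any(axis=0))    # column indices containing ink
    if rows.size == 0:
        return None
    return {
        "x-bound": [int(cols[0]), int(cols[-1]) + 1],
        "y-bound": [int(rows[0]), int(rows[-1]) + 1],
    }

# Synthetic stand-in for the real image: a white page with one dark block.
page = np.full((20, 30), 255, dtype=np.uint8)
page[5:9, 10:22] = 0

bounds = content_bounds(page)
(x0, x1), (y0, y1) = bounds["x-bound"], bounds["y-bound"]
cropped = page[y0:y1, x0:x1]
```

With PIL, a grayscale array of this kind can be obtained with np.array(Image.open(path).convert("L")), and the same box can be passed to im.crop((x0, y0, x1, y1)). This only works when the background really is close to uniform white; a photo with shadows or a dark border would need a different threshold or a preprocessing step.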

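Attempts:

I had thought about splitting the image into hard-coded chunk sizes and analyzing them at random, but I think that would be too slow.

Example code using pyplot, adapted from (Using python and PIL how can I grab a block of text in an image?):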

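from PIL import Image
import numpy as np
import matplotlib.pyplot as plt

im = Image.open('/home/jmunsch/Pictures/Aet62.png')
p = np.array(im)
p = p[:, :, 0:3]        # keep the RGB channels, drop alpha if present
p = 255 - p             # invert so that dark text gives large values
lx, ly, lz = p.shape    # rows (height), columns (width), channels

plt.plot(p.sum(axis=1))  # sum over columns: one value per row and channel
plt.plot(p.sum(axis=0))  # sum over rows: one value per column and channel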

# I was thinking something like this (pseudocode, not runnable).
# The image is a 3-dimensional ndarray [[x],[y],[color?]]
# Set each value below an axis mean to 0:
#     [item = 0 for item in p[axis=0] if item < p.mean(axis=0)]
#
# and then some type of enumerated groupby for each axis,
# finding the mean index for each groupby(0) on the axes:
#     plt.plot(p[mean_index1:mean_index2, mean_index3:mean_index4])

Based on the plots, each valley would indicate a place to bound.

  • The first plot shows where the lines of text are
  • The second plot shows where the characters are
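The valley idea can be sketched as follows: threshold a projection profile and report the runs that stay above the threshold. The helper name `runs_above`, the threshold of 0, and the synthetic array are assumptions for illustration. On a real scan the threshold would likely need tuning (the profile mean, as suggested in the pseudocode above, is one candidate).

```python
import numpy as np

def runs_above(profile, threshold):
    """Return (start, stop) pairs where profile > threshold; stop is exclusive."""
    active = np.asarray(profile) > threshold
    # Pad with False so every run has both a rising and a falling edge.
    padded = np.concatenate(([False], active, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    # Edges alternate: rising, falling, rising, falling, ...
    return [(int(a), int(b)) for a, b in zip(edges[::2], edges[1::2])]

# Synthetic "inverted" image: nonzero where there is ink.
ink = np.zeros((12, 16), dtype=np.uint8)
ink[2:4, 1:5] = 1      # first text line, one blob
ink[7:10, 3:6] = 1     # second text line, first blob
ink[7:10, 9:14] = 1    # second text line, second blob

# Row profile: valleys separate lines of text.
line_spans = runs_above(ink.sum(axis=1), 0)

# Column profile within one line: valleys separate characters.
top, bottom = line_spans[1]
char_spans = runs_above(ink[top:bottom].sum(axis=0), 0)
```

Each (start, stop) pair can be used directly as a slice, so ink[top:bottom, left:right] crops one character. Characters that touch, or lines that are skewed, will not produce clean valleys, so this is only a starting point.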

Example plot from plt.plot(p.sum(axis=1)):

[image]

Example plot from plt.plot(p.sum(axis=0)):

[image]


Update: HYRY's solution

[image]

1 Answer:

Answer 0: (score: 5)

I think you can use the morphology functions in scipy.ndimage. Here is an example:

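import pylab as pl
import numpy as np
from scipy import ndimage

# imread returns floats in [0, 1] for PNG files, so the uint8 cast
# leaves 1 only where the red channel is fully white and 0 elsewhere
img = pl.imread("Aet62.png")[:, :, 0].astype(np.uint8)
# Erode, then dilate (a morphological opening) to keep only large white areas
img2 = ndimage.binary_erosion(img, iterations=40)
img3 = ndimage.binary_dilation(img2, iterations=40)
# Label the connected regions and keep the largest one
labels, n = ndimage.label(img3)
counts = np.bincount(labels.ravel())
counts[0] = 0  # ignore the background label
img4 = labels == np.argmax(counts)
# Fill holes so that dark pixels enclosed by the region are included
img5 = ndimage.binary_fill_holes(img4)
# Non-white pixels inside the filled region
result = ~img & img5
# A small opening removes leftover specks
result = ndimage.binary_erosion(result, iterations=3)
result = ndimage.binary_dilation(result, iterations=3)
pl.imshow(result, cmap="gray")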

The output is:

[image]
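The answer stops at a mask, while the question asks for a bounding box. One way to get from a boolean mask to crop bounds is scipy.ndimage.find_objects, sketched below. The helper name `mask_bounds` and the synthetic mask are assumptions for illustration, and applying it to the `result` array from the code above is a suggestion that has not been tested on the original image.

```python
import numpy as np
from scipy import ndimage

def mask_bounds(mask):
    """Return (row_slice, col_slice) enclosing every True pixel, or None if empty."""
    # Casting to int makes the whole mask a single label (1),
    # so find_objects returns at most one bounding box.
    boxes = ndimage.find_objects(mask.astype(np.int32))
    return boxes[0] if boxes else None

# Synthetic stand-in for the answer's result mask: two separate blobs.
mask = np.zeros((10, 12), dtype=bool)
mask[3:6, 2:5] = True
mask[4:7, 8:11] = True

rows, cols = mask_bounds(mask)
cropped = mask[rows, cols]

# The same bounds in the dict shape sketched in the question (half-open).
bounds = {"x-bound": [cols.start, cols.stop], "y-bound": [rows.start, rows.stop]}
```

To get one box per digit rather than one box for all of them, ndimage.label could be run on the mask first and the label array passed to find_objects, which then returns one pair of slices per connected region.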