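I want to crop the lines of an image that contain a specific color.
I already have the following lines of code to get that color. It is the color contained in an image of a pencil stroke.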
import cv2
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# we get the dominant colors
img = cv2.imread('stroke.png')
height, width, dim = img.shape
# We take only the center of the image
img = img[int(height/4):int(3*height/4), int(width/4):int(3*width/4), :]
height, width, dim = img.shape
img_vec = np.reshape(img, [height * width, dim] )
kmeans = KMeans(n_clusters=3)
kmeans.fit( img_vec )
# count cluster pixels, order clusters by cluster size
unique_l, counts_l = np.unique(kmeans.labels_, return_counts=True)
sort_ix = np.argsort(counts_l)
sort_ix = sort_ix[::-1]
fig = plt.figure()
ax = fig.add_subplot(111)
x_from = 0.05
# colors are cluster_center in kmeans.cluster_centers_[sort_ix] I think
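Then I want to parse every row of the image and crop together the rows that belong to a continuous stroke in the margin, that is, the rows that have at least one pixel with one of the colors of the example stroke.png, excluding white (which I have not implemented yet). Finally, I want to extract the text from these rows.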
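The white exclusion mentioned above could be sketched as a filter on the cluster centres. This is only an illustration: the helper name `non_white_centers` and the `white_thresh` value of 200 are assumptions, not part of the original code, and the example centres are made up.

```python
import numpy as np

def non_white_centers(centers, white_thresh=200):
    """Keep only the cluster centres that are not close to white.

    A centre is treated as "white" when every channel is >= white_thresh
    (hypothetical threshold; tune it for the paper colour of the scans).
    """
    centers = np.asarray(centers, dtype=float)
    is_white = np.all(centers >= white_thresh, axis=1)
    return centers[~is_white]

# Made-up BGR centres: near-white paper and a reddish pencil stroke
centers = np.array([[250.0, 251.0, 249.0], [60.0, 60.0, 200.0]])
stroke_colors = non_white_centers(centers)
```

With the code above, `kmeans.cluster_centers_` would be passed in place of the made-up `centers` array.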
### Attempt to get the colors of the stroke example
# we get the dominant colors
img = cv2.imread('stroke.png')
height, width, dim = img.shape
# We take only the center of the image
img = img[int(height/4):int(3*height/4), int(width/4):int(3*width/4), :]
height, width, dim = img.shape
img_vec = np.reshape(img, [height * width, dim] )
kmeans = KMeans(n_clusters=2)
kmeans.fit( img_vec )
# count cluster pixels, order clusters by cluster size
unique_l, counts_l = np.unique(kmeans.labels_, return_counts=True)
sort_ix = np.argsort(counts_l)
sort_ix = sort_ix[::-1]
fig = plt.figure()
ax = fig.add_subplot(111)
x_from = 0.05
cluster_center = kmeans.cluster_centers_[sort_ix][1]
# plt.show()
### End of attempt
for file_name in file_names:
    print("we wrote : ", file_name)
    # load the image and convert it to grayscale
    image = cv2.imread(file_name)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # check to see if we should apply thresholding to preprocess the
    # image
    if args["preprocess"] == "thresh":
        gray = cv2.threshold(gray, 0, 255,
                             cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
    # make a check to see if median blurring should be done to remove
    # noise
    elif args["preprocess"] == "blur":
        gray = cv2.medianBlur(gray, 3)
    # write the grayscale image to disk as a temporary file so we can
    # apply OCR to it
    filename = "{}.png".format(os.getpid())
    cv2.imwrite(filename, gray)
    # Here we should split the images in parts. Those who have strokes
    # We asked for a stroke example so we have its color
    # While we find pixels with the same color we store its line
    im = Image.open(filename)
    (width, height) = im.size
    for x in range(width):
        for y in range(height):
            rgb_im = im.convert('RGB')
            red, green, blue = rgb_im.getpixel((1, 1))
            # We test if the pixel has the same color as the second cluster
            # We should rather test if it is "alike"
            # It means that we found a line where there is some paper stroke
            if np.array_equal([red, green, blue], cluster_center):
                # if it is the case we store the width as starting point while we find pixels
                # and we break the loop to go to another line
                if start == -1:
                    start = x
                    selecting_area = True
                    break
                # if it already started we break the loop to go to another line
                if selecting_area == True:
                    break
            # if no pixel in a line had the same color as the second cluster but selecting already started
            # we crop the image and go to another line
            # it means that there is no more paper stroke
            if selecting_area == True:
                text_box = (0, start, width, x)
                # Crop Image
                area = im.crop(text_box)
                area.show()
                selecting_area = False
                break
    # load the image as a PIL/Pillow image, apply OCR, and then delete
    # the temporary file
    text = pytesseract.image_to_string(Image.open(filename))
    os.remove(filename)
    #print(text)
    with open('resume.txt', 'a+') as f:
        print('***:', text, file=f)
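So far I am able to get the color I want to use to crop the image, but the test I designed to find out which part of the image to actually crop never seems to finish. Can you help me implement it?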
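One possible direction, given only as a sketch, is to replace the per-pixel loop with a vectorized test on the color image. The function name `find_stroke_bands` and the tolerance `tol` are hypothetical, and the synthetic image stands in for a real scan. The sketch assumes the test runs on the original BGR array rather than on the grayscale temporary file, since the stroke color is no longer present after the grayscale conversion. It also tests "alike" colors by Euclidean distance instead of exact equality, because the k-means centres are floating-point averages that rarely match a pixel exactly.

```python
import numpy as np

def find_stroke_bands(image, stroke_color, tol=40.0):
    """Return (top, bottom) row ranges whose rows contain stroke-like pixels.

    image: H x W x 3 array; stroke_color: length-3 colour in the same
    channel order. bottom is exclusive, so image[top:bottom] is the band.
    """
    diff = image.astype(float) - np.asarray(stroke_color, dtype=float)
    mask = np.linalg.norm(diff, axis=2) <= tol      # H x W, True where "alike"
    rows = mask.any(axis=1)                         # True for rows with a stroke pixel
    # Pad with False so runs touching the image border are also detected
    padded = np.concatenate(([False], rows, [False])).astype(int)
    edges = np.diff(padded)
    tops = np.flatnonzero(edges == 1)               # first row of each run
    bottoms = np.flatnonzero(edges == -1)           # one past the last row
    return list(zip(tops.tolist(), bottoms.tolist()))

# Synthetic example: white page with two short strokes in the left margin
image = np.full((10, 6, 3), 255, dtype=np.uint8)
image[2:4, 0] = (60, 60, 200)
image[7:9, 1] = (62, 58, 198)

bands = find_stroke_bands(image, (60, 60, 200))
crops = [image[top:bottom] for top, bottom in bands]
```

Each crop could then be passed to the OCR step. If Pillow's `crop` is used instead of array slicing, its box is `(left, upper, right, lower)`, so a full-width band would be `(0, top, width, bottom)`.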
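Another idea, proposed in this paper, is to group the strokes and recognize the text of each group separately, but I do not know of any grouping algorithm that could help me do this.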
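One common way to group pixels is connected-component labeling, sketched below in plain NumPy. This is not necessarily the method used in the paper; the function name `group_strokes` and the toy mask are assumptions made for illustration. The input is a boolean mask that is True where a pixel looks like the stroke color.

```python
from collections import deque
import numpy as np

def group_strokes(mask):
    """Group stroke pixels into 8-connected components.

    Returns one bounding box (top, left, bottom, right) per component,
    with bottom and right exclusive.
    """
    mask = np.asarray(mask, dtype=bool)
    height, width = mask.shape
    seen = np.zeros_like(mask)
    boxes = []
    for y0, x0 in zip(*np.nonzero(mask)):
        if seen[y0, x0]:
            continue
        # Breadth-first flood fill from an unvisited stroke pixel
        seen[y0, x0] = True
        queue = deque([(y0, x0)])
        top, left, bottom, right = y0, x0, y0, x0
        while queue:
            y, x = queue.popleft()
            top, bottom = min(top, y), max(bottom, y)
            left, right = min(left, x), max(right, x)
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny, nx] and not seen[ny, nx]):
                        seen[ny, nx] = True
                        queue.append((ny, nx))
        boxes.append((int(top), int(left), int(bottom) + 1, int(right) + 1))
    return boxes

# Toy mask with two separate strokes
mask = np.zeros((6, 6), dtype=bool)
mask[0:2, 0] = True
mask[4, 3:6] = True
boxes = group_strokes(mask)
```

For real scans, OpenCV's `cv2.connectedComponentsWithStats` provides the same kind of labeling and is likely to be much faster than this pure-Python loop.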
Example of an image to process:
The full project (an annotated text summarization program) can be found on GitHub here.