Question

@ALL This is an edit of the original question, meant to give a better understanding of the subject.

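Problem statement

Suppose there is an industrial P&ID diagram.
The goal is to color only certain lines that are important to the process.
The user should only have to click on a line segment (left mouse click) to color it.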

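Approach

I am new to programming -> using Python (3.5) to try this out. This is how I picture the algorithm:

The diagram will be in .pdf format, so I can either use PIL ImageGrab or convert the .pdf to .png, as in this example
The algorithm will look at the pixels around the mouse click and compare them with another portion of the same size (say, a 6x3 pixel strip) shifted one step (1-5 px) to the left/right
Checking the mean of the difference will tell us whether the two strips are identical
This way the algorithm should find line endings, arrows, corners, or other elements
Once those are found, the positions are recorded, the marker line is drawn, and the user can select another line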
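The strip comparison described above can be sketched as follows. This is a minimal illustration, not the asker's code: it assumes the image has already been loaded as a grayscale NumPy array (for example with `np.asarray(img.convert("L"))`), uses a mean absolute difference in place of `ImageChops.difference` plus `ImageStat`, and the function name `find_line_end` and its parameters are hypothetical.

```python
import numpy as np

def find_line_end(gray, x, y, half_height=3, strip_len=3, step=1,
                  direction=1, tol=0.0):
    """Walk from column x (+1 = right, -1 = left) until the strip changes.

    Returns the starting column of the last strip that still matched the
    reference strip taken at the click position.
    """
    height, width = gray.shape
    x = min(x, width - strip_len)              # keep the reference strip inside the image
    top = max(0, y - half_height)
    bottom = min(height, y + half_height + 1)
    reference = gray[top:bottom, x:x + strip_len].astype(np.int32)
    i = x
    while True:
        nxt = i + direction * step
        if nxt < 0 or nxt + strip_len > width:
            return i                           # reached the image border
        candidate = gray[top:bottom, nxt:nxt + strip_len].astype(np.int32)
        if np.abs(candidate - reference).mean() > tol:
            return i                           # strip differs: line ended or something crosses it
        i = nxt

# Synthetic test image: white background, one black horizontal line.
gray = np.full((50, 100), 255, dtype=np.uint8)
gray[20, 10:61] = 0                            # line spans columns 10..60 on row 20

strip_len = 3
left = find_line_end(gray, 30, 20, strip_len=strip_len, direction=-1)
right = find_line_end(gray, 30, 20, strip_len=strip_len, direction=+1) + strip_len - 1
print(left, right)
```

The right-hand result is the strip's starting column, so the strip length minus one is added to get the last column of the line. On scanned or JPEG-compressed diagrams an exact-zero difference is unlikely, so a small positive `tol` would probably be needed.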

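Summary

Click on the desired line
Grab a small portion of the image around the mouse
Check whether the line is horizontal or vertical
Crop a horizontal/vertical slice of a given size
Find the line endings and record their positions
Draw a line of a specific color (let's say green) between the two positions found
Wait for the next line to be selected and repeat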
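For the "horizontal or vertical" step, one possible check is to compare how far the dark pixels extend from the click along each axis. The sketch below assumes a grayscale NumPy array, a click that lands exactly on a dark pixel, and a hypothetical darkness threshold of 128; the helper names are made up for illustration.

```python
import numpy as np

def run_length(dark, x, y, dx, dy):
    """Count consecutive dark pixels through (x, y) along direction (dx, dy)."""
    height, width = dark.shape
    if not dark[y, x]:
        return 0
    count = 1
    for sign in (1, -1):                       # walk both ways from the click
        cx, cy = x + sign * dx, y + sign * dy
        while 0 <= cx < width and 0 <= cy < height and dark[cy, cx]:
            count += 1
            cx, cy = cx + sign * dx, cy + sign * dy
    return count

def line_orientation(gray, x, y, threshold=128):
    """Return 'horizontal', 'vertical', or None if the click is not on a line."""
    dark = gray < threshold
    horizontal = run_length(dark, x, y, 1, 0)
    vertical = run_length(dark, x, y, 0, 1)
    if horizontal == 0 and vertical == 0:
        return None
    return "horizontal" if horizontal >= vertical else "vertical"

# Synthetic diagram with one horizontal and one vertical line.
gray = np.full((100, 100), 255, dtype=np.uint8)
gray[40, 10:80] = 0                            # horizontal line on row 40
gray[5:95, 60] = 0                             # vertical line on column 60

print(line_orientation(gray, 30, 40))          # click on the horizontal line
print(line_orientation(gray, 60, 70))          # click on the vertical line
print(line_orientation(gray, 0, 0))            # click on empty background
```

A real click rarely hits the line exactly, so in practice the nearest dark pixel within a few pixels of the click would have to be located first.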

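Other thoughts

Attached you can find two sample pictures and what I am trying to achieve.
I tried to find the "holes" in the slices using the method found here: OpenCV to find line endings
There is no strict rule about sticking with the ImageGrab routine or anything like it
If you know of other strategies I could use, please feel free to comment
Any suggestion is welcome and sincerely appreciated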

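Sample image:

Desired result (modified in Paint):

Update to the post with what I have tried so far

I made some modifications to the original code, so I will post it below. Everything in the comments is there for debugging or explanation. Thank you very much for your help! Don't be afraid to step in.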

import win32gui as w
from PIL import ImageStat, ImageChops, Image, ImageDraw
import win32api as wa

img=Image.open("Trials.jpg")
img_width=img.size[0]
img_height=img.size[1]
#Using 1920 x 1080 resolution
#Hide the taskbar to center the Photo Viewer
#Defining a way to make sure the mouse click is inside the image
#Subtract the width from total and divide by 2 to get base point of the crop
width_lim = (1920 - img_width)/2
height_lim = (1080 - img_height)/2-7
#After several tests, the math in calculating the height is off by 7 pixels, hence the correction
#Use these values when doing the crop

#Check if left mouse button was pressed and record its position
left_p = wa.GetKeyState(0x01)
#print(left_p)
while True :
    a=wa.GetKeyState(0x01)
    if a != left_p:
        left_p = a
        if a<0 :
            pos = w.GetCursorPos()
            pos_x=pos[0]-width_lim
            pos_y=pos[1]-height_lim
#            print(pos_x,pos_y)
        else:
            break


#img.show()
#print(img.size)

#Define the crop height; size is doubled
height_size = 10
#Define max length limit
#Getting a horizontal strip
im_hor = img.crop(box=[0, pos_y-height_size, img_width, pos_y+height_size])
#im_hor.show()



#failed in trying crop a small square of 3x3 size using the pos_x
#sq_size = 3
#st_sq = im_hor.crop(box=[pos_x,0,pos_x+sq_size,height_size*2])
#st_sq.show()

#going back to the code it works
#crop a standard strip and compare with a new test one
#if the mean of difference is zero, the strips are identical
#still looking for a way to find the position of the central pixel (that would be the one with maximum value - black)
strip_len = 3
step = 3
i = pos_x
st_sq = im_hor.crop(box=[i,0,i+strip_len,height_size*2])
test_sq = im_hor.crop(box=[i+step,0,i+strip_len+step,height_size*2])
diff = ImageChops.difference(st_sq,test_sq)
stat=ImageStat.Stat(diff)
mean = stat.mean
mean1 = stat.mean
#print(mean)

#iterate to the right until finding a different strip, record position
while mean==[0,0,0]:
    i = i+1
    st_sq = im_hor.crop(box=[i,0,i+strip_len,height_size*2])
    #st_sq.show()
    test_sq = im_hor.crop(box=[i+step,0,i+strip_len+step,height_size*2])
    #test_sq.show()
    diff = ImageChops.difference(st_sq,test_sq)
    #diff.show()
    stat=ImageStat.Stat(diff)
    mean = stat.mean
#    print(mean)
print(i-1)

r = i-1
#print("STOP")
#print(r)
#record the right end as r = i-1

#iterate to the left until finding a different strip. record the position
while mean1==[0,0,0]:
    i = i-1
    st_sq = im_hor.crop(box=[i,0,i+strip_len,height_size*2])
    #st_sq.show()
    test_sq = im_hor.crop(box=[i+step,0,i+strip_len+step,height_size*2])
    #test_sq.show()
    diff = ImageChops.difference(st_sq,test_sq)
    #diff.show()
    stat=ImageStat.Stat(diff)
    mean1 = stat.mean
#    print(mean)
#print("STOP")
print(i+1)

l = i+1
#record the left end as l=i+1
test_draw = ImageDraw.Draw(img)
test_draw.line([l,pos_y,r,pos_y], fill=128)
img.show()

#find another approach or die trying!!!

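Below is the result I got. It is not what I was hoping for, but I feel I am on the right track. I could really use some help finding the pixel position within the strip and expressing it relative to the pixel position in the full image.

Another image, of better quality, which however brings more problems with it.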
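On relating a position inside a strip to the full image: a crop taken with box `(left, upper, right, lower)` maps its pixel `(sx, sy)` back to `(left + sx, upper + sy)` in the original. The sketch below illustrates this on a NumPy array rather than a PIL crop (an assumption made to keep the example short); it looks for the darkest row in a strip around the click and converts that row index back to full-image coordinates. The function name and parameters are hypothetical.

```python
import numpy as np

def darkest_row_in_strip(gray, x, y, half_height=10, strip_len=3):
    """Return (row inside the strip, same row in full-image coordinates)."""
    top = max(0, y - half_height)              # 'upper' edge of the crop box
    bottom = min(gray.shape[0], y + half_height)
    strip = gray[top:bottom, x:x + strip_len]
    # The line is dark, so the row with the lowest mean intensity is its centre.
    row_in_strip = int(np.argmin(strip.mean(axis=1)))
    return row_in_strip, top + row_in_strip    # add the crop offset back

# Synthetic image: the line is on row 57, the click lands 5 px above it.
gray = np.full((200, 300), 255, dtype=np.uint8)
gray[57, 20:250] = 0

row_in_strip, row_in_image = darkest_row_in_strip(gray, x=100, y=52)
print(row_in_strip, row_in_image)
```

The same offset rule applies horizontally: a column found inside a crop that starts at `left` sits at `left + column` in the full image.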

Answer 1

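So this isn't a complete solution to your exact problem, but I think it could be a good approach that gets you at least part of the way there. The problem I usually have with line detection methods is that they tend to depend heavily on several hyperparameters. More annoyingly, they are slow because they search over a wide range of angles, whereas your lines are strictly horizontal or vertical. I would therefore suggest using morphology. You can find a general overview of morphology on the OpenCV site, and you can see it applied to removing the staff lines from a music score in this tutorial on the OpenCV site.

The basic idea I have in mind is:

Detect the horizontal and vertical lines
Run connectedComponents() on the detected lines to identify each line separately
Get the user's mouse position and define a window around it
If a label from the connected components falls inside that window, grab that component
Draw that component on the image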

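Now, this is a very basic idea and it ignores some of the challenges involved. What it will definitely do, though, is this: if you click anywhere and a line in your image lies within the window around that click, you will get it. No lines are missed. Another piece of good news is that it does not run past thicker boundaries, which is where you would naturally want it to stop in your image (note that line detection schemes do have this problem). It only detects lines of a defined width, and if the line gets thicker (turns into an arrow or runs into a line going in a different direction), it is cut off there. The bad news is that it uses a predefined width for your lines. You can work around this with a hit-or-miss transform, but note that the implementation is currently broken for OpenCV versions older than 3.3-rc; see here for more information (the broken implementation is easy to work around). In any case, the hit-or-miss transform lets you say "I want a horizontal line, but it can be a few pixels wide or just one pixel wide". Of course, the wider you allow it to be, the more things that are not lines will be picked up as lines. You can filter those out afterwards by size (throw out everything smaller than some size, and erode or dilate).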

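Now what does this look like in code? I decided to make a simple example and apply the method to it, but note that the code was thrown together, so there is no real error checking here and you will want to write it better. Either way, it is just a quick hack to give an example of the method above.

First, we create an image and draw some lines: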

import cv2
import numpy as np 

img = 255*np.ones((500, 500), dtype=np.uint8)
cv2.line(img, (10, 350), (200, 350), color=0, thickness=1)
cv2.line(img, (100, 150), (400, 150), color=0, thickness=1)
cv2.line(img, (300, 250), (300, 500), color=0, thickness=1)
cv2.line(img, (100, 50), (100, 350), color=0, thickness=1)
bin_img = cv2.bitwise_not(img)

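Note that I also created the inverted image, because I prefer to have the thing I am trying to detect in white, with black as the background.

Now we grab the horizontal and vertical lines with morphology (erosion in this case):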

h_kernel = np.array([[0, 0, 0],
                     [1, 1, 1],
                     [0, 0, 0]], dtype=np.uint8)
v_kernel = np.array([[0, 1, 0],
                     [0, 1, 0],
                     [0, 1, 0]], dtype=np.uint8)

h_lines = cv2.morphologyEx(bin_img, cv2.MORPH_ERODE, h_kernel)
v_lines = cv2.morphologyEx(bin_img, cv2.MORPH_ERODE, v_kernel)
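To see why erosion with a `[1, 1, 1]` row kernel keeps only horizontal lines, here is the same idea written directly in NumPy: a pixel survives only if it and its left and right neighbours are all set. This is an illustrative sketch, not what OpenCV runs internally; in particular it simply clears the first and last columns, which differs from OpenCV's default border handling.

```python
import numpy as np

def erode_horizontal(binary):
    """Keep a pixel only if it and both horizontal neighbours are foreground."""
    fg = binary > 0
    out = np.zeros_like(fg)
    # Compare every interior column with the column to its left and right.
    out[:, 1:-1] = fg[:, 1:-1] & fg[:, :-2] & fg[:, 2:]
    return out.astype(np.uint8) * 255

# One horizontal and one vertical line, both 1 px thick, crossing each other.
binary = np.zeros((7, 9), dtype=np.uint8)
binary[3, 1:8] = 255                           # horizontal line, columns 1..7
binary[0:7, 4] = 255                           # vertical line, column 4

eroded = erode_horizontal(binary)
print(eroded // 255)
```

The vertical line disappears because its pixels have no horizontal neighbours, and the horizontal line loses one pixel at each end. That shortening is also why the components found later are slightly shorter than the drawn lines.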

Now we label each line:

h_n, h_labels = cv2.connectedComponents(h_lines)
v_n, v_labels = cv2.connectedComponents(v_lines)

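The images h_labels and v_labels are the same as h_lines and v_lines, except that instead of every line pixel being white, each pixel holds an integer identifying the component it belongs to. So background pixels have the value 0, one line is labeled 1, another 2, and so on for images with more lines.

Now we define a window around the user's mouse click. I did not implement that plumbing here and simply hard-coded the click position: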

mouse_click = [101, 148]  # x, y
click_radius = 3  # pixel width around mouse click
window = [[mouse_click[0] - i, mouse_click[1] - j]
          for i in range(-click_radius, click_radius+1)
          for j in range(-click_radius, click_radius+1)]

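The last thing to do is loop through all the positions inside window and check whether the label there is positive (i.e. it is not background). If it is, we have hit a line. We can then look at all the pixels carrying that label, which together make up the whole line, and use any number of methods to draw the line on the original img.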

label = 0
for pixel in window:
    if h_labels[pixel[1], pixel[0]] > 0:
        label = h_labels[pixel[1], pixel[0]]
        bin_labeled = 255*(h_labels == label).astype(np.uint8)
    elif v_labels[pixel[1], pixel[0]] > 0:
        label = v_labels[pixel[1], pixel[0]]
        bin_labeled = 255*(v_labels == label).astype(np.uint8)
    if label > 0:
        rgb_labeled = cv2.merge([img, img+bin_labeled, img])
        break
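The loop above indexes the label images directly, so a click near the image border would either wrap around (negative indices) or raise an `IndexError`. A bounds-safe variant could slice the window out of the label image instead. This is a sketch with a hypothetical helper name; picking the most frequent label in the window is an assumption about how to break ties between two nearby lines.

```python
import numpy as np

def label_near_click(labels, x, y, radius=3):
    """Return the most common non-background label near (x, y), or 0 if none."""
    height, width = labels.shape
    # Clamp the window to the image so border clicks are handled safely.
    window = labels[max(0, y - radius):min(height, y + radius + 1),
                    max(0, x - radius):min(width, x + radius + 1)]
    found = window[window > 0]
    if found.size == 0:
        return 0
    values, counts = np.unique(found, return_counts=True)
    return int(values[np.argmax(counts)])

# Hand-made label image standing in for h_labels or v_labels.
labels = np.zeros((20, 20), dtype=np.int32)
labels[5, 2:15] = 1                            # component 1: horizontal line
labels[0:4, 18] = 2                            # component 2: short line near the corner

print(label_near_click(labels, 6, 7))          # click 2 px below component 1
print(label_near_click(labels, 19, 0, radius=2))  # click in the top-right corner
print(label_near_click(labels, 10, 15))        # click far from any line
```

The function would be called once on the horizontal label image and once on the vertical one, since the two are labeled independently.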

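IMO the code directly above is pretty sloppy and there are better ways to draw this, but I did not want to spend time on something that is not really the core of the question.

An easy way to improve this would be to connect nearby lines; you can do that with morphology before finding the components. A better way to draw might be to simply find the min/max positions of that label in the image and use them as the endpoint coordinates for OpenCV's line() function, which lets you easily choose the color and line thickness. One thing I would suggest, if possible, is to show these lines on mouse hover before the user clicks (so they know they are clicking in the right area). That way, if the user is close to two lines, they know which one they are selecting.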
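The min/max idea can be sketched as follows, assuming axis-aligned lines only. The helper name `component_endpoints` is hypothetical; the returned points are in (x, y) order so they can be passed straight to `cv2.line`.

```python
import numpy as np

def component_endpoints(labels, label):
    """Return ((x_min, y_min), (x_max, y_max)) of all pixels with this label."""
    ys, xs = np.nonzero(labels == label)       # row indices first, then columns
    if xs.size == 0:
        raise ValueError("label %d not present in image" % label)
    return (int(xs.min()), int(ys.min())), (int(xs.max()), int(ys.max()))

# Hand-made label image: component 1 is a horizontal line on row 4.
labels = np.zeros((10, 30), dtype=np.int32)
labels[4, 3:25] = 1

p1, p2 = component_endpoints(labels, 1)
print(p1, p2)
# With a color image `rgb`, the line could then be drawn with, for example:
#   cv2.line(rgb, p1, p2, color=(0, 255, 0), thickness=2)
```

For a horizontal or vertical line the two corners of the bounding box coincide with the line's endpoints; for a diagonal component they would not, so this shortcut only fits the axis-aligned case discussed here.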

Click the mouse to find a line and color it
