Is there any way to speed up my Daugman operator function with numpy?

Asked: 2018-10-10 09:14:31

Tags: python performance numpy

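A few days ago I was looking for a Python implementation of Daugman's iris detection algorithm. I found only one repo on GitHub, and that implementation was very slow on my laptop: about 46 seconds. So I rewrote everything from scratch, based on Daugman's formula.
The end result is about 85 times faster (530 ms on my laptop), but it is still not fast enough for real-time video processing (at least 10 fps).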
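
For reference, Daugman's integro-differential operator is usually written as follows. This is the commonly cited form, stated here as an assumption about which formula is meant; $I(x, y)$ is the grayscale image, $G_\sigma(r)$ is a Gaussian smoothing kernel, and the contour integral runs over the circle with center $(x_0, y_0)$ and radius $r$:

$$
\max_{(r,\, x_0,\, y_0)} \left| \, G_\sigma(r) * \frac{\partial}{\partial r} \oint_{r,\, x_0,\, y_0} \frac{I(x, y)}{2 \pi r} \, ds \, \right|
$$

In the code in this post, the masked sum divided by 2*pi*r plays the role of the normalized contour integral, the difference of consecutive radii approximates the derivative with respect to r, cv2.GaussianBlur is the smoothing, and np.argmax is the outer maximum.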

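I have read a couple of Stack Overflow threads on this and tried to vectorize the code. I tested map(), np.vectorize() and np.fromiter(), but none of them was faster than my current solution (some were even slower), and I'm out of ideas.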

Can the code below be vectorized, or do I need to use C extensions or try running it on PyPy?

My solution:

import itertools

import cv2
import numpy as np


def daugman(center, start_r, gray_img):
    """Return the radius with the maximal intensity change for a given center.
    center -- tuple(x, y)
    start_r -- int
    gray_img -- grayscale picture as np.array(), it should be square
    """
    # image height (the width is not needed because the image is square)
    h, _ = gray_img.shape
    # normalized intensities, one per radius, are collected here
    tmp = []
    # reusable mask with the same shape and dtype as the input image
    mask = np.zeros_like(gray_img)

    # for every radius in range
    # we presume that the iris is no bigger than 1/3 of the picture
    for r in range(start_r, int(h/3)):
        # draw a 1-pixel-wide circle on the mask
        cv2.circle(mask, center, r, 255, 1)
        # keep only the image pixels that lie on the circle
        radii = gray_img & mask  # it is faster than np or cv2
        # normalize by the circumference
        tmp.append(radii[radii > 0].sum() / (2 * np.pi * r))
        # reset the mask for the next radius
        mask.fill(0)

    # calculate the change in intensity between consecutive radii
    tmp = np.array(tmp)
    tmp = tmp[1:] - tmp[:-1]
    # apply a Gaussian filter
    tmp = abs(cv2.GaussianBlur(tmp[:-1], (1, 5), 0))
    # get the index of the maximum value
    idx = np.argmax(tmp)
    # return value, center coords, radius
    return tmp[idx], [center, idx + start_r]

def find_iris(gray, start_r):
    """Apply daugman() to every candidate center in the central image slice.
        gray -- grayscale img as np.array()
        start_r -- initial radius as int
    The slice is chosen so that every circle is drawn inside the
    image borders, so we don't need to check for that (speed up).

    To speed up the whole process we pregenerate all centers for detection.
    """
    _, s = gray.shape
    # reduce the step (3) for better accuracy
    # 's/3' is the maximum radius of a daugman() search
    a = range(0 + int(s/3), s - int(s/3), 3)
    all_points = list(itertools.product(a, a))

    values = []
    coords = []

    for p in all_points:
        tmp = daugman(p, start_r, gray)
        if tmp is not None:
            val, circle = tmp
            values.append(val)
            coords.append(circle)

    # return the circle with the biggest intensity delta in the image
    # [(xc, yc), radius]
    return coords[np.argmax(values)]
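
One direction that might remove the per-radius Python loop, sketched here under stated assumptions rather than measured: instead of drawing each circle with cv2.circle, sample a fixed number of points on every circle in polar coordinates and read them all with a single fancy-indexing call. The names ring_means, best_radius and the n_samples parameter are hypothetical; the Gaussian smoothing step from daugman() is left out, and nearest-pixel sampling is not identical to the rasterized circle, so the results can differ slightly.

```python
import numpy as np


def ring_means(gray_img, center, radii, n_samples=64):
    """Mean intensity on each circle of the given radii around center.

    n_samples (hypothetical parameter) is the number of points per circle.
    The caller must make sure every circle lies inside the image.
    """
    x, y = center
    angles = np.linspace(0.0, 2.0 * np.pi, n_samples, endpoint=False)
    # nearest-pixel coordinates, shape (len(radii), n_samples)
    xs = np.rint(x + np.outer(radii, np.cos(angles))).astype(np.intp)
    ys = np.rint(y + np.outer(radii, np.sin(angles))).astype(np.intp)
    # one indexing call reads every circle at once
    return gray_img[ys, xs].mean(axis=1)


def best_radius(gray_img, center, radii):
    """Return (largest absolute change between consecutive radii, radius)."""
    means = ring_means(gray_img, center, radii)
    delta = np.abs(np.diff(means))
    idx = int(np.argmax(delta))
    # delta[idx] is the jump between radii[idx] and radii[idx + 1]
    return delta[idx], radii[idx]
```

The cos/sin tables depend only on the radii, so they could also be computed once and reused for every center.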

UPD: image for testing

UPD 2: I tried creating a tensor of all possible x, y, r values and mapping the function over it. That was not any faster.

Tensor creation:

# prepare img
img = cv2.imread('eye.jpg')
img = img[20:130, 20:130]
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
start_r = 10
# prepare some vars
h, w = gray.shape
img_shape = np.array([h, w])
mask = np.zeros_like(gray)
# generate points
_, s = gray.shape
a = range(0 + int(s/3), s - int(s/3), 3)
b = range(start_r, int(s/3))
all_points = list(itertools.product(a, a, b))
all_points_arr = np.array(all_points)
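
As a side note, the same (x, y, r) array can be built without going through a Python list of tuples. The sketch below uses np.meshgrid; point_grid and its step parameter are hypothetical names, and it assumes the same ranges as the snippet above.

```python
import numpy as np


def point_grid(s, start_r, step=3):
    """All (x, y, r) combinations as an (n, 3) integer array.

    Rows come out in the same order as itertools.product(a, a, b).
    """
    a = np.arange(s // 3, s - s // 3, step)  # candidate center coordinates
    b = np.arange(start_r, s // 3)           # candidate radii
    xx, yy, rr = np.meshgrid(a, a, b, indexing="ij")
    return np.stack([xx.ravel(), yy.ravel(), rr.ravel()], axis=1)
```

This only changes how the points are generated; it does not by itself speed up the per-point work.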

Daugman rewritten for this case:

def daugman_6(point, gray_img=gray):
    """Return the normalized intensity along one circle.
    point -- (x, y, r): center coordinates and radius
    gray_img -- grayscale picture as np.array(), it should be square
    Uses the module-level `mask` array as scratch space.
    """
    # get separate coordinates and radius
    x, y, r = point

    # draw circle on mask
    cv2.circle(mask, (x, y), r, 255, 1)
    # get pixels from original image
    radii = gray_img & mask  # it is faster than np or cv2
    # reset the mask
    mask.fill(0)

    return radii[radii > 0].sum() / (2 * np.pi * r)
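
Calling a Python function once per (x, y, r) point keeps the loop in the interpreter, which may be why mapping over the tensor did not help. A possible alternative, again only a sketch with hypothetical names (all_ring_means, n_samples), is to let broadcasting build the sample coordinates for every center and every radius at once, so the whole tensor is evaluated by one indexing call. It samples the nearest pixels in polar coordinates instead of using cv2.circle, and it assumes every circle fits inside the image.

```python
import numpy as np


def all_ring_means(gray_img, centers, radii, n_samples=64):
    """Mean circle intensity for every (center, radius) pair.

    centers -- (n, 2) integer array of (x, y)
    radii   -- (m,) integer array
    Returns an (n, m) float array.
    """
    angles = np.linspace(0.0, 2.0 * np.pi, n_samples, endpoint=False)
    # offsets from the center, shape (m, n_samples)
    dx = np.rint(np.outer(radii, np.cos(angles))).astype(np.intp)
    dy = np.rint(np.outer(radii, np.sin(angles))).astype(np.intp)
    # broadcast to shape (n, m, n_samples)
    xs = centers[:, 0, None, None] + dx
    ys = centers[:, 1, None, None] + dy
    return gray_img[ys, xs].mean(axis=2)
```

Taking np.diff(..., axis=1) of the result would then give the radial delta for all centers together. Memory use grows as n * m * n_samples, so this trades memory for fewer Python-level calls.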

Results on a server with a Core i9:

# iterate via numpy array with for
%%timeit
[daugman_6(i) for i in all_points_arr]
#80.6 ms ± 2.74 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

# np.fromiter() on numpy array
%%timeit
np.fromiter((daugman_6(p) for p in all_points_arr), dtype=np.float32)
#82.9 ms ± 3.75 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

# iteration over list of tuples
%%timeit
[daugman_6(i) for i in all_points]
#70 ms ± 2.86 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

# original implementation
%%timeit
find_iris(gray, 10)
#71.6 ms ± 3.51 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

0 answers:

No answers yet