Slow python image processing with PIL and numpy

Posted: 2012-02-01 23:27:17

Tags: python image-processing numpy python-imaging-library

I'm trying to implement some image processing (finding regions of similar colour) in Python using PIL and Numpy. I can't figure out how to speed up this code. Can you help?

def findRegions(self, data):
    # data is a numpy array of shape (height, width, 3); uses math.sqrt
    ret = [[False for _ in range(self.width)] for _ in range(self.height)]

    for i in range(self.height):
        for j in range(self.width):
            k = 0
            acc = 0
            # accumulate the colour distance to each in-bounds cardinal neighbour
            for x, y in [(-1, 0), (0, -1), (0, 1), (1, 0)]:
                if self.height > i + x >= 0 and self.width > j + y >= 0:
                    k = k + 1
                    acc += math.sqrt(sum((data[i][j][c] - data[i+x][j+y][c])**2 for c in range(3)))
            # mark the pixel when the mean distance to its neighbours is below the threshold
            if acc / k < self.threshold:
                ret[i][j] = True
    return ret
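
For context, data is built roughly like this from a PIL image before being passed in (the filename and the instance name are just placeholders):

from PIL import Image
import numpy

img = Image.open("input.png").convert("RGB")   # placeholder filename
data = numpy.asarray(img, dtype=float)         # shape (height, width, 3)
mask = processor.findRegions(data)             # processor: instance of the class above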

PIL and other image libraries have lots of filtering and processing functions, and they are very fast. But what is the best way to implement your own image-processing functions?

2 answers:

Answer 0 (score: 4):

Instead of looping over every row and column, you can shift the array left, right, up and down by the appropriate number of elements. On each shift you accumulate values into a base array. After shifting and accumulating, you compute the average and apply the threshold to return a mask. See this post for a general discussion of the topic. The idea is to take advantage of numpy's broadcasting, which applies a function or operator to all elements in C rather than in Python.
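
As a toy illustration of that pattern (not the full solution, just the slicing idea):

import numpy

a = numpy.arange(25, dtype=float).reshape(5, 5)

# pair every interior element with the neighbour to its right:
# both slices have the same (3, 3) shape, so the subtraction runs
# element-wise in C instead of a Python double loop
right_diff = a[1:-1, 1:-1] - a[1:-1, 2:]
print(right_diff)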

I have adapted the code from the linked post to what I believe you are trying to accomplish. In any case, the general pattern should speed things up. You will have to work out how to handle the edges of the returned mask. Here I simply leave the edges set to False, but you could also eliminate them by expanding the input data by one pixel in each direction and padding with the nearest pixel, zeros, grey, and so on; a small padding sketch follows the code below.

def findRegions(self, data):
    # data should be a float (or at least signed) array so the colour
    # differences below do not wrap around
    #define the shifts for the kernel window
    shifts = [(-1,0),(0,-1),(0,1),(1,0)]

    #make the base array of zeros
    #  its size is reduced by 2 in both dimensions (the edges are excluded)
    acc = numpy.zeros((data.shape[0]-2, data.shape[1]-2))

    #compute the square root of the sum of squared color
    # differences between a pixel and its
    # four cardinal neighbors
    for dx,dy in shifts:
        # slice stops for the shifted view; 0 becomes None so the slice
        # runs to the end of the array
        xstop = -1+dx or None
        ystop = -1+dy or None
        #per @Bago's comment, use the sum method to add up the color dimension
        #  instead of the list comprehension
        acc += ((data[1:-1,1:-1] - data[1+dx:xstop, 1+dy:ystop])**2).sum(-1)**.5

    #compute the average; interior pixels have exactly four neighbors,
    #  matching acc/k in the original loop
    acc /= len(shifts)

    #build a mask array the same size as the original
    ret = numpy.zeros(data.shape[:2], dtype=bool)

    #apply the threshold
    #  note that the edges will be False
    ret[1:-1,1:-1] = acc < self.threshold

    return ret
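
If you would rather have values at the edges than False, one way (a sketch, assuming your numpy is recent enough to ship numpy.pad and that data has shape (height, width, 3)) is to pad the image by one pixel before calling the function above:

import numpy

# pad one pixel on each side of the two image axes (not the colour axis),
# repeating the nearest edge pixel; other modes give zeros, constants, etc.
padded = numpy.pad(data, ((1, 1), (1, 1), (0, 0)), mode='edge')

# every original pixel is now an interior pixel of the padded image, so
# dropping the padded border yields a mask covering the whole image
full_mask = self.findRegions(padded)[1:-1, 1:-1]   # call from inside the same class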

Answer 1 (score: 2):

Better segmentation algorithms are available in http://scikits-image.org, but if you would like to build your own, you can look at this example based on clustering, known as ICM segmentation. Specify N = 4 to identify four regions.

import numpy as np
from scipy.cluster.vq import kmeans2

def ICM(data, N, beta):
    print "Performing ICM segmentation..."

    # Initialise segmentation using kmeans
    print "K-means initialisation..."
    clusters, labels = kmeans2(np.ravel(data), N)

    print "Iterative segmentation..."
    f = data.copy()

    def _minimise_cluster_distance(data, labels, N, beta):
        data_flat = np.ravel(data)
        cluster_means = np.array(
            [np.mean(data_flat[labels == k]) for k in range(N)]
            )
        variance = np.sum((data_flat - cluster_means[labels])**2) \
                   / data_flat.size

        # How many of the 8-connected neighbouring pixels are in the
        # same cluster?
        count = np.zeros(data.shape + (N,), dtype=int)
        count_inside = count[1:-1, 1:-1, :]

        labels_img = labels.reshape(data.shape)
        for k in range(N):
            count_inside[..., k] += (k == labels_img[1:-1, 2:])
            count_inside[..., k] += (k == labels_img[2:, 1:-1])
            count_inside[..., k] += (k == labels_img[:-2, 1:-1])
            count_inside[..., k] += (k == labels_img[1:-1, :-2])

            count_inside[..., k] += (k == labels_img[:-2, :-2])
            count_inside[..., k] += (k == labels_img[2:, 2:])
            count_inside[..., k] += (k == labels_img[:-2, 2:])
            count_inside[..., k] += (k == labels_img[2:, :-2])

        count = count.reshape((len(labels), N))
        cluster_measure = (data_flat[:, None] - cluster_means)**2 \
                          - beta * variance * count
        labels = np.argmin(cluster_measure, axis=1)

        return cluster_means, labels

    # Initialise segmentation
    cluster_means, labels = _minimise_cluster_distance(f, labels, N, 0)

    stable_counter = 0
    old_label_diff = 0
    i = 0
    while stable_counter < 3:
        i += 1

        cluster_means, labels_ = \
                       _minimise_cluster_distance(f, labels, N, beta)

        new_label_diff = np.sum(labels_ != labels)
        if  new_label_diff != old_label_diff:
            stable_counter = 0
        else:
            stable_counter += 1
        old_label_diff = new_label_diff

        labels = labels_

    print "Clustering converged after %d steps." % i

    return labels.reshape(data.shape)
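
A rough usage sketch (the filename and the beta value are placeholders; ICM as written expects a single-channel image, so this converts to greyscale first):

from PIL import Image
import numpy as np

img = Image.open("input.png").convert("L")   # placeholder filename, greyscale
data = np.asarray(img, dtype=float)

# N=4 regions as suggested above; beta weights the 8-connected
# neighbourhood term against the distance to the cluster mean
labels = ICM(data, N=4, beta=1.0)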