My Python code is below... and it takes forever. There must be some numpy tricks I can use? The images I'm analyzing are small and grayscale...
def gaussian_probability(x,mean,standard_dev):
    termA = 1.0 / (standard_dev*np.sqrt(2.0*np.pi))
    termB = np.exp(-((x - mean)**2.0)/(2.0*(standard_dev**2.0)))
    g = (termA*termB)
    return g

# The two functions below are methods of the model class (they use self).
def sum_of_gaussians(self, x):
    return sum([self.mixing_coefficients[i] *
                gaussian_probability(x, self.means[i], self.variances[i]**0.5)
                for i in range(self.num_components)])

def expectation(self):
    dim = self.image_matrix.shape
    rows, cols = dim[0], dim[1]
    responsibilities = []
    for i in range(self.num_components):
        gamma_k = np.zeros([rows, cols])
        for j in range(rows):
            for k in range(cols):
                p = (self.mixing_coefficients[i] *
                     gaussian_probability(self.image_matrix[j,k],
                                          self.means[i],
                                          self.variances[i]**0.5))
                gamma_k[j,k] = p / self.sum_of_gaussians(self.image_matrix[j,k])
        responsibilities.append(gamma_k)
    return responsibilities
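I've included only the expectation step because, although the maximization step also loops over every element of the responsibility arrays, it seems relatively fast (so perhaps the bottleneck is all the gaussian_probability calculations?).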
Answer 0 (score: 4)
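You can speed up the computation dramatically by doing two things:
1. Don't compute the normalization inside every loop iteration! As currently written, for an NxN image with M components, the normalization sum for each pixel is recomputed once per component, and each recomputation evaluates M Gaussians, so the whole step costs O[N^2 M^2] Gaussian evaluations. Instead, you should compute every unnormalized term once and then divide by the sum, which is O[N^2 M].
2. Use numpy vectorization rather than explicit loops. This is very straightforward given the way you've set up your code.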
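To illustrate the first point on its own, here is a minimal sketch that keeps the explicit loops but moves the normalization out of them. The free-function form and the name `expectation_hoisted` are assumptions made for illustration (the original code uses methods on a class); only `gaussian_probability` comes from the question.

```python
import numpy as np

def gaussian_probability(x, mean, standard_dev):
    termA = 1.0 / (standard_dev * np.sqrt(2.0 * np.pi))
    termB = np.exp(-((x - mean) ** 2.0) / (2.0 * standard_dev ** 2.0))
    return termA * termB

def expectation_hoisted(image, mixing_coefficients, means, variances):
    """Loop-based E-step that evaluates each weighted Gaussian exactly once.

    Hypothetical helper: same result as the original loops, but the
    normalization is computed in a single pass at the end.
    """
    rows, cols = image.shape
    num_components = len(means)
    weighted = np.zeros((num_components, rows, cols))
    for i in range(num_components):
        for j in range(rows):
            for k in range(cols):
                weighted[i, j, k] = mixing_coefficients[i] * gaussian_probability(
                    image[j, k], means[i], variances[i] ** 0.5)
    # One normalization pass: sum over the component axis, then divide.
    total = weighted.sum(axis=0)
    return weighted / total
```

This still pays the Python loop overhead for every pixel, so it mainly shows where the redundant work was; the vectorized form removes the loops as well.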
Basically, your expectation function should look like this:
def expectation(self):
    responsibilities = (self.mixing_coefficients[:, None, None] *
                        gaussian_probability(self.image_matrix,
                                             self.means[:, None, None],
                                             self.variances[:, None, None] ** 0.5))
    return responsibilities / responsibilities.sum(0)
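The `[:, None, None]` indexing is what makes this work, so a small shape walkthrough may help. The sizes and the stand-in weight function below are illustrative assumptions, not values from the question; the point is only how the array shapes combine under numpy broadcasting.

```python
import numpy as np

num_components, rows, cols = 3, 4, 5           # illustrative sizes
image = np.random.rand(rows, cols)             # shape (4, 5)
means = np.linspace(0.0, 1.0, num_components)  # shape (3,)

# Indexing with None inserts length-1 axes: (3,) -> (3, 1, 1).
means_col = means[:, None, None]
print(means_col.shape)            # (3, 1, 1)

# (4, 5) against (3, 1, 1) broadcasts to (3, 4, 5): every pixel is
# paired with every component mean without an explicit loop.
diff = image - means_col
print(diff.shape)                 # (3, 4, 5)

# Summing over axis 0 collapses the component axis, giving one
# normalization constant per pixel; dividing broadcasts it back.
weights = np.exp(-diff ** 2)      # stand-in for the weighted Gaussians
normalized = weights / weights.sum(0)
print(normalized.sum(0).shape)    # (4, 5)
```

After the division, the values along axis 0 sum to 1 at every pixel, which is the property the responsibilities are supposed to have.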
You didn't provide a complete example, so I had to improvise a bit to check and benchmark this, but here is a quick take:
import numpy as np

def gaussian_probability(x,mean,standard_dev):
    termA = 1.0 / (standard_dev*np.sqrt(2.0*np.pi))
    termB = np.exp(-((x - mean)**2.0)/(2.0*(standard_dev**2.0)))
    return termA * termB
class EM(object):
    def __init__(self, N=5):
        self.image_matrix = np.random.rand(20, 20)
        self.num_components = N
        self.mixing_coefficients = 1 + np.random.rand(N)
        self.means = 10 * np.random.rand(N)
        self.variances = np.ones(N)

    def sum_of_gaussians(self, x):
        return sum([self.mixing_coefficients[i] *
                    gaussian_probability(x, self.means[i], self.variances[i]**0.5)
                    for i in range(self.num_components)])

    def expectation(self):
        dim = self.image_matrix.shape
        rows, cols = dim[0], dim[1]
        responsibilities = []
        for i in range(self.num_components):
            gamma_k = np.zeros([rows, cols])
            for j in range(rows):
                for k in range(cols):
                    p = (self.mixing_coefficients[i] *
                         gaussian_probability(self.image_matrix[j,k],
                                              self.means[i],
                                              self.variances[i]**0.5))
                    gamma_k[j,k] = p / self.sum_of_gaussians(self.image_matrix[j,k])
            responsibilities.append(gamma_k)
        return responsibilities

    def expectation_fast(self):
        responsibilities = (self.mixing_coefficients[:, None, None] *
                            gaussian_probability(self.image_matrix,
                                                 self.means[:, None, None],
                                                 self.variances[:, None, None] ** 0.5))
        return responsibilities / responsibilities.sum(0)
Now we can instantiate the object and compare the two implementations of the expectation step:
em = EM(5)
np.allclose(em.expectation(),
            em.expectation_fast())
# True
Timing-wise, for a 20x20 image with 5 components, we get a speedup of roughly a factor of 1000:
%timeit em.expectation()
10 loops, best of 3: 65.9 ms per loop
%timeit em.expectation_fast()
10000 loops, best of 3: 74.5 µs per loop
This improvement will grow as the image size and the number of components increase. Good luck!
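If you want to see how the gap changes with image size on your own machine, and you are not running inside IPython (where `%timeit` is available), the standard-library `timeit` module can be used instead. The sketch below is an assumption-laden rewrite of the two implementations as free functions; the names `expectation_loops`, `expectation_vectorized`, and `compare` are hypothetical, and the image sizes are arbitrary. No timings are quoted here because they depend on hardware and library versions.

```python
import timeit
import numpy as np

def gaussian_probability(x, mean, standard_dev):
    termA = 1.0 / (standard_dev * np.sqrt(2.0 * np.pi))
    termB = np.exp(-((x - mean) ** 2.0) / (2.0 * standard_dev ** 2.0))
    return termA * termB

def expectation_loops(image, weights, means, variances):
    # Same structure as the original: the normalization sum is
    # recomputed inside the innermost loop.
    rows, cols = image.shape
    n = len(means)
    out = np.zeros((n, rows, cols))
    for i in range(n):
        for j in range(rows):
            for k in range(cols):
                x = image[j, k]
                p = weights[i] * gaussian_probability(x, means[i], variances[i] ** 0.5)
                total = sum(weights[m] * gaussian_probability(x, means[m], variances[m] ** 0.5)
                            for m in range(n))
                out[i, j, k] = p / total
    return out

def expectation_vectorized(image, weights, means, variances):
    r = weights[:, None, None] * gaussian_probability(
        image, means[:, None, None], variances[:, None, None] ** 0.5)
    return r / r.sum(0)

def compare(size, num_components, repeats=3):
    """Return (best loop time, best vectorized time) in seconds."""
    rng = np.random.default_rng(0)
    image = rng.random((size, size))
    weights = 1 + rng.random(num_components)
    means = rng.random(num_components)
    variances = np.ones(num_components)
    args = (image, weights, means, variances)
    # Make sure both versions agree before timing them.
    assert np.allclose(expectation_loops(*args), expectation_vectorized(*args))
    t_loop = min(timeit.repeat(lambda: expectation_loops(*args), number=1, repeat=repeats))
    t_vec = min(timeit.repeat(lambda: expectation_vectorized(*args), number=1, repeat=repeats))
    return t_loop, t_vec

for size in (10, 20, 40):
    t_loop, t_vec = compare(size, num_components=5)
    print(f"{size}x{size}: loops {t_loop:.4f} s, vectorized {t_vec:.6f} s")
```

Taking the minimum over a few repeats mirrors what `%timeit` reports as "best of 3"; a single run per repeat keeps the slow loop version from dominating the total runtime.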