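I am trying to solve the homography problem (computer vision) using an optimization technique rather than the DLT (Direct Linear Transform) method. A simple illustration of the cost function is given here: (Illustration of cost function)
I have implemented this cost function in Python: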
import numpy
from numpy.linalg import inv

def costFunc(H, p1, p2):
    cost = 0.
    H = H.reshape((3, 3))
    H_inv = inv(H)
    for i in range(0, p1.shape[0]):
        # Forward transformation: map p1[i] with H and compare with p2[i]
        x = p1[i, :]
        x_dash = p2[i, :]
        x = numpy.reshape(x, [1, 3])
        x_dash_estimated = applyTransformation(H, x)
        diff = numpy.sum(numpy.square(numpy.subtract(x_dash_estimated, x_dash)))
        cost = cost + diff
        # Inverse transformation: map p2[i] with inv(H) and compare with p1[i]
        x = p2[i, :]
        x_dash = p1[i, :]
        x = numpy.reshape(x, [1, 3])
        x_dash_estimated = applyTransformation(H_inv, x)
        diff = numpy.sum(numpy.square(numpy.subtract(x_dash_estimated, x_dash)))
        cost = cost + diff
    # Average over the number of correspondences
    return cost/p1.shape[0]
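Written out, the function above computes what is commonly called the symmetric transfer error, averaged over the N correspondences. This is a sketch of the formula rather than a copy of the illustration: the symbol π (division by the third homogeneous coordinate) is introduced here only for convenience, and the rounding performed inside applyTransformation is ignored.

```latex
C(H) = \frac{1}{N} \sum_{i=1}^{N}
       \left( \left\| \pi(H x_i) - x'_i \right\|^2
            + \left\| \pi(H^{-1} x'_i) - x_i \right\|^2 \right),
\qquad
\pi\big((u, v, w)^\top\big) = (u/w,\; v/w,\; 1)^\top
```

In the code, x_i is row i of p1, x'_i is row i of p2, and N is p1.shape[0].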
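Here, H is the same H matrix as in the illustration (but flattened to a 1×9 vector), and p1 and p2 are the 4 corresponding points in homogeneous coordinates. Both are 4×3 matrices. The corresponding points between the images are as follows: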
refPt = numpy.array([[[182, 267, 1], [119, 270, 1]],
                     [[264, 111, 1], [202, 110, 1]],
                     [[544, 92, 1], [479, 95, 1]],
                     [[329, 356, 1], [269, 357, 1]]])
Each row holds a pair of corresponding points, one from each image. They are split into p1 and p2:
p1 = refPt[:, 0, :]
p2 = refPt[:, 1, :]
The function applyTransformation simply applies the homography to the input points. It is defined as follows:
def applyTransformation(H, points):
    output = numpy.zeros(shape=[points.shape[0], 3], dtype=numpy.int32)
    for i in range(0, points.shape[0]):
        temp = numpy.dot(H, points[i, :])
        temp = temp / temp[2]
        temp[0] = numpy.round(temp[0])
        temp[1] = numpy.round(temp[1])
        output[i, :] = temp
    return output
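Note that this function rounds the mapped coordinates and stores them in an int32 array, so it only ever returns whole-pixel positions. For comparison, here is a minimal sketch of a variant that keeps sub-pixel values. The name apply_transformation_float is hypothetical and not part of the code above; it assumes points is an (N, 3) array of homogeneous coordinates.

```python
import numpy as np

def apply_transformation_float(H, points):
    """Apply a 3x3 homography to an (N, 3) array of homogeneous points.

    Hypothetical float-valued counterpart of applyTransformation:
    no rounding and no integer output array.
    """
    H = np.asarray(H, dtype=float).reshape(3, 3)
    points = np.asarray(points, dtype=float)
    mapped = points @ H.T            # row i equals H @ points[i]
    return mapped / mapped[:, 2:3]   # normalise so the third coordinate is 1
```

With the identity matrix as H the points come back unchanged, and for any other H the result keeps its fractional part instead of snapping to the pixel grid.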
Now, when I try to minimize this function, I get stuck in some local minimum and never reach the global optimum. My optimization code is as follows:
from scipy import optimize

initial_H = numpy.random.rand(3, 3) * 1.
initial_h = initial_H.flatten()
result = optimize.minimize(fun=costFunc,
                           x0=initial_h,
                           args=(p1, p2),
                           method='TNC')
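One thing that may be worth checking here: no gradient is passed to optimize.minimize, so the method has to estimate it numerically from tiny changes in H, and a cost built on rounded coordinates may not react to such changes at all. The sketch below illustrates this with a simplified, forward-only cost that mirrors the rounding step. The names rounded_cost, h0 and eps are hypothetical, the identity matrix is used as an arbitrary starting point, and eps = 1e-8 is an assumed step size of the order used for finite differences.

```python
import numpy as np

def rounded_cost(h, p1, p2):
    """Forward transfer error with rounded coordinates (simplified illustration)."""
    H = h.reshape(3, 3)
    mapped = p1 @ H.T
    mapped = mapped / mapped[:, 2:3]
    mapped = np.round(mapped)        # same kind of rounding as in applyTransformation
    return np.sum((mapped - p2) ** 2) / p1.shape[0]

# The four correspondences from the question
p1 = np.array([[182, 267, 1], [264, 111, 1], [544, 92, 1], [329, 356, 1]], dtype=float)
p2 = np.array([[119, 270, 1], [202, 110, 1], [479, 95, 1], [269, 357, 1]], dtype=float)

h0 = np.eye(3).flatten()   # assumed starting point, chosen only for illustration
eps = 1e-8                 # assumed finite-difference step

# Forward-difference estimate of the gradient, one parameter at a time
base = rounded_cost(h0, p1, p2)
grad = np.array([
    (rounded_cost(h0 + eps * np.eye(9)[k], p1, p2) - base) / eps
    for k in range(9)
])
print(grad)
```

At this starting point every component of the estimated gradient is zero even though the cost itself is far from zero, because no perturbation of that size moves a mapped point across a rounding boundary.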
I would appreciate any suggestions on how to solve this problem.