Question

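I have a set of coordinates (x, y, z(x, y)) that describe intensities (z) at coordinates x, y. For a set number of these intensities at different coordinates, I need to fit a 2D Gaussian that minimizes the mean squared error. The data is in numpy matrices, and for each fitting session I will have either 4, 9, 16 or 25 coordinates. Ultimately I just need the central position (x_0, y_0) of the Gaussian with the smallest MSE. All of the examples I have found use scipy.optimize.curve_fit, but their input data covers an entire mesh rather than a few coordinates. Any help would be appreciated.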

Answer 1

Introduction

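There are multiple ways to approach this. You can use non-linear methods (e.g. scipy.optimize.curve_fit), but they will be slow and are not guaranteed to converge. You can linearize the problem (fast, unique solution), but any noise in the "tails" of the distribution will cause issues. There are actually a few tricks you can apply to this particular case to avoid the latter issue. I'll show some examples, but I don't have time to demonstrate all of the "tricks" right now.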

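As a side note, a general 2D Gaussian has 6 parameters, so you won't be able to fully fit it with 4 points. However, it sounds like you might be assuming that there is no covariance between x and y and that the variance is the same in each direction (i.e. a perfectly "round" bell curve). If that's the case, you only need four parameters. If you know the amplitude of the Gaussian, you only need three. However, I'm going to start with the general solution, and you can simplify it later if you want to.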
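For reference, the simplified "round" case mentioned above could look roughly like the sketch below. It is not part of the original answer: the name gauss2d_round, the 3x3 patch and the parameter values are made up for illustration, and the data is noise-free.

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss2d_round(xy, amp, x0, y0, sigma):
    """Circular 2D Gaussian: one shared width, no x-y covariance."""
    x, y = xy
    r2 = (x - x0)**2 + (y - y0)**2
    return amp * np.exp(-r2 / (2 * sigma**2))

# Hypothetical 3x3 patch of pixel coordinates (9 points, 4 unknowns).
yy, xx = np.mgrid[0:3, 0:3]
xy = np.vstack([xx.ravel(), yy.ravel()]).astype(float)

true_params = (1.0, 1.2, 0.8, 0.9)   # amp, x0, y0, sigma (made up)
z = gauss2d_round(xy, *true_params)  # noise-free, for illustration only

# Start the center at the brightest sample, as a plain heuristic.
i = z.argmax()
guess = (z[i], xy[0, i], xy[1, i], 1.0)
params, cov = curve_fit(gauss2d_round, xy, z, p0=guess)
x0_fit, y0_fit = params[1], params[2]
print("fitted center:", x0_fit, y0_fit)
```

With four unknowns, four points leave no redundancy at all, so any noise goes straight into the parameters; 9 or more points are a safer starting place.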

For the moment, let's focus on solving this problem with non-linear methods (e.g. scipy.optimize.curve_fit).

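The general equation for a 2D Gaussian is (directly from Wikipedia):

$$f(x, y) = A \exp\left(-\left(a (x - x_0)^2 + 2 b (x - x_0)(y - y_0) + c (y - y_0)^2\right)\right)$$

where:

$$\begin{pmatrix} a & b \\ b & c \end{pmatrix}$$ is essentially 0.5 times the inverse of the covariance matrix, A is the amplitude, and (x_0, y_0) is the center.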
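To make the link between (a, b, c) and the covariance matrix concrete, here is a small sketch. The helper names abc_from_covariance and is_valid_gaussian and the covariance values are hypothetical; the only fact relied on is that the exponent of a Gaussian is -0.5 times the quadratic form of the inverse covariance.

```python
import numpy as np

def abc_from_covariance(cov):
    """Return (a, b, c) with [[a, b], [b, c]] = 0.5 * inverse(cov)."""
    m = 0.5 * np.linalg.inv(np.asarray(cov, dtype=float))
    return m[0, 0], m[0, 1], m[1, 1]

def is_valid_gaussian(a, b, c):
    """The quadratic form must be positive definite for the surface to decay."""
    return a > 0 and a * c - b**2 > 0

# Made-up covariance: sigma_x = 0.5, sigma_y = 0.25, mild correlation.
cov = np.array([[0.25, 0.03],
                [0.03, 0.0625]])
a, b, c = abc_from_covariance(cov)
print(a, b, c, is_valid_gaussian(a, b, c))
```

Under this equation, the values a, b, c = 2, 3, 4 would not pass the check, because 2 * 4 < 3**2.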

Generating simplified sample data

Let's write the equation above out:

import numpy as np
import matplotlib.pyplot as plt

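def gauss2d(x, y, amp, x0, y0, a, b, c):
    # Note: the Wikipedia cross term is 2*b*(x - x0)*(y - y0); here both
    # factors are squared. With the a, b, c = 2, 3, 4 used below, the
    # Wikipedia form would not decay (a*c < b**2), so this is kept as written.
    inner = a * (x - x0)**2
    inner += 2 * b * (x - x0)**2 * (y - y0)**2
    inner += c * (y - y0)**2
    return amp * np.exp(-inner)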

Then let's generate some example data. To start with, we'll generate some data that will be easy to fit:

np.random.seed(1977) # For consistency
x, y = np.random.random((2, 10))
x0, y0 = 0.3, 0.7
amp, a, b, c = 1, 2, 3, 4

zobs = gauss2d(x, y, amp, x0, y0, a, b, c)

fig, ax = plt.subplots()
scat = ax.scatter(x, y, c=zobs, s=200)
fig.colorbar(scat)
plt.show()

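[Figure: scatter plot of the ten sample points, colored by zobs]

Notice that we haven't added any noise, and the center of the distribution is within the range where we have data (i.e. the center is at 0.3, 0.7 and the x, y observations are scattered between 0 and 1). For the moment, let's stick with this, and then we'll see what happens when we add noise and shift the center.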

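Non-linear fitting

To start with, let's use scipy.optimize.curve_fit to perform a non-linear least-squares fit to the Gaussian function. (On a side note, you can play around with the exact minimization algorithm by using some of the other functions in scipy.optimize.)

The scipy.optimize functions expect a slightly different function signature than the one we wrote above. We could write a wrapper to "translate", but let's just rewrite the gauss2d function instead: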

def gauss2d(xy, amp, x0, y0, a, b, c):
    x, y = xy
    inner = a * (x - x0)**2
    inner += 2 * b * (x - x0)**2 * (y - y0)**2
    inner += c * (y - y0)**2
    return amp * np.exp(-inner)

All we did was make the function expect the independent variables (x and y) as a single 2xN array.
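If the intensities arrive as a small square matrix (4, 9, 16 or 25 values, as in the question), they need to be flattened into that 2xN layout first. The following is a minimal sketch under assumptions: patch_to_xy, x_start, y_start and step are hypothetical names, and rows are assumed to index y while columns index x.

```python
import numpy as np

def patch_to_xy(patch, x_start=0.0, y_start=0.0, step=1.0):
    """Flatten a small 2D intensity patch into (xy, z) for curve_fit.

    x_start, y_start and step are hypothetical parameters describing where
    the patch sits in the larger image; adjust them to your own layout.
    """
    patch = np.asarray(patch, dtype=float)
    ny, nx = patch.shape
    yy, xx = np.mgrid[0:ny, 0:nx]          # row index -> y, column index -> x
    x = x_start + step * xx.ravel()
    y = y_start + step * yy.ravel()
    return np.vstack([x, y]), patch.ravel()

# A made-up 2x2 patch: 4 samples.
patch = np.array([[0.2, 0.5],
                  [0.4, 0.9]])
xy, z = patch_to_xy(patch, x_start=10, y_start=20)
print(xy.shape, z.shape)
```

For points that are not on a regular grid, np.vstack([x, y]) on two 1D coordinate arrays gives the same layout directly.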

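Now we need to make an initial guess at what the Gaussian curve's parameters actually are. This is optional (the default is all ones, if I recall correctly), but you are likely to have convergence problems if 1, 1 is not particularly close to the "true" center of the Gaussian curve. For that reason, we'll use the x and y values of the largest observed z-value as the starting point for the center. I'll leave the rest of the parameters at 1, but if you know they are likely to be consistently and significantly different, change them to something more reasonable.

Here's the full, stand-alone example: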

import numpy as np
import scipy.optimize as opt
import matplotlib.pyplot as plt

def main():
    x0, y0 = 0.3, 0.7
    amp, a, b, c = 1, 2, 3, 4
    true_params = [amp, x0, y0, a, b, c]
    xy, zobs = generate_example_data(10, true_params)
    x, y = xy

    i = zobs.argmax()
    guess = [1, x[i], y[i], 1, 1, 1]
    pred_params, uncert_cov = opt.curve_fit(gauss2d, xy, zobs, p0=guess)

    zpred = gauss2d(xy, *pred_params)
    print('True parameters: ', true_params)
    print('Predicted params:', pred_params)
    print('Residual, RMS(obs - pred):', np.sqrt(np.mean((zobs - zpred)**2)))

    plot(xy, zobs, pred_params)
    plt.show()

def gauss2d(xy, amp, x0, y0, a, b, c):
    x, y = xy
    inner = a * (x - x0)**2
    # Cross term kept as written (both factors squared), unlike Wikipedia's
    # 2*b*(x - x0)*(y - y0).
    inner += 2 * b * (x - x0)**2 * (y - y0)**2
    inner += c * (y - y0)**2
    return amp * np.exp(-inner)

def generate_example_data(num, params):
    np.random.seed(1977) # For consistency
    xy = np.random.random((2, num))

    zobs = gauss2d(xy, *params)
    return xy, zobs

def plot(xy, zobs, pred_params):
    x, y = xy
    yi, xi = np.mgrid[:1:30j, -.2:1.2:30j]
    xyi = np.vstack([xi.ravel(), yi.ravel()])

    zpred = gauss2d(xyi, *pred_params)
    zpred.shape = xi.shape

    fig, ax = plt.subplots()
    ax.scatter(x, y, c=zobs, s=200, vmin=zpred.min(), vmax=zpred.max())
    im = ax.imshow(zpred, extent=[xi.min(), xi.max(), yi.max(), yi.min()],
                   aspect='auto')
    fig.colorbar(im)
    ax.invert_yaxis()
    return fig

main()

[Figure: the observed points plotted over the fitted Gaussian surface]

In this case, we exactly(ish) recover our original "true" parameters.

True parameters:  [1, 0.3, 0.7, 2, 3, 4]
Predicted params: [ 1.   0.3  0.7  2.   3.   4. ]
Residual, RMS(obs - pred): 1.01560615193e-16

As we'll see in a second, this won't always be the case...

Adding Noise

Let's add some noise to our observations. All I've done here is change the generate_example_data function:

def generate_example_data(num, params):
    np.random.seed(1977) # For consistency
    xy = np.random.random((2, num))

    noise = np.random.normal(0, 0.3, num)
    zobs = gauss2d(xy, *params) + noise
    return xy, zobs

However, the result looks quite different:

[Figure: the noisy observations plotted over the fitted surface]

And as far as the parameters go:

True parameters:  [1, 0.3, 0.7, 2, 3, 4]
Predicted params: [  1.129    0.263   0.750   1.280   32.333   10.103  ]
Residual, RMS(obs - pred): 0.152444640098

The predicted center hasn't changed much, but the b and c parameters have changed quite a bit.
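One way to see which parameters a fit constrains well is the covariance matrix that curve_fit returns alongside the parameters. The sketch below is a separate illustration, not the answer's code: it assumes the Wikipedia cross term, a made-up 5x5 grid and made-up parameter values, and treats the square roots of the diagonal as rough one-sigma estimates.

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss2d_wiki(xy, amp, x0, y0, a, b, c):
    """General 2D Gaussian with the cross term 2*b*(x - x0)*(y - y0)."""
    x, y = xy
    dx, dy = x - x0, y - y0
    return amp * np.exp(-(a * dx**2 + 2 * b * dx * dy + c * dy**2))

rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:1:5j, 0:1:5j]           # 25 made-up sample points
xy = np.vstack([xx.ravel(), yy.ravel()])
true_params = [1.0, 0.4, 0.6, 3.0, 0.5, 4.0]
z = gauss2d_wiki(xy, *true_params) + rng.normal(0, 0.02, xy.shape[1])

# Same starting heuristic as above: center at the brightest sample.
i = z.argmax()
guess = [z[i], xy[0, i], xy[1, i], 1, 0, 1]
params, cov = curve_fit(gauss2d_wiki, xy, z, p0=guess)
stderr = np.sqrt(np.diag(cov))              # rough 1-sigma estimates

for name, p, s in zip(["amp", "x0", "y0", "a", "b", "c"], params, stderr):
    print(f"{name:>3} = {p:7.3f} +/- {s:.3f}")
```

Since only (x0, y0) is needed in the question, params[1] and params[2] with their entries in stderr are the values to keep.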

If we change the center of the function to somewhere slightly outside of our scatter of points:

x0, y0 = -0.3, 1.1

We'll get complete nonsense as a result in the presence of noise! (It still works correctly without noise.)

True parameters:  [1, -0.3, 1.1, 2, 3, 4]
Predicted params: [  0.546  -0.939   0.857  -0.488  44.069  -4.136]
Residual, RMS(obs - pred): 0.235664449826

[Figure: the fitted surface for the off-center, noisy case]
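Negative values for a and c, as in the output above, describe a surface that grows rather than decays. One possible guard, which is not part of the original answer, is the bounds argument of curve_fit. The sketch below assumes the Wikipedia cross term, made-up data, and hypothetical limits; it does not enforce a*c > b**2, so it only rules out the most obvious nonsense.

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss2d_wiki(xy, amp, x0, y0, a, b, c):
    """General 2D Gaussian with the cross term 2*b*(x - x0)*(y - y0)."""
    x, y = xy
    dx, dy = x - x0, y - y0
    return amp * np.exp(-(a * dx**2 + 2 * b * dx * dy + c * dy**2))

rng = np.random.default_rng(1)
yy, xx = np.mgrid[0:1:4j, 0:1:4j]            # 16 made-up sample points
xy = np.vstack([xx.ravel(), yy.ravel()])
true_params = [1.0, 0.3, 0.7, 3.0, 0.5, 4.0]
z = gauss2d_wiki(xy, *true_params) + rng.normal(0, 0.05, xy.shape[1])

# Hypothetical limits: positive amplitude and widths, and a center that
# stays within a margin of the sampled region. Tune these to your data.
margin = 0.5
lower = [0, xy[0].min() - margin, xy[1].min() - margin, 1e-3, -50, 1e-3]
upper = [np.inf, xy[0].max() + margin, xy[1].max() + margin, 50, 50, 50]

i = z.argmax()
guess = [max(z[i], 1e-3), xy[0, i], xy[1, i], 1, 0, 1]
params, cov = curve_fit(gauss2d_wiki, xy, z, p0=guess, bounds=(lower, upper))
print("bounded center estimate:", params[1], params[2])
```

A parameter that ends up sitting exactly on one of its limits is a hint that the data does not constrain it, not a trustworthy estimate.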

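Nonsense fits like this are a common problem when fitting a function that decays to zero. Any noise in the "tails" can lead to a very poor result. There are a number of strategies to deal with this. One of the easiest is to weight the inversion by the observed z-values. Here is an example for the 1D case (focusing on linearizing the problem): How can I perform a least-squares fitting over multiple data sets fast? If I have time later, I'll add an example of this for the 2D case.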
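As a rough idea of what the weighted, linearized approach might look like in 2D (this is a sketch, not the author's example): taking the log of the Wikipedia form gives a quadratic polynomial in x and y, which can be fit by linear least squares with each row scaled by its observed z. The function name fit_center_linearized is hypothetical, and the method assumes z > 0 and at least 6 points.

```python
import numpy as np

def fit_center_linearized(x, y, z):
    """Weighted linear least-squares estimate of a 2D Gaussian center.

    Fits ln(z) = k0 + k1*x + k2*y + k3*x**2 + k4*x*y + k5*y**2 with each
    row scaled by z, then solves for the point where the gradient is zero.
    """
    x, y, z = (np.asarray(v, dtype=float) for v in (x, y, z))
    keep = z > 0                      # the log is undefined otherwise
    x, y, z = x[keep], y[keep], z[keep]

    G = np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])
    w = z                             # bright samples count more than tails
    k, *_ = np.linalg.lstsq(G * w[:, None], np.log(z) * w, rcond=None)

    # Gradient of the fitted quadratic vanishes at the center.
    H = np.array([[2 * k[3], k[4]],
                  [k[4], 2 * k[5]]])
    x0, y0 = np.linalg.solve(H, -k[1:3])
    return x0, y0

# Noise-free check with made-up parameters (center at 0.3, 0.7).
rng = np.random.default_rng(2)
x, y = rng.random((2, 9))
dx, dy = x - 0.3, y - 0.7
z = 1.0 * np.exp(-(3.0 * dx**2 + 2 * 0.5 * dx * dy + 4.0 * dy**2))
x0_lin, y0_lin = fit_center_linearized(x, y, z)
print(x0_lin, y0_lin)
```

The estimate can also serve as the starting guess for curve_fit instead of the brightest sample. With noisy data, samples whose z is zero or negative are simply dropped here, which is a crude choice.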

2D Gaussian fit for intensities at certain coordinates in Python
