Question

(Sorry, "cost function" may be the wrong term.)

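I am trying to fit some mass spectrometry data where I know what most of the peaks are, but not all of them. So when I am done, I expect to be left with a positive residual. It looks like most of the scipy fitting algorithms just try to minimize the RMS residual, so I don't see a way to penalize negative residuals more than positive ones.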

To be clear, here is some simple code:

import numpy as np
from scipy import optimize
import matplotlib.pyplot as plt


def gaussian(x, a, x0, sigma):
    return a * np.exp(-(x - x0)**2.0 / (2.0 * sigma**2))


def get_y_with_noise(x, peaks):
    """Return the input spectrum with some crude shot-like noise added."""
    n = len(x)
    y = np.zeros(n)
    for peak in peaks:
        y += gaussian(x, *peak)
    # Add some noise.
    y += np.random.randn(n) * 0.05 * np.sqrt(y)
    # Clip negative intensities to zero.
    y[y < 0.0] = 0.0
    return y


def fit_peaks_and_plot(x, peaks):
    """Generate noisy data from the peaks, fit a single Gaussian, and plot."""

    y = get_y_with_noise(x, peaks)
    # Make a really good guess of starting params.
    p0 = peaks[0]

    # Fit the data.
    popt, pcov = optimize.curve_fit(gaussian, x, y, p0=p0)

    # Plot the data, the fit, and the residuals (data minus fit).
    plt.figure()
    plt.plot(x, y)
    plt.plot(x, gaussian(x, *popt))
    plt.plot(x, y - gaussian(x, *popt))
    plt.show()


# Define our data range.
x = np.arange(-10.0, 10.0, 0.01)
# Some peaks that separate nicely.
peaks_1 = [[1.0, 0.0, 1.0], [0.5, 5.0, 1.0]]
# Some peaks that are too close for comfort.
peaks_2 = [[1.0, 0.0, 1.0], [0.5, 2.0, 1.0]]

# Fit and plot both sets of peaks.
fit_peaks_and_plot(x, peaks_1)
fit_peaks_and_plot(x, peaks_2)

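The first set of peaks separates nicely. The second set of peaks overlaps, so we end up trying to fit a non-Gaussian shape with a single Gaussian, which leaves a significant negative residual.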

I would like to modify the cost function to penalize negative residuals more than positive ones.

I believe that in my example curve_fit is trying to minimize:

np.sum( ((f(xdata, *popt) - ydata) / sigma)**2 )
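To make that objective concrete, here is a minimal sketch that evaluates the same sum of squared weighted residuals by hand, before and after the fit. It assumes sigma is 1 for every point (which is how curve_fit behaves when no sigma is passed) and uses noise-free overlapping peaks as stand-in data. The helper name chi_square is hypothetical, not part of scipy.

```python
import numpy as np
from scipy import optimize


def gaussian(x, a, x0, sigma):
    return a * np.exp(-(x - x0)**2.0 / (2.0 * sigma**2))


def chi_square(params, f, xdata, ydata, sigma=1.0):
    """Sum of squared weighted residuals (hypothetical helper)."""
    return np.sum(((f(xdata, *params) - ydata) / sigma)**2)


x = np.arange(-10.0, 10.0, 0.01)
# Two overlapping peaks, but only one Gaussian in the model.
y = gaussian(x, 1.0, 0.0, 1.0) + gaussian(x, 0.5, 2.0, 1.0)

p0 = [1.0, 0.0, 1.0]
popt, _ = optimize.curve_fit(gaussian, x, y, p0=p0)

# The fit should lower this quantity relative to the starting guess.
cost_start = chi_square(p0, gaussian, x, y)
cost_fit = chi_square(popt, gaussian, x, y)
print(cost_start, cost_fit)
```

Because the squared term treats both signs identically, nothing in this objective distinguishes a fit that overshoots the data from one that undershoots it.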

As a toy model, you could instead try to minimize:

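# Residual as plotted above: data minus model.
weighted_res = (ydata - f(xdata, *popt)) / sigma
# Weight negative residuals (model above data) more heavily.
weighted_res[weighted_res < 0.0] *= 10.0
np.sum(weighted_res**2)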

Obviously, I can define a function that returns weighted_res and try to drive it to zero, but that seems like a really roundabout way to do this.
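For what it is worth, returning a residual vector is exactly the interface scipy.optimize.least_squares expects: it minimizes half the sum of squares of whatever the residual function returns. The sketch below is one possible way to use that, not a definitive answer. It assumes the residual is defined as data minus model (the quantity plotted above), an arbitrary penalty factor of 10, and noise-free data; the function name asymmetric_residuals is hypothetical.

```python
import numpy as np
from scipy import optimize


def gaussian(x, a, x0, sigma):
    return a * np.exp(-(x - x0)**2.0 / (2.0 * sigma**2))


def asymmetric_residuals(params, x, y, penalty=10.0):
    """Residuals (data - model), with negative ones scaled by `penalty`."""
    res = y - gaussian(x, *params)
    # A negative residual means the model overshoots the data there.
    return np.where(res < 0.0, penalty * res, res)


x = np.arange(-10.0, 10.0, 0.01)
# Overlapping peaks, fitted with a single Gaussian.
y = gaussian(x, 1.0, 0.0, 1.0) + gaussian(x, 0.5, 2.0, 1.0)
p0 = [1.0, 0.0, 1.0]

# Ordinary symmetric fit for comparison.
plain, _ = optimize.curve_fit(gaussian, x, y, p0=p0)
# Asymmetric fit: least_squares squares and sums the returned vector.
result = optimize.least_squares(asymmetric_residuals, p0, args=(x, y))

# Most negative residual under each fit.
min_plain = np.min(y - gaussian(x, *plain))
min_asym = np.min(y - gaussian(x, *result.x))
print(min_plain, min_asym)
```

Since the returned values are squared, a factor of 10 on the residual weights negative excursions 100 times more in the cost. The scaled residual has a kink at zero, so convergence may be less smooth than for an ordinary least-squares problem.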

Scipy optimize custom cost function

0 answers: