Fitting a Gaussian, getting a straight line. Python 2.7

Asked: 2015-04-13 06:40:35

Tags: python-2.7 curve-fitting gaussian

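As the title suggests, I am trying to fit a Gaussian to some data and I am just getting a straight line. I have been following two other discussions, "Gaussian fit for Python" and "Fitting a gaussian to a curve in Python", which seem to cover essentially the same problem. I can get the code from those discussions to work on the data they provide, but it does not work on my data.

My code looks like this: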

import pylab as plb
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import numpy as np    # np.sqrt is used below
from numpy import asarray as ar, exp

y = y - y[0]    # to make it go to zero on both sides
x = range(len(y))

max_y = max(y)

n = len(y)
mean = sum(x*y)/n
sigma = np.sqrt(sum(y*(x-mean)**2)/n)
# Someone on a previous post seemed to think this needed to have the sqrt.
# Tried it without as well, made no difference. 

def gaus(x,a,x0,sigma):
    return a*exp(-(x-x0)**2/(2*sigma**2))

popt,pcov = curve_fit(gaus,x,y,p0=[max_y,mean,sigma])    
# It was suggested in one of the other posts I looked at to make the
# first element of p0 be the maximum value of y.
# I also tried it as 1, but that did not work either

plt.plot(x,y,'b:',label='data')
plt.plot(x,gaus(x,*popt),'r:',label='fit')
plt.legend()
plt.title('Fig. 3 - Fit for Time Constant')
plt.xlabel('Time (s)')
plt.ylabel('Voltage (V)')
plt.show()

The data I want to fit is as follows:

y = np.array([  6.95301373e+12,   9.62971320e+12,   1.32501876e+13,
     1.81150568e+13,   2.46111132e+13,   3.32321345e+13,
     4.45978682e+13,   5.94819771e+13,   7.88394616e+13,
     1.03837779e+14,   1.35888594e+14,   1.76677210e+14,
     2.28196006e+14,   2.92781632e+14,   3.73133045e+14,
     4.72340762e+14,   5.93892782e+14,   7.41632194e+14,
     9.19750269e+14,   1.13278296e+15,   1.38551838e+15,
     1.68291212e+15,   2.02996957e+15,   2.43161742e+15,
     2.89259207e+15,   3.41725793e+15,   4.00937676e+15,
     4.67187762e+15,   5.40667931e+15,   6.21440313e+15,
     7.09421973e+15,   8.04366842e+15,   9.05855930e+15,
     1.01328502e+16,   1.12585509e+16,   1.24257598e+16,
     1.36226443e+16,   1.48356404e+16,   1.60496345e+16,
     1.72482199e+16,   1.84140400e+16,   1.95291969e+16,
     2.05757166e+16,   2.15360187e+16,   2.23933053e+16,
     2.31320228e+16,   2.37385276e+16,   2.42009864e+16,
     2.45114362e+16,   2.46427484e+16,   2.45114362e+16,
     2.42009864e+16,   2.37385276e+16,   2.31320228e+16,
     2.23933053e+16,   2.15360187e+16,   2.05757166e+16,
     1.95291969e+16,   1.84140400e+16,   1.72482199e+16,
     1.60496345e+16,   1.48356404e+16,   1.36226443e+16,
     1.24257598e+16,   1.12585509e+16,   1.01328502e+16,
     9.05855930e+15,   8.04366842e+15,   7.09421973e+15,
     6.21440313e+15,   5.40667931e+15,   4.67187762e+15,
     4.00937676e+15,   3.41725793e+15,   2.89259207e+15,
     2.43161742e+15,   2.02996957e+15,   1.68291212e+15,
     1.38551838e+15,   1.13278296e+15,   9.19750269e+14,
     7.41632194e+14,   5.93892782e+14,   4.72340762e+14,
     3.73133045e+14,   2.92781632e+14,   2.28196006e+14,
     1.76677210e+14,   1.35888594e+14,   1.03837779e+14,
     7.88394616e+13,   5.94819771e+13,   4.45978682e+13,
     3.32321345e+13,   2.46111132e+13,   1.81150568e+13,
     1.32501876e+13,   9.62971320e+12,   6.95301373e+12,
     4.98705540e+12])

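I would show you what it looks like, but apparently I don't have enough reputation points...

Does anyone know why it isn't fitting properly?

Thanks for your help :)

1 Answer:

Answer 0 (score: 0)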

The importance of the initial guess, p0 in curve_fit's default argument list, cannot be stressed enough.

Note that the docstring says:

    [p0] If None, then the initial values will all be 1

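So if you do not supply it, curve_fit uses an initial guess of 1 for every parameter you are trying to optimize. The choice of p0 affects how quickly the underlying algorithm changes the guess vector p0 (see the documentation of least_squares).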

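When you look at your data, you will notice that the maximum of the Gaussian-like dataset y and its mean, mu_0, are 2.4e16 and 49, respectively. With a peak value that large, the algorithm would need to make drastic changes to its initial guess to reach it.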

When you give the curve-fitting algorithm a good initial guess, convergence is more likely.
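To make the effect of p0 concrete, here is a minimal sketch that fits the same model twice, once with the default starting point and once with rough guesses of the right order of magnitude. It uses a synthetic noise-free Gaussian (amplitude 2.4e16, centre 49, width 12) as a stand-in for the question's data, and `try_fit` is a hypothetical helper introduced only for this illustration. What the default run returns is not guaranteed, so the code only prints it.

```python
import warnings

import numpy as np
from scipy.optimize import curve_fit


def gaus(x, a, x0, sigma):
    """Gaussian model from the question."""
    return a * np.exp(-(x - x0) ** 2 / (2 * sigma ** 2))


# Synthetic stand-in for the question's data (assumed values, not the real dataset).
x = np.arange(100, dtype=float)
y = gaus(x, 2.4e16, 49.0, 12.0)


def try_fit(p0):
    """Hypothetical helper: run curve_fit, return popt or None if the fit fails."""
    with warnings.catch_warnings(), np.errstate(all="ignore"):
        warnings.simplefilter("ignore")  # hide OptimizeWarning and overflow noise
        try:
            popt, _ = curve_fit(gaus, x, y, p0=p0)
        except (RuntimeError, ValueError):
            return None
    return popt


default_fit = try_fit(None)                    # curve_fit starts from [1, 1, 1]
informed_fit = try_fit([2.0e16, 45.0, 10.0])   # rough guesses of the right order

print("default p0 : {}".format(default_fit))
print("informed p0: {}".format(informed_fit))
```

The informed guesses do not need to be accurate; in this sketch they are off by 15 to 20 percent and the fit still lands on the true parameters.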

With your data, you can supply good initial guesses for peak_value, mean and sigma by writing them like this:

import numpy as np
from scipy.optimize import curve_fit

# gaus(x, a, x0, sigma) is the model function defined in the question
y = np.array([...])  # starting from the original dataset
x = np.arange(len(y))
peak_value = y.max()
# Inspecting the data shows that the peak is close to the centre of the x-interval
mean = x[y.argmax()]
# At x = x0 - sigma the Gaussian model evaluates to a*exp(-.5), so the first
# index above that threshold lies about one sigma to the left of the peak
sigma = mean - np.where(y > peak_value * np.exp(-.5))[0][0]
popt, pcov = curve_fit(gaus, x, y, p0=[peak_value, mean, sigma])
print(popt)  # prints: [  2.44402560e+16   4.90000000e+01   1.20588976e+01]
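The width estimate in the snippet above relies on the fact that the model equals a*exp(-0.5), roughly 0.607 of the peak, at a distance of one sigma from the centre. The following sketch checks that rule on a synthetic Gaussian whose parameters (centre 49, width 12.5) are assumptions chosen for illustration, not the asker's data.

```python
import numpy as np

# Synthetic Gaussian with known parameters (assumed for illustration only).
true_x0, true_sigma = 49.0, 12.5
x = np.arange(100)
y = 2.4e16 * np.exp(-(x - true_x0) ** 2 / (2 * true_sigma ** 2))

peak_value = y.max()
mean_guess = x[y.argmax()]                # position of the largest sample

# Indices where the curve is above a*exp(-0.5); the first one sits roughly
# one sigma to the left of the peak.
above = np.where(y > peak_value * np.exp(-0.5))[0]
sigma_guess = mean_guess - above[0]

print("mean guess : {}".format(mean_guess))
print("sigma guess: {} (true value {})".format(sigma_guess, true_sigma))
```

Because x is sampled at integers, the guess is only accurate to about one sample spacing, which is more than enough for a starting value.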

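Note that in your code you compute the mean as sum(x*y)/n, which is strange: it modulates the Gaussian with a polynomial of degree 1 (it multiplies the Gaussian by a monotonically increasing line of constant slope) before taking the mean. This offsets the mean value of y (in this case to the right). A similar remark applies to your calculation of sigma.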
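If moment-based guesses are preferred over reading the peak position off the data, the usual form divides by the total weight sum(y) rather than by the number of points n. The sketch below compares both on a synthetic Gaussian (amplitude 2.4e16, centre 49, width 12, all assumed for illustration). This is an alternative to the argmax-based guess used in this answer, not something the answer itself proposes.

```python
import numpy as np

# Synthetic Gaussian standing in for the question's data (assumed values).
x = np.arange(100, dtype=float)
y = 2.4e16 * np.exp(-(x - 49.0) ** 2 / (2 * 12.0 ** 2))

n = len(y)

# Formula from the question: divides by the number of points, so the
# result carries the scale of y and is nowhere near the x-range.
mean_from_question = np.sum(x * y) / n

# Weighted mean and standard deviation: divide by the total weight instead.
mean_weighted = np.sum(x * y) / np.sum(y)
sigma_weighted = np.sqrt(np.sum(y * (x - mean_weighted) ** 2) / np.sum(y))

print("sum(x*y)/n      : {:.3e}".format(mean_from_question))
print("sum(x*y)/sum(y) : {:.3f}".format(mean_weighted))
print("weighted sigma  : {:.3f}".format(sigma_weighted))
```

With y of order 1e16, dividing by n yields a "mean" many orders of magnitude outside the range of x, so the starting Gaussian is essentially flat over the data.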

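A final remark: the histogram of y will not look like a Gaussian, because y already is a Gaussian. A histogram merely bins (counts) values into different categories, answering the question "how many data points in y have a value in [a, b]?".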
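A short sketch of that last point, using a synthetic Gaussian-shaped curve with peak 1 (assumed values, not the asker's data): np.histogram counts how many samples of y fall into each value range, and the counts are dominated by the many small values in the tails rather than forming a bell shape.

```python
import numpy as np

# Gaussian-shaped curve sampled at 100 points (synthetic, peak value 1).
x = np.arange(100, dtype=float)
y = np.exp(-(x - 49.0) ** 2 / (2 * 12.0 ** 2))

# Count how many samples of y fall into each of 10 equal-width value ranges.
counts, edges = np.histogram(y, bins=10)

for lo, hi, c in zip(edges[:-1], edges[1:], counts):
    print("{:.2f} to {:.2f}: {} points".format(lo, hi, c))
```

To see the bell shape, plot y against x directly instead of histogramming y.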