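As my title suggests, I'm trying to fit a Gaussian to some data and I'm just getting a straight line. I've been following these other discussions, Gaussian fit for Python and Fitting a gaussian to a curve in Python, which seem to suggest basically the same thing. I can make the code from those discussions work on the data they provide, but it won't work on my data.
My code looks like this: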
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

# y is the data array listed further down
y = y - y[0] # to make it go to zero on both sides
x = np.arange(len(y))
max_y = max(y)
n = len(y)
mean = sum(x*y)/n
sigma = np.sqrt(sum(y*(x-mean)**2)/n)
# Someone on a previous post seemed to think this needed to have the sqrt.
# Tried it without as well, made no difference.

def gaus(x,a,x0,sigma):
    return a*np.exp(-(x-x0)**2/(2*sigma**2))
popt,pcov = curve_fit(gaus,x,y,p0=[max_y,mean,sigma])
# It was suggested in one of the other posts I looked at to make the
# first element of p0 be the maximum value of y.
# I also tried it as 1, but that did not work either
plt.plot(x,y,'b:',label='data')
plt.plot(x,gaus(x,*popt),'r:',label='fit')
plt.legend()
plt.title('Fig. 3 - Fit for Time Constant')
plt.xlabel('Time (s)')
plt.ylabel('Voltage (V)')
plt.show()
The data I'm trying to fit is as follows:
y = np.array([ 6.95301373e+12, 9.62971320e+12, 1.32501876e+13,
1.81150568e+13, 2.46111132e+13, 3.32321345e+13,
4.45978682e+13, 5.94819771e+13, 7.88394616e+13,
1.03837779e+14, 1.35888594e+14, 1.76677210e+14,
2.28196006e+14, 2.92781632e+14, 3.73133045e+14,
4.72340762e+14, 5.93892782e+14, 7.41632194e+14,
9.19750269e+14, 1.13278296e+15, 1.38551838e+15,
1.68291212e+15, 2.02996957e+15, 2.43161742e+15,
2.89259207e+15, 3.41725793e+15, 4.00937676e+15,
4.67187762e+15, 5.40667931e+15, 6.21440313e+15,
7.09421973e+15, 8.04366842e+15, 9.05855930e+15,
1.01328502e+16, 1.12585509e+16, 1.24257598e+16,
1.36226443e+16, 1.48356404e+16, 1.60496345e+16,
1.72482199e+16, 1.84140400e+16, 1.95291969e+16,
2.05757166e+16, 2.15360187e+16, 2.23933053e+16,
2.31320228e+16, 2.37385276e+16, 2.42009864e+16,
2.45114362e+16, 2.46427484e+16, 2.45114362e+16,
2.42009864e+16, 2.37385276e+16, 2.31320228e+16,
2.23933053e+16, 2.15360187e+16, 2.05757166e+16,
1.95291969e+16, 1.84140400e+16, 1.72482199e+16,
1.60496345e+16, 1.48356404e+16, 1.36226443e+16,
1.24257598e+16, 1.12585509e+16, 1.01328502e+16,
9.05855930e+15, 8.04366842e+15, 7.09421973e+15,
6.21440313e+15, 5.40667931e+15, 4.67187762e+15,
4.00937676e+15, 3.41725793e+15, 2.89259207e+15,
2.43161742e+15, 2.02996957e+15, 1.68291212e+15,
1.38551838e+15, 1.13278296e+15, 9.19750269e+14,
7.41632194e+14, 5.93892782e+14, 4.72340762e+14,
3.73133045e+14, 2.92781632e+14, 2.28196006e+14,
1.76677210e+14, 1.35888594e+14, 1.03837779e+14,
7.88394616e+13, 5.94819771e+13, 4.45978682e+13,
3.32321345e+13, 2.46111132e+13, 1.81150568e+13,
1.32501876e+13, 9.62971320e+12, 6.95301373e+12,
4.98705540e+12])
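I'd show you what it looks like, but apparently I don't have enough reputation points...
Does anyone know why the fit isn't working?
Thanks for your help :)
Answer 0 (score: 0)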
The importance of the initial guess, p0, in curve_fit's argument list cannot be stressed enough.
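Note that the docstring says:
[p0] If None, then the initial values will all be 1
So if you don't supply it, curve_fit uses an initial guess of 1 for every parameter you are trying to optimize.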
The choice of p0 affects how quickly the underlying algorithm moves the guess vector away from p0 (see the documentation of least_squares).
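When you look at your data, you'll notice that the maximum and the mean, mu_0, of the Gaussian-like dataset y are about 2.4e16 and 49 respectively. With a peak value that large, the algorithm would have to make drastic changes to its initial guess to reach it.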
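To make the scale mismatch concrete, here is a minimal sketch. It assumes a synthetic, noise-free Gaussian (amplitude 2.4e16, center 49, width 12.06, chosen to resemble the posted data rather than taken from it) in place of y, and compares it with the model evaluated at the default starting point of all ones.

```python
import numpy as np

def gaus(x, a, x0, sigma):
    return a * np.exp(-(x - x0)**2 / (2 * sigma**2))

# Synthetic stand-in for the posted data (assumed values, not the real array)
x = np.arange(100)
y = gaus(x, 2.4e16, 49.0, 12.06)

# What curve_fit starts from when p0 is None: one 1.0 per parameter
default_p0 = np.ones(3)
model_at_default = gaus(x, *default_p0)

print("data peak:", y.max(), "at x =", x[y.argmax()])
print("model peak at default p0:", model_at_default.max())
print("ratio:", y.max() / model_at_default.max())
```

The ratio shows how many orders of magnitude the amplitude alone would have to travel from the default guess.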
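When you give the curve-fitting algorithm a good initial guess, convergence is more likely.
With your data, you can supply good initial guesses for peak_value, mean and sigma by writing them like this: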
y = np.array([...]) # starting from the original dataset
x = np.arange(len(y))
peak_value = y.max()
mean = x[y.argmax()] # observation of the data shows that the peak is close to the center of the interval of the x-data
sigma = mean - np.where(y > peak_value * np.exp(-.5))[0][0] # when x is sigma in the gaussian model, the function evaluates to a*exp(-.5)
popt,pcov = curve_fit(gaus, x, y, p0=[peak_value, mean, sigma])
print(popt) # prints: [ 2.44402560e+16 4.90000000e+01 1.20588976e+01]
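For readers who want to run this end to end without pasting the full array, here is a self-contained sketch of the same recipe. It assumes a synthetic Gaussian with hypothetical parameters (2.4e16, 49, 12.06) as a stand-in for the posted data, so the fitted numbers will not match the output above exactly.

```python
import numpy as np
from scipy.optimize import curve_fit

def gaus(x, a, x0, sigma):
    return a * np.exp(-(x - x0)**2 / (2 * sigma**2))

# Synthetic stand-in for the posted data (assumed parameters)
true_params = (2.4e16, 49.0, 12.06)
x = np.arange(100)
y = gaus(x, *true_params)

# Initial guesses read directly off the data
peak_value = y.max()
mean = x[y.argmax()]
# First index where the curve rises above a*exp(-1/2),
# which lies roughly one sigma to the left of the peak
sigma = mean - np.where(y > peak_value * np.exp(-0.5))[0][0]

popt, pcov = curve_fit(gaus, x, y, p0=[peak_value, mean, sigma])
print(popt)
```

All three starting values come from the data itself, so the optimizer only has to refine them rather than search across many orders of magnitude.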
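Note that in your code you compute the mean as sum(x*y)/n, which is strange, because it modulates the Gaussian by a polynomial of degree 1 (it multiplies the Gaussian by a monotonically increasing line of constant slope) before taking the mean. That offsets the mean of y (in this case to the right). A similar remark applies to your calculation of sigma.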
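As a hedged illustration of this point, the sketch below compares the question's estimates (divided by n) with weighted moments that divide by sum(y) instead. The weighted form is my own suggestion, not something stated in the answer, and the data is again a synthetic stand-in with assumed parameters. It also evaluates the model at the divided-by-n starting values, which come out so large that the model is numerically constant over x. That is a plausible reason for the flat line in the question.

```python
import numpy as np

def gaus(x, a, x0, sigma):
    return a * np.exp(-(x - x0)**2 / (2 * sigma**2))

x = np.arange(100)
y = gaus(x, 2.4e16, 49.0, 12.06)   # synthetic stand-in, assumed parameters
n = len(y)

# Estimates as written in the question: divided by the number of points
mean_by_n = np.sum(x * y) / n
sigma_by_n = np.sqrt(np.sum(y * (x - mean_by_n)**2) / n)

# Weighted moments: divided by the total weight sum(y)
mean_w = np.sum(x * y) / np.sum(y)
sigma_w = np.sqrt(np.sum(y * (x - mean_w)**2) / np.sum(y))

print("divided by n:      mean = %.3g, sigma = %.3g" % (mean_by_n, sigma_by_n))
print("divided by sum(y): mean = %.3g, sigma = %.3g" % (mean_w, sigma_w))

# With the divided-by-n values as p0, the model barely varies across x
flat = gaus(x, y.max(), mean_by_n, sigma_by_n)
print("relative spread of model at that p0:", np.ptp(flat) / flat.max())
```

Because y is of order 1e16, dividing by n leaves the "mean" many orders of magnitude outside the x range, while dividing by sum(y) lands close to the peak position.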
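A final remark: the histogram of y will not look like a Gaussian, because y already is a Gaussian. A histogram merely bins (counts) values into different categories, answering the question "how many data points in y take a value in the interval [a, b]?".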
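A small sketch of that last remark, again on a synthetic stand-in with assumed parameters: np.histogram counts how many of the y values fall into each value range, which is a different thing from plotting y against x.

```python
import numpy as np

def gaus(x, a, x0, sigma):
    return a * np.exp(-(x - x0)**2 / (2 * sigma**2))

x = np.arange(100)
y = gaus(x, 2.4e16, 49.0, 12.06)   # synthetic stand-in, assumed parameters

# Histogram of the y values: how many samples fall in each value range
counts, edges = np.histogram(y, bins=10)
for lo, hi, c in zip(edges[:-1], edges[1:], counts):
    print("[%.2e, %.2e): %d" % (lo, hi, c))
```

For this synthetic curve the lowest bin is the most populated, since most samples of a bell curve sit in its tails, so the histogram does not look bell-shaped.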