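(This task uses the Jupyter notebook system.)
Fitting the Higgs mass: given the fitter(xvalues, data, init) function below, write a function fitfunc(...) that describes the combined background and signal model to fit to the data. Create two plots:
(a) on the first plot, draw the data with cross markers ("+" symbol) and the best-fit curve as a red line, and
(b) on the second plot, draw the residuals with cross markers, where the residuals are defined as the difference between the best-fit model and the background-only model, see below.
The fit function consists of a background model with 3 parameters,
$$b(m) = A \exp\left(b_1 (m - 105.5) + b_2 (m - 105.5)^2\right)$$
The signal is added to the background and is modelled as
$$s(m) = \frac{R}{\sigma \sqrt{2\pi}} \exp\left(-\frac{(m - \mu)^2}{2\sigma^2}\right)$$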
The equations are not the problem; they are easy to put into code, as I did below:
# YOUR CODE HERE
import math
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def fitfunc(m, mu, sigma, R, A, b1, b2):
    tb1 = b1 * (m - 105.5)
    tb2 = b2 * ((m-105.5)**2)
    b = A * np.exp(tb1 + tb2)
    ts1 = R / (sigma * np.sqrt(2 * np.pi))
    ts2 = -(((m - mu)**2) / (2 * (sigma**2)))
    s = ts1 * np.exp(ts2)
    tot = b + s
    return tot
#
def fitter(xval, yval, initial):
    ''' function to fit the given data using a 'fitfunc' TBD.
    The curve_fit function is called. Only the best fit values
    are returned to be utilized in a main script.
    '''
    best, _ = curve_fit(fitfunc, xval, yval, p0=initial)
    return best
# Use functions with script below for plotting parts (a) and (b)
The fitter method was already provided, so I assume I should not change it.
Here is my code for plotting the results:
# start value parameter definitions, see equations for s(m) and b(m).
# init[0] = mu
# init[1] = sigma
# init[2] = R
# init[3] = A
# init[4] = b1
# init[5] = b2
init = (125.8, 1.4, 470.0, 5000.0, -0.04, -1.5e-4)
xvalues = np.arange(start=105.5, stop=160.5, step=1)
data = np.array([4780, 4440, 4205, 4150, 3920, 3890, 3590, 3460, 3300, 3200, 3000,
2950, 2830, 2700, 2620, 2610, 2510, 2280, 2330, 2345, 2300, 2190,
2080, 1990, 1840, 1830, 1730, 1680, 1620, 1600, 1540, 1505, 1450,
1410, 1380, 1380, 1250, 1230, 1220, 1110, 1110, 1080, 1055, 1050,
940, 920, 950, 880, 870, 850, 800, 820, 810, 770, 760])
# YOUR CODE HERE
def main():
    arr = np.ndarray(init)
    fitt = fitfunc(xvalues, init[0], init[1], init[2], init[3], init[4], init[5])
    def plota(xval, yval):
        fig = plt.figure()
        axis1 = fig.add_axes([0.12, 0.1, 0.85, 0.85])
        axis1.plot(xval, yval, marker="+", color="red")
        axis1.set_title("Combined", size=12)
        axis1.set_xlabel("Mass [GeV]", size=12)
        plt.show()
        return
    plota(xvalues, fitt)
    plota(xvalues, fitter(xvalues, fitt, arr))
main()
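In the second code snippet, my code starts after "# YOUR CODE HERE"; the rest was already provided.
Finally, the first call to plota() plots the curve evaluated at the data points, and the second call is my attempt to plot the "best-fit curve" as required by (a). The first call plots fine, but it is not what the question asks for. Running the code produces a TypeError: "'float' object cannot be interpreted as an integer". I also tried rounding the values to integers, but then I get this error instead: "fitfunc() missing 6 required positional arguments: 'mu', 'sigma', 'R', 'A', 'b1', and 'b2'". I think the second call is right, but I don't know what the third argument of the fitter method should be. The notes I was given say it should be an initial guess, but I don't know what that would be.
For part (b), I am not sure how to get the residuals. I think I could loop over the "best" array returned by the fitter method, compute the b(m) values separately and subtract them, but I am unsure about the wording of the question.
Thanks for your help.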
TypeError Traceback (most recent call last)
<ipython-input-2-30fd8d6062a3> in <module>
27 plota(xvalues, fitt)
28 plota(xvalues, fitter(xvalues, fitt, arr))
---> 29 main()
30
<ipython-input-2-30fd8d6062a3> in main()
26 return
27 plota(xvalues, fitt)
---> 28 plota(xvalues, fitter(xvalues, fitt, arr))
29 main()
30
<ipython-input-1-ac8e97799a28> in fitter(xval, yval, initial)
22 are returned to be utilized in a main script.
23 '''
---> 24 best, _ = curve_fit(fitfunc, xval, yval, p0=initial)
25 return best
26
C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\minpack.py in curve_fit(f, xdata, ydata, p0, sigma, absolute_sigma, check_finite, bounds, method, jac, **kwargs)
750 # Remove full_output from kwargs, otherwise we're passing it in twice.
751 return_full = kwargs.pop('full_output', False)
--> 752 res = leastsq(func, p0, Dfun=jac, full_output=1, **kwargs)
753 popt, pcov, infodict, errmsg, ier = res
754 cost = np.sum(infodict['fvec'] ** 2)
C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\minpack.py in leastsq(func, x0, args, Dfun, full_output, col_deriv, ftol, xtol, gtol, maxfev, epsfcn, factor, diag)
381 if not isinstance(args, tuple):
382 args = (args,)
--> 383 shape, dtype = _check_func('leastsq', 'func', func, x0, args, n)
384 m = shape[0]
385
C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\minpack.py in _check_func(checker, argname, thefunc, x0, args, numinputs, output_shape)
24 def _check_func(checker, argname, thefunc, x0, args, numinputs,
25 output_shape=None):
---> 26 res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
27 if (output_shape is not None) and (shape(res) != output_shape):
28 if (output_shape[0] != 1):
C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\minpack.py in func_wrapped(params)
456 if transform is None:
457 def func_wrapped(params):
--> 458 return func(xdata, *params) - ydata
459 elif transform.ndim == 1:
460 def func_wrapped(params):
TypeError: fitfunc() missing 6 required positional arguments: 'mu', 'sigma', 'R', 'A', 'b1', and 'b2'
Answer 0 (score: 0)
I modified your code to get it to run, so your init array is changed here.
"""."""
# YOUR CODE HERE
import math
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def fitfunc(m, mu, sigma, R, A, b1, b2):
    """."""
    tb1 = b1 * (m - 105.5)
    tb2 = b2 * ((m-105.5)**2)
    b = A * np.exp(tb1 + tb2)
    ts1 = R / (sigma * np.sqrt(2 * np.pi))
    ts2 = -(((m - mu)**2) / (2 * (sigma**2)))
    s = ts1 * np.exp(ts2)
    tot = b + s
    return tot
def fitter(xval, yval, initial):
    """
    Function to fit the given data using a 'fitfunc' TBD.
    The curve_fit function is called. Only the best fit values
    are returned to be utilized in a main script.
    """
    best, _ = curve_fit(fitfunc, xval, yval, p0=initial)
    return best
# Use functions with script below for plotting parts (a) and (b)
# start value parameter definitions, see equations for s(m) and b(m).
# init[0] = mu
# init[1] = sigma
# init[2] = R
# init[3] = A
# init[4] = b1
# init[5] = b2
init = (126, 2, 470, 5000, 1, 5)
xvalues = np.arange(start=105.5, stop=160.5, step=1)
data = np.array([4780, 4440, 4205, 4150, 3920, 3890, 3590, 3460, 3300, 3200, 3000,
2950, 2830, 2700, 2620, 2610, 2510, 2280, 2330, 2345, 2300, 2190,
2080, 1990, 1840, 1830, 1730, 1680, 1620, 1600, 1540, 1505, 1450,
1410, 1380, 1380, 1250, 1230, 1220, 1110, 1110, 1080, 1055, 1050,
940, 920, 950, 880, 870, 850, 800, 820, 810, 770, 760])
def main():
    """."""
    arr = np.ndarray(init)
    fitt = fitfunc(xvalues, init[0], init[1], init[2], init[3], init[4], init[5])
    def plota(xval, yval):
        fig = plt.figure()
        axis1 = fig.add_axes([0.12, 0.1, 0.85, 0.85])
        axis1.plot(xval, yval, marker="+", color="red")
        axis1.set_title("Combined", size=12)
        axis1.set_xlabel("Mass [GeV]", size=12)
        plt.show()
        return
    plota(xvalues, fitt)
    plota(xvalues, fitter(xvalues, fitt, arr))
main()
Note that the indentation of main was off by one tab/space group.
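As a side note on the two error messages in the question, here is a minimal sketch of what np.ndarray(init) does, as opposed to np.array(init). The rounded tuple below is an assumption about what the asker tried; the exact values are not given in the question. np.ndarray is the low-level constructor and reads its first argument as a shape, so float entries fail, while a rounded tuple containing zeros yields an empty array with no parameters to hand to fitfunc.

```python
import numpy as np

init = (125.8, 1.4, 470.0, 5000.0, -0.04, -1.5e-4)

# np.ndarray(shape) interprets the tuple as array dimensions, so floats are rejected.
try:
    np.ndarray(init)
    shape_error = None
except TypeError as err:
    shape_error = str(err)
print("np.ndarray(init) ->", shape_error)

# Assumed rounding of the start values: b1 and b2 both round to 0.
rounded = tuple(int(round(v)) for v in init)  # (126, 1, 470, 5000, 0, 0)
empty = np.ndarray(rounded)                   # a 6-dimensional array with zero elements
print(empty.shape, empty.size)

# np.array builds a 1-D array holding the six start values themselves.
p0 = np.array(init, dtype=float)
print(p0.shape)
```

Since curve_fit accepts any array-like p0, passing the init tuple to fitter directly should also work without any conversion.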
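Answer 1 (score: 0)
I think you're close, but two things:
- Values of b1 and b2 > 0 can lead to infinities in the exponential.
- The return value of curve_fit is the set of best-fit parameter values, not the best-fit curve. You have to compute that yourself.
You probably also want to fit to the data array, right? I think this may be what you are looking for: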
import math
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def fitfunc(m, mu, sigma, R, A, b1, b2):
    """comment about Higgs mass here"""
    tb1 = b1 * (m - 105.5)
    tb2 = b2 * ((m-105.5)**2)
    b = A * np.exp(tb1 + tb2)
    ts1 = R / (sigma * np.sqrt(2 * np.pi))
    ts2 = -(((m - mu)**2) / (2 * (sigma**2)))
    s = ts1 * np.exp(ts2)
    tot = b + s
    return tot
xvalues = np.arange(start=105.5, stop=160.5, step=1)
data = np.array([4780, 4440, 4205, 4150, 3920, 3890, 3590, 3460, 3300, 3200, 3000,
2950, 2830, 2700, 2620, 2610, 2510, 2280, 2330, 2345, 2300, 2190,
2080, 1990, 1840, 1830, 1730, 1680, 1620, 1600, 1540, 1505, 1450,
1410, 1380, 1380, 1250, 1230, 1220, 1110, 1110, 1080, 1055, 1050,
940, 920, 950, 880, 870, 850, 800, 820, 810, 770, 760])
# start value parameter definitions, see equations for s(m) and b(m).
# init[0] = mu
# init[1] = sigma
# init[2] = R
# init[3] = A
# init[4] = b1
# init[5] = b2
init = np.array([125.8, 2, 470, 5000., -0.05, -0.001])
init_fit = fitfunc(xvalues, *init)
best, _ = curve_fit(fitfunc, xvalues, data, p0=init)
print(best)
best_fit = fitfunc(xvalues, *best)
plt.plot(xvalues, data, color='red', marker='+', label='data')
plt.plot(xvalues, init_fit, color='black', label='init')
plt.plot(xvalues, best_fit, color='blue', label='fit')
plt.gca().set_title("Combined", size=12)
plt.gca().set_xlabel("Mass [GeV]", size=12)
plt.legend()
plt.show()
If you are allowed to, I would also suggest using lmfit (http://lmfit.github.io/lmfit-py/) (disclosure: I am one of the authors). With that library, the curve_fit code above would become
from lmfit import Model  # Model is used unqualified below, so import it by name
h_model = Model(fitfunc)
params = h_model.make_params(mu=125.8, sigma=2, R=470,
                             A=5000, b1=-0.05, b2=-0.001)
result = h_model.fit(data, params, m=xvalues)
print(result.fit_report())
plt.plot(xvalues, data, color='red', marker='+', label='data')
plt.plot(xvalues, result.init_fit, color='black', label='init')
plt.plot(xvalues, result.best_fit, color='blue', label='fit')
plt.gca().set_title("Combined", size=12)
plt.gca().set_xlabel("Mass [GeV]", size=12)
plt.legend()
plt.show()
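Note that with lmfit, the parameters are named after the arguments of the model function. In lmfit, every parameter can have bounds, so you can do something like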
params['b1'].max = 0.0
to make sure that b1 stays negative. You can also fix any parameter value. There are many other features.
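If lmfit is not an option, a similar constraint can be sketched with the bounds argument of curve_fit. This is only an illustration under assumptions: the start values are the ones used in the curve_fit example of this answer, and only b1 is given an upper limit. Passing bounds makes curve_fit switch from the default Levenberg-Marquardt method to a trust-region method, so the fitted numbers may differ slightly from an unbounded fit.

```python
import numpy as np
from scipy.optimize import curve_fit

def fitfunc(m, mu, sigma, R, A, b1, b2):
    """Background b(m) plus Gaussian signal s(m), as defined in the question."""
    b = A * np.exp(b1 * (m - 105.5) + b2 * (m - 105.5) ** 2)
    s = R / (sigma * np.sqrt(2 * np.pi)) * np.exp(-((m - mu) ** 2) / (2 * sigma ** 2))
    return b + s

xvalues = np.arange(start=105.5, stop=160.5, step=1)
data = np.array([4780, 4440, 4205, 4150, 3920, 3890, 3590, 3460, 3300, 3200, 3000,
                 2950, 2830, 2700, 2620, 2610, 2510, 2280, 2330, 2345, 2300, 2190,
                 2080, 1990, 1840, 1830, 1730, 1680, 1620, 1600, 1540, 1505, 1450,
                 1410, 1380, 1380, 1250, 1230, 1220, 1110, 1110, 1080, 1055, 1050,
                 940, 920, 950, 880, 870, 850, 800, 820, 810, 770, 760])

# Parameter order: mu, sigma, R, A, b1, b2
init = [125.8, 2, 470, 5000.0, -0.05, -0.001]
lower = [-np.inf] * 6                                  # no lower limits
upper = [np.inf, np.inf, np.inf, np.inf, 0.0, np.inf]  # b1 <= 0, everything else free

best, _ = curve_fit(fitfunc, xvalues, data, p0=init, bounds=(lower, upper))
print(best)
```

The start values have to lie inside the bounds, which is why b1 starts at a negative value here.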
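The printed report for this fit (from result.fit_report()) includes estimates of uncertainties and correlations as well as fit statistics: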
[[Model]]
    Model(fitfunc)
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 100
    # data points      = 55
    # variables        = 6
    chi-square         = 106329.424
    reduced chi-square = 2169.98824
    Akaike info crit   = 428.183028
    Bayesian info crit = 440.227027
[[Variables]]
    mu:     125.940465 +/- 0.34609625 (0.27%) (init = 125.8)
    sigma:  1.52638256 +/- 0.37354633 (24.47%) (init = 2)
    R:      677.016219 +/- 163.585050 (24.16%) (init = 470)
    A:      4660.71073 +/- 24.3437093 (0.52%) (init = 5000)
    b1:    -0.04279037 +/- 7.7658e-04 (1.81%) (init = -0.05)
    b2:     1.7476e-04 +/- 1.7587e-05 (10.06%) (init = -0.001)
[[Correlations]] (unreported correlations are < 0.100)
    C(b1, b2)    = -0.952
    C(A, b1)     = -0.775
    C(sigma, R)  =  0.655
    C(A, b2)     =  0.650
    C(R, b1)     = -0.492
    C(R, b2)     =  0.445
    C(sigma, b1) = -0.317
    C(sigma, b2) =  0.287
    C(R, A)      =  0.230
    C(sigma, A)  =  0.146
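For part (b) of the question, here is a minimal sketch of the residual plot using plain curve_fit. It takes the definition in the task literally (best-fit model minus background-only model, both evaluated at the fitted parameters), which is one reading of the wording. The helper name background is hypothetical and not part of the provided code. If the intended residual is instead the data minus the fitted background, the data_excess array below is that alternative.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

def background(m, A, b1, b2):
    """Background-only model b(m) (hypothetical helper, not in the provided code)."""
    return A * np.exp(b1 * (m - 105.5) + b2 * (m - 105.5) ** 2)

def fitfunc(m, mu, sigma, R, A, b1, b2):
    """Combined model: background b(m) plus Gaussian signal s(m)."""
    s = R / (sigma * np.sqrt(2 * np.pi)) * np.exp(-((m - mu) ** 2) / (2 * sigma ** 2))
    return background(m, A, b1, b2) + s

xvalues = np.arange(start=105.5, stop=160.5, step=1)
data = np.array([4780, 4440, 4205, 4150, 3920, 3890, 3590, 3460, 3300, 3200, 3000,
                 2950, 2830, 2700, 2620, 2610, 2510, 2280, 2330, 2345, 2300, 2190,
                 2080, 1990, 1840, 1830, 1730, 1680, 1620, 1600, 1540, 1505, 1450,
                 1410, 1380, 1380, 1250, 1230, 1220, 1110, 1110, 1080, 1055, 1050,
                 940, 920, 950, 880, 870, 850, 800, 820, 810, 770, 760])

# Start values in the order mu, sigma, R, A, b1, b2 (same as the curve_fit example above)
init = (125.8, 2, 470, 5000.0, -0.05, -0.001)
best, _ = curve_fit(fitfunc, xvalues, data, p0=init)
mu, sigma, R, A, b1, b2 = best

best_fit = fitfunc(xvalues, *best)         # full model at the fitted parameters
bkg_only = background(xvalues, A, b1, b2)  # background part with the same parameters
residual = best_fit - bkg_only             # residual as defined in the task
data_excess = data - bkg_only              # alternative reading: data minus background

fig, ax = plt.subplots()
ax.plot(xvalues, residual, linestyle="none", marker="+", color="red")
ax.set_title("Residuals", size=12)
ax.set_xlabel("Mass [GeV]", size=12)
# In a notebook the figure renders at the end of the cell; call plt.show() in a script.
```

With this definition the residual is just the fitted signal term s(m), so it should peak near the fitted mu.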