Temp [K]   k(T)
298 6.66E-63
300 1.48E-62
350 3.58E-55
400 1.25E-49
450 2.57E-45
500 7.30E-42
550 4.90E-39
600 1.12E-36
650 1.11E-34
700 5.72E-33
750 1.75E-31
800 3.49E-30
850 4.92E-29
900 5.17E-28
950 4.24E-25
1000 2.83E-26
The above is the given kinetic data. I am trying to fit these data and reproduce the corresponding plot.
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import pandas as pd
plt.style.use('ggplot')
# Read the data
df = pd.read_excel('py_curvefit.xlsx')
T = df.Temp  # xdata

def reacKine(T, A, n, Ea):
    return A*((T/298)**n)*np.exp(-Ea/(0.008314*T))

kt = df['k(T)']  # ydata
# rectifying an erroneous value in the input
kt[14] = 4.24e-27
popt,pcov=curve_fit(reacKine,T,kt)
A,n,Ea=popt
plt.plot(T,np.log(kt),'g-',label='given data')
plt.plot(T,np.log(reacKine(T,*popt)),'ro',label='fit')
plt.xlabel('Temperature [K]')
plt.ylabel('log of reaction coefficient')
plt.legend(loc='best')
plt.show()
It reports that optimal parameters for the function could not be found. How can I fix this? I would like to see a proper fit. Is it because of the exponential term?
Answer 0 (score: 3)
This is a sensitive problem (as is usually the case when exponentials are involved). For a problem like this, it is important to start with very good initial guesses for the parameters.
If you experiment with the parameters, you will find that A must be very small. The default initial guess that curve_fit uses for every parameter is 1, and 1 is far too big for A. If I use 1e-10 as the initial guess for A,
popt, pcov = curve_fit(reacKine, T, kt, p0=(1e-10, 1, 1))
I get the following error from curve_fit:
RuntimeError: Optimal parameters not found: Number of calls to function has reached maxfev = 800.
So let's increase maxfev to 2000:
popt, pcov = curve_fit(reacKine, T, kt, p0=(1e-10, 1, 1), maxfev=2000)
I get the same error. When I increase it to 100000, the function succeeds.
Here is a script with the updated call to curve_fit, followed by the plot that the script generates.
import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt
T = np.array([298, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800,
              850, 900, 950, 1000])
kt = np.array([6.66e-63, 1.48e-62, 3.58e-55, 1.25e-49, 2.57e-45, 7.30e-42,
               4.90e-39, 1.12e-36, 1.11e-34, 5.72e-33, 1.75e-31, 3.49e-30,
               4.92e-29, 5.17e-28, 4.24e-27, 2.83e-26])

def reacKine(T, A, n, Ea):
    return A*((T/298)**n)*np.exp(-Ea/(0.008314*T))
popt, pcov = curve_fit(reacKine, T, kt, p0=(1e-10, 1, 1), maxfev=100000)
plt.plot(T, kt, '.', label='data')
tt = np.linspace(T[0], T[-1], 160)
kk = reacKine(tt, *popt)
semilogy = True
if semilogy:
    plt.semilogy(tt, kk, 'k-', alpha=0.3, label='fit')
    results_xy = (700, 1e-45)
else:
    plt.plot(tt, kk, 'k-', alpha=0.3, label='fit')
    results_xy = (300, 1.5e-26)

plt.annotate(xy=results_xy,
             s=('Fit Results:\n $A\,$ = %.4g\n $n\,$ = %.4g\n $E_{a}$ = %.4g' %
                tuple(popt)))
plt.xlabel('T')
plt.ylabel('k(T)')
plt.legend(framealpha=1, shadow=True)
plt.show()
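As an aside, not part of the original answer: for a model like this, good starting values can often be derived directly from the data by linearizing it. Taking logs gives ln k = ln A + n·ln(T/298) − Ea/(0.008314·T), which is linear in ln A, n, and Ea and can be solved with ordinary linear least squares. A minimal sketch of that idea, using the same T and kt arrays as above:

import numpy as np
from scipy.optimize import curve_fit

T = np.array([298, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800,
              850, 900, 950, 1000], dtype=float)
kt = np.array([6.66e-63, 1.48e-62, 3.58e-55, 1.25e-49, 2.57e-45, 7.30e-42,
               4.90e-39, 1.12e-36, 1.11e-34, 5.72e-33, 1.75e-31, 3.49e-30,
               4.92e-29, 5.17e-28, 4.24e-27, 2.83e-26])

def reacKine(T, A, n, Ea):
    return A*((T/298)**n)*np.exp(-Ea/(0.008314*T))

# ln(k) = ln(A) + n*ln(T/298) - Ea/(0.008314*T) is linear in (ln(A), n, Ea),
# so a plain linear least-squares solve gives rough starting values.
X = np.column_stack([np.ones_like(T), np.log(T/298.0), -1.0/(0.008314*T)])
lnA0, n0, Ea0 = np.linalg.lstsq(X, np.log(kt), rcond=None)[0]

# Refine with the nonlinear fit, starting from the linearized estimates.
# maxfev is kept generous because the parameters are strongly correlated.
popt, pcov = curve_fit(reacKine, T, kt, p0=(np.exp(lnA0), n0, Ea0), maxfev=10000)
print(popt)

Starting values of that quality usually reduce or eliminate the need for a very large maxfev, since the optimizer no longer has to escape from a poor initial guess.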
P.S. @MNewville will probably suggest a better approach using lmfit.
Answer 1 (score: 2)
Using the code below with the pyeq3 fitting library, I get the following parameters and fit statistics:
Fitting target of sum of squared absolute error = 7.93711173898e-62
Fitted Parameters:
A = 3.6814349968228987E-12
Ea = 2.8663497636217801E+02
n = 1.6329619761384757E+00
Degrees of freedom error 13
Degrees of freedom regression 2
Root Mean Squared Error (RMSE): 7.04322002841e-32
R-squared: 0.9999999999
R-squared adjusted: 0.999999999884
Model F-statistic: 64790385432.5
Model F-statistic p-value: 1.11022302463e-16
Model log-likelihood: 1124.98750379
Model AIC: -140.248437973
Model BIC: -140.103577588
Individual Parameter Statistics:
Coefficient A = 3.6814349968228987E-12
std error: 1.67464E-25
t-stat: 8.99615E+00
p-stat: 6.05074E-07
95 percent confidence intervals: [2.79736E-12, 4.56551E-12]
Coefficient Ea = 2.8663497636217801E+02
std error: 1.69556E-01
t-stat: 6.96102E+02
p-stat: 0.00000E+00
95 percent confidence intervals: [2.85745E+02, 2.87525E+02]
Coefficient n = 1.6329619761384757E+00
std error: 2.59159E-03
t-stat: 3.20770E+01
p-stat: 9.19265E-14
95 percent confidence intervals: [1.52298E+00, 1.74294E+00]
Coefficient Covariance Matrix:
[ 2.74285036e+37 2.75990923e+49 -3.41210380e+48]
[ 2.75990923e+49 2.77711499e+61 -3.43328442e+60]
[-3.41210380e+48 -3.43328442e+60 4.24469499e+59]
import os, sys, inspect
import pyeq3
functionString = 'A*((X/298)**n)*exp(-Ea/(0.008314*X))'
data = '''
298 6.66e-63
300 1.48e-62
350 3.58e-55
400 1.25e-49
450 2.57e-45
500 7.30e-42
550 4.90e-39
600 1.12e-36
650 1.11e-34
700 5.72e-33
750 1.75e-31
800 3.49e-30
850 4.92e-29
900 5.17e-28
950 4.24e-27
1000 2.83e-26
'''
# note that the constructor is passed the function string here
equation = pyeq3.Models_2D.UserDefinedFunction.UserDefinedFunction(inUserFunctionString = functionString)
pyeq3.dataConvertorService().ConvertAndSortColumnarASCII(data, equation, False)
equation.Solve()
##########################################################
print("Equation:", equation.GetDisplayName(), str(equation.GetDimensionality()) + "D")
print("Fitting target of", equation.fittingTargetDictionary[equation.fittingTarget], '=', equation.CalculateAllDataFittingTarget(equation.solvedCoefficients))
print("Fitted Parameters:")
for i in range(len(equation.solvedCoefficients)):
    print("    %s = %-.16E" % (equation.GetCoefficientDesignators()[i], equation.solvedCoefficients[i]))
equation.CalculateModelErrors(equation.solvedCoefficients, equation.dataCache.allDataCacheDictionary)
print()
##########################################################
equation.CalculateCoefficientAndFitStatistics()
if equation.upperCoefficientBounds or equation.lowerCoefficientBounds:
    print('You entered coefficient bounds. Parameter statistics may')
    print('not be valid for parameter values at or near the bounds.')
    print()
print('Degrees of freedom error', equation.df_e)
print('Degrees of freedom regression', equation.df_r)
if equation.rmse == None:
    print('Root Mean Squared Error (RMSE): n/a')
else:
    print('Root Mean Squared Error (RMSE):', equation.rmse)
if equation.r2 == None:
    print('R-squared: n/a')
else:
    print('R-squared:', equation.r2)
if equation.r2adj == None:
    print('R-squared adjusted: n/a')
else:
    print('R-squared adjusted:', equation.r2adj)
if equation.Fstat == None:
    print('Model F-statistic: n/a')
else:
    print('Model F-statistic:', equation.Fstat)
if equation.Fpv == None:
    print('Model F-statistic p-value: n/a')
else:
    print('Model F-statistic p-value:', equation.Fpv)
if equation.ll == None:
    print('Model log-likelihood: n/a')
else:
    print('Model log-likelihood:', equation.ll)
if equation.aic == None:
    print('Model AIC: n/a')
else:
    print('Model AIC:', equation.aic)
if equation.bic == None:
    print('Model BIC: n/a')
else:
    print('Model BIC:', equation.bic)
print()
print("Individual Parameter Statistics:")
for i in range(len(equation.solvedCoefficients)):
    if type(equation.tstat_beta) == type(None):
        tstat = 'n/a'
    else:
        tstat = '%-.5E' % (equation.tstat_beta[i])
    if type(equation.pstat_beta) == type(None):
        pstat = 'n/a'
    else:
        pstat = '%-.5E' % (equation.pstat_beta[i])
    if type(equation.sd_beta) != type(None):
        print("Coefficient %s = %-.16E, std error: %-.5E" % (equation.GetCoefficientDesignators()[i], equation.solvedCoefficients[i], equation.sd_beta[i]))
    else:
        print("Coefficient %s = %-.16E, std error: n/a" % (equation.GetCoefficientDesignators()[i], equation.solvedCoefficients[i]))
    print("     t-stat: %s, p-stat: %s, 95 percent confidence intervals: [%-.5E, %-.5E]" % (tstat, pstat, equation.ci[i][0], equation.ci[i][1]))
print()
print("Coefficient Covariance Matrix:")
for i in equation.cov_beta:
    print(i)
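As a quick sanity check (this snippet is an addition, not part of the original answer), the coefficients reported above can be plugged back into the model with plain NumPy and compared against the measured k(T) column:

import numpy as np

# Fitted coefficients from the pyeq3 output above
A, Ea, n = 3.6814349968228987e-12, 2.8663497636217801e+02, 1.6329619761384757e+00

T = np.array([298, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800,
              850, 900, 950, 1000], dtype=float)
k_fit = A*((T/298)**n)*np.exp(-Ea/(0.008314*T))
print(k_fit)  # compare against the measured k(T) values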
Answer 2 (score: 1)
(Obligatory?) lmfit answer:
You might find lmfit useful. For the way this question is framed it does not add much, but it provides a nicer abstraction for curve fitting and for the fit parameters. Similar to @WarrenWeckesser's answer, it would look like this:
import numpy as np
import matplotlib.pyplot as plt
from lmfit import Model
T = np.array([298, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800,
              850, 900, 950, 1000])
kt = np.array([6.66e-63, 1.48e-62, 3.58e-55, 1.25e-49, 2.57e-45, 7.30e-42,
               4.90e-39, 1.12e-36, 1.11e-34, 5.72e-33, 1.75e-31, 3.49e-30,
               4.92e-29, 5.17e-28, 4.24e-27, 2.83e-26])

def reacKine(T, A, n, Ea):
    return A*((T/298)**n)*np.exp(-Ea/(0.008314*T))
react_model = Model(reacKine)
params = react_model.make_params(A=2.e-11, n=1, Ea=200)
result = react_model.fit(kt, params, T=T)
print(result.fit_report())
plt.plot(T, kt, 'bo', label='data')
plt.plot(T, result.best_fit, 'r--', label='fit')
plt.xlabel('T (K)')
plt.ylabel('k(T)')
plt.legend()
plt.gca().set_yscale('log')
plt.show()
With Python 2.7 and scipy 1.0.0, this will show a fit similar to Warren's (I left out the annotation) and print out a fit report of:
[[Model]]
Model(reacKine)
[[Fit Statistics]]
# function evals = 1294
# data points = 16
# variables = 3
chi-square = 0.000
reduced chi-square = 0.000
Akaike info crit = -2219.907
Bayesian info crit = -2217.590
[[Variables]]
A: 1.3365e-10 +/- 5.06e-12 (3.79%) (init= 2e-11)
n: -0.02392420 +/- 0.034279 (143.28%) (init= 1)
Ea: 299.843529 +/- 0.024996 (0.01%) (init= 200)
[[Correlations]] (unreported correlations are < 0.100)
C(A, n) = -0.997
C(A, Ea) = 0.117
When fit with Python 3.6 and scipy 1.0.0, the report is:
[[Model]]
Model(reacKine)
[[Fit Statistics]]
# function evals = 1618
# data points = 16
# variables = 3
chi-square = 0.000
reduced chi-square = 0.000
Akaike info crit = -2289.381
Bayesian info crit = -2287.063
[[Variables]]
A: 3.6814e-12 +/- 4.09e-13 (11.12%) (init= 2e-11)
n: 1.63296239 +/- 0.050923 (3.12%) (init= 1)
Ea: 286.634973 +/- 0.411890 (0.14%) (init= 200)
[[Correlations]] (unreported correlations are < 0.100)
C(A, n) = -1.000
C(A, Ea) = 1.000
C(n, Ea) = -1.000
These values are consistent with those shown by Warren and James.
I don't have a good explanation for why the results differ between the Python versions, and especially why the correlations are > 0.999 for all variables in the Python 3.6 run. But since the parameters are so nearly perfectly correlated, and there are a lot of fit evaluations relative to the number of data points, I would not be surprised if there are false minima and a complicated correlation space.
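One thing that may help with data spanning some 37 orders of magnitude (an addition to this answer, not from the original) is to weight the residuals so every point contributes comparably. With lmfit this can be done by passing weights to Model.fit, for example weights=1/kt, which turns the residual into a relative error and is roughly equivalent to fitting ln k. A sketch, reusing the same model and starting values:

import numpy as np
from lmfit import Model

T = np.array([298, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800,
              850, 900, 950, 1000], dtype=float)
kt = np.array([6.66e-63, 1.48e-62, 3.58e-55, 1.25e-49, 2.57e-45, 7.30e-42,
               4.90e-39, 1.12e-36, 1.11e-34, 5.72e-33, 1.75e-31, 3.49e-30,
               4.92e-29, 5.17e-28, 4.24e-27, 2.83e-26])

def reacKine(T, A, n, Ea):
    return A*((T/298)**n)*np.exp(-Ea/(0.008314*T))

react_model = Model(reacKine)
params = react_model.make_params(A=2.e-11, n=1, Ea=200)
# weights multiply the residual, so 1/kt turns (data - model) into a
# relative error and gives the tiny low-T values a comparable influence
result = react_model.fit(kt, params, T=T, weights=1.0/kt)
print(result.fit_report())

Note that this changes what "best fit" means, so the parameter values will differ somewhat from the unweighted fits shown above.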