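Unfortunately, the power-law fit I get with scipy does not fit well. I tried passing p0 as an input argument with values close to the expected ones, but that did not help.
I would be glad if someone could point out my problem.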
# Imports
from scipy.optimize import curve_fit
import numpy as np
import matplotlib.pyplot as plt
# Data
data = [[0.004408724185371062, 78.78011887652593], [0.005507091456466967, 65.01330508350753], [0.007073553026306459, 58.13364205119446], [0.009417452253958304, 50.12258366028477], [0.01315330108197482, 44.22980301062208], [0.019648758406406834, 35.436139354228956], [0.03248060063099905, 28.359815190205957], [0.06366197723675814, 21.54769216720596], [0.17683882565766149, 14.532777174472574], [1.5915494309189533, 6.156872080264581]]
# Fill lists to store x and y value
x_data,y_data = [], []
for i in data:
    x_data.append(i[0])
    y_data.append(i[1])
# Power-law function: y = c * x**m
def func(x,m,c):
    return x**m * c
# Curve fit
coeff, _ = curve_fit(func, x_data, y_data)
m, c = coeff[0], coeff[1]
# Plot function
x_function = np.linspace(0, 1.5, 100)
y = x_function**m * c
a = plt.scatter(x_data, y_data, s=30, marker = "v")
yfunction = x_function**m * c
plt.plot(x_function, yfunction, '-')
plt.show()
Another data set for which the fit is very poor:
data = [[0.004408724185371062, 194.04075083542443], [0.005507091456466967, 146.09194314074864], [0.007073553026306459, 120.2115882821158], [0.009417452253958304, 74.04014371874908], [0.01315330108197482, 34.167114633194736], [0.019648758406406834, 12.775528348369871], [0.03248060063099905, 7.903195816871708], [0.06366197723675814, 5.186092050500438], [0.17683882565766149, 3.260540592404184], [1.5915494309189533, 2.006254812978579]]
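Answer 0 (score: 1)
I may be missing something, but I think curve_fit works fine. When I compare the residuals obtained with curve_fit to those obtained with the Excel parameters you gave in the comments, the Python result always has the lower residuals (code below). You say "Unfortunately, the power-law fit I get with scipy does not fit well", but what exactly is your measure of a "good" fit? In terms of the sum of squared residuals, the Python fit always beats the Excel one.
I am not sure whether it has to be exactly this function, but if not, you could also consider adding a third parameter to your function (named "d" below), which leads to better results.
Here is the modified code. I changed your "func" and also increased the resolution of the plot. The residuals are then printed as well. For the first data set, one gets about 79.35 with Excel and about 34.29 with Python. For the second data set it is 15220.79 with Excel and 601.08 with Python (assuming I did not mess anything up).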
from scipy.optimize import curve_fit
import numpy as np
import matplotlib.pyplot as plt
# Data
data = [[0.004408724185371062, 78.78011887652593], [0.005507091456466967, 65.01330508350753], [0.007073553026306459, 58.13364205119446], [0.009417452253958304, 50.12258366028477], [0.01315330108197482, 44.22980301062208], [0.019648758406406834, 35.436139354228956], [0.03248060063099905, 28.359815190205957], [0.06366197723675814, 21.54769216720596], [0.17683882565766149, 14.532777174472574], [1.5915494309189533, 6.156872080264581]]
#data = [[0.004408724185371062, 194.04075083542443], [0.005507091456466967, 146.09194314074864], [0.007073553026306459, 120.2115882821158], [0.009417452253958304, 74.04014371874908], [0.01315330108197482, 34.167114633194736], [0.019648758406406834, 12.775528348369871], [0.03248060063099905, 7.903195816871708], [0.06366197723675814, 5.186092050500438], [0.17683882565766149, 3.260540592404184], [1.5915494309189533, 2.006254812978579]]
# Fill lists to store x and y value
x_data,y_data = [], []
for i in data:
    x_data.append(i[0])
    y_data.append(i[1])
# Power-law function: y = c * x**m
def func(x,m,c):
    # slightly rewritten; you could also consider using a third parameter d
    return c*np.power(x,m) # + d
# Curve fit
coeff, _ = curve_fit(func, x_data, y_data)
m, c = coeff[0], coeff[1] #, coeff[2]
print(m, c) #, d
# Plot function
a = plt.scatter(x_data, y_data, s=30, marker = "v")
x_function = np.linspace(0, 1.5, 1000)
yfunction = c*np.power(x_function,m) # + d
plt.plot(x_function, yfunction, '-')
plt.show()
# Sum of squared residuals for the curve_fit result
print("residuals python:", ((y_data - func(x_data, *coeff))**2).sum())
# compare to excel, first data set
print("residuals excel:", ((y_data - func(x_data, -0.425, 7.027))**2).sum())
# compare to excel, second data set
print("residuals excel:", ((y_data - func(x_data, -0.841, 1.0823))**2).sum())
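The suggestion of a third parameter d can be tried out with a short sketch like the one below. This is only an illustration under my own assumptions, not the answerer's exact code: the helper names `power`, `power_offset` and `sse` are made up here, and the choice of starting values (a straight-line fit in log-log space for the two-parameter model, then the two-parameter optimum with d = 0 for the three-parameter model) is one reasonable option among several.

```python
import numpy as np
from scipy.optimize import curve_fit

# First data set from the question
data = np.array([
    [0.004408724185371062, 78.78011887652593],
    [0.005507091456466967, 65.01330508350753],
    [0.007073553026306459, 58.13364205119446],
    [0.009417452253958304, 50.12258366028477],
    [0.01315330108197482, 44.22980301062208],
    [0.019648758406406834, 35.436139354228956],
    [0.03248060063099905, 28.359815190205957],
    [0.06366197723675814, 21.54769216720596],
    [0.17683882565766149, 14.532777174472574],
    [1.5915494309189533, 6.156872080264581],
])
x, y = data[:, 0], data[:, 1]


def power(x, m, c):
    """Two-parameter power law, as in the question."""
    return c * np.power(x, m)


def power_offset(x, m, c, d):
    """Hypothetical three-parameter variant with a constant offset d."""
    return c * np.power(x, m) + d


def sse(model, params):
    """Sum of squared residuals in the original (linear) space."""
    return float(np.sum((y - model(x, *params)) ** 2))


# Assumed starting guess: straight-line fit of log(y) against log(x)
slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
p0_two = [slope, np.exp(intercept)]
coeff_two, _ = curve_fit(power, x, y, p0=p0_two)

# Start the three-parameter fit from the two-parameter optimum with d = 0
p0_three = [coeff_two[0], coeff_two[1], 0.0]
coeff_three, _ = curve_fit(power_offset, x, y, p0=p0_three, maxfev=10000)

sse_two = sse(power, coeff_two)
sse_three = sse(power_offset, coeff_three)
print("two parameters:  ", coeff_two, "SSE =", sse_two)
print("three parameters:", coeff_three, "SSE =", sse_three)
```

Because the three-parameter fit starts from the two-parameter solution, its sum of squared residuals should not come out larger. Whether the extra parameter is physically meaningful for the data is a separate matter.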
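Answer 1 (score: 0)
Take your second data set as an example. If you plot the raw data, the difficulty becomes apparent: the data are very unevenly distributed. Now, since your function has a pure power-law form, the easiest thing is to fit on a log scale: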
In [1]: import numpy as np
In [2]: import matplotlib.pyplot as plt
In [3]: plt.ion()
In [4]: data = [[0.004408724185371062, 194.04075083542443], [0.005507091456466967, 146.09194314074864], [0.007073553026306459, 120.2115882821158], [0.009417452253958304, 74.04014371874908], [0.01315330108197482, 34.167114633194736], [0.019648758406406834, 12.775528348369871], [0.03248060063099905, 7.903195816871708], [0.06366197723675814, 5.186092050500438], [0.17683882565766149, 3.260540592404184], [1.5915494309189533, 2.006254812978579]]
In [5]: data = np.asarray(data) # just for convenience
In [6]: data.shape
Out[6]: (10, 2)
In [7]: x, y = data[:, 0], data[:, 1]
In [8]: lx, ly = np.log(x), np.log(y)
In [9]: plt.plot(lx, ly, 'ro')
Out[9]: [<matplotlib.lines.Line2D at 0x323a250>]
In [10]: def lfunc(x, a, b):
   ....:     return a*x + b
   ....:
In [11]: from scipy.optimize import curve_fit
In [12]: opt, cov = curve_fit(lfunc, lx, ly)
In [13]: opt
Out[13]: array([-0.84071518, 0.07906558])
In [14]: plt.plot(lx, lfunc(lx, *opt), 'b-')
Out[14]: [<matplotlib.lines.Line2D at 0x3be0f90>]
Whether this is the right model for the data is a separate question.
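To relate the log-scale fit back to the original parameters: if log(y) = a*log(x) + b, then y = c * x**m with m = a and c = exp(b). The sketch below redoes the straight-line fit with `np.polyfit` (my substitution for the `curve_fit` call in the session, since the model is linear in log space) and converts the result. The variable names are chosen here for illustration only.

```python
import numpy as np

# Second data set from the question
data = np.array([
    [0.004408724185371062, 194.04075083542443],
    [0.005507091456466967, 146.09194314074864],
    [0.007073553026306459, 120.2115882821158],
    [0.009417452253958304, 74.04014371874908],
    [0.01315330108197482, 34.167114633194736],
    [0.019648758406406834, 12.775528348369871],
    [0.03248060063099905, 7.903195816871708],
    [0.06366197723675814, 5.186092050500438],
    [0.17683882565766149, 3.260540592404184],
    [1.5915494309189533, 2.006254812978579],
])
x, y = data[:, 0], data[:, 1]

# Ordinary least squares on log(y) = a*log(x) + b
a, b = np.polyfit(np.log(x), np.log(y), 1)

# Convert back to the power-law parameters of y = c * x**m
m = a
c = np.exp(b)
print("m =", m, "c =", c)
```

With the values shown in the session (a about -0.8407, b about 0.0791), this gives c = exp(0.0791), roughly 1.082. That matches the Excel parameters quoted in the other answer (-0.841 and 1.0823), which suggests that the Excel result comes from a least-squares fit in log space, whereas curve_fit on the untransformed data minimizes the squared residuals in the original space. The two criteria weight the points differently, so they need not agree.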