将指数和5高斯拟合到python中的数据

时间:2015-03-27 12:36:41

标签: python numpy scipy scikit-learn gaussian

我正在尝试将指数函数和5个高斯拟合到我的数据中。我的目标是沿着这些方向:(其中gDNA Fit是指数; 1-5Nuc Fit是5高斯;总拟合是所有拟合的总和)

enter image description here

我接近它的方式是拟合指数,然后在此基础上引入一个截止点,这将允许我适应高斯人而不考虑已经拟合的数据。 (我已经将数据减少到100,因为这是它下降到0的位置)

问题是我似乎无法正确地适应指数而且高斯人的规模不合适:

from scipy.optimize import curve_fit
from pylab import *
import matplotlib.pyplot

#Exponential

x = np.array([1.010000000000000000e+02,1.100000000000000000e+02,1.190000000000000000e+02,1.280000000000000000e+02,1.370000000000000000e+02,1.460000000000000000e+02,1.550000000000000000e+02,1.640000000000000000e+02,1.730000000000000000e+02,1.820000000000000000e+02,1.910000000000000000e+02,2.000000000000000000e+02,2.090000000000000000e+02,2.180000000000000000e+02,2.270000000000000000e+02,2.360000000000000000e+02,2.450000000000000000e+02,2.540000000000000000e+02,2.630000000000000000e+02,2.720000000000000000e+02,2.810000000000000000e+02,2.900000000000000000e+02,2.990000000000000000e+02,3.080000000000000000e+02,3.170000000000000000e+02,3.260000000000000000e+02,3.350000000000000000e+02,3.440000000000000000e+02,3.530000000000000000e+02,3.620000000000000000e+02,3.710000000000000000e+02,3.800000000000000000e+02,3.890000000000000000e+02,3.980000000000000000e+02,4.070000000000000000e+02,4.160000000000000000e+02,4.250000000000000000e+02,4.340000000000000000e+02,4.430000000000000000e+02,4.520000000000000000e+02,4.610000000000000000e+02,4.700000000000000000e+02,4.790000000000000000e+02,4.880000000000000000e+02,4.970000000000000000e+02,5.060000000000000000e+02,5.150000000000000000e+02,5.240000000000000000e+02,5.330000000000000000e+02,5.420000000000000000e+02,5.510000000000000000e+02,5.600000000000000000e+02,5.690000000000000000e+02,5.780000000000000000e+02,5.870000000000000000e+02,5.960000000000000000e+02,6.050000000000000000e+02,6.140000000000000000e+02,6.230000000000000000e+02,6.320000000000000000e+02,6.410000000000000000e+02,6.500000000000000000e+02,6.590000000000000000e+02,6.680000000000000000e+02,6.770000000000000000e+02,6.860000000000000000e+02,6.950000000000000000e+02,7.040000000000000000e+02,7.130000000000000000e+02,7.220000000000000000e+02,7.310000000000000000e+02,7.400000000000000000e+02,7.490000000000000000e+02,7.580000000000000000e+02,7.670000000000000000e+02,7.760000000000000000e+02,7.850000000000000000e+02,7.940000000000000000e+02,8.030000000000000000e+02,8.120000000000000000e+02,8.210000000000000000e+02,8.300000000000000000e+02,8.390000000000000000e+02,8.480000000000000000e+02,8.570000000000000000e+02,8.660000000000000000e+02,8.750000000000000000e+02,8.840000000000000000e+02,8.930000000000000000e+02,9.020000000000000000e+02,9.110000000000000000e+02,9.200000000000000000e+02,9.290000000000000000e+02,9.380000000000000000e+02,9.470000000000000000e+02,9.560000000000000000e+02,9.650000000000000000e+02,9.740000000000000000e+02,9.830000000000000000e+02,9.920000000000000000e+02])
y = np.array([3.579280000000000000e+05,3.172290000000000000e+05,1.759610000000000000e+05,1.352610000000000000e+05,1.069130000000000000e+05,9.721000000000000000e+04,9.908200000000000000e+04,1.168480000000000000e+05,1.266880000000000000e+05,1.264760000000000000e+05,1.279850000000000000e+05,1.198880000000000000e+05,1.117730000000000000e+05,1.005850000000000000e+05,9.038500000000000000e+04,7.532400000000000000e+04,6.235500000000000000e+04,5.249600000000000000e+04,4.445600000000000000e+04,3.808000000000000000e+04,3.612100000000000000e+04,3.460600000000000000e+04,3.209700000000000000e+04,3.008200000000000000e+04,3.090700000000000000e+04,3.208600000000000000e+04,2.949700000000000000e+04,3.111600000000000000e+04,3.125700000000000000e+04,3.152700000000000000e+04,3.198700000000000000e+04,3.373800000000000000e+04,3.171200000000000000e+04,3.124900000000000000e+04,3.109700000000000000e+04,3.002200000000000000e+04,2.720100000000000000e+04,2.413600000000000000e+04,1.873100000000000000e+04,1.768900000000000000e+04,1.510600000000000000e+04,1.358800000000000000e+04,1.354400000000000000e+04,1.198900000000000000e+04,1.182800000000000000e+04,6.926000000000000000e+03,1.230000000000000000e+04,3.734000000000000000e+03,6.631000000000000000e+03,7.085000000000000000e+03,7.151000000000000000e+03,7.195000000000000000e+03,7.265000000000000000e+03,6.966000000000000000e+03,6.823000000000000000e+03,6.357000000000000000e+03,5.977000000000000000e+03,5.464000000000000000e+03,4.941000000000000000e+03,4.543000000000000000e+03,3.992000000000000000e+03,3.593000000000000000e+03,3.156000000000000000e+03,2.955000000000000000e+03,2.740000000000000000e+03,2.701000000000000000e+03,2.528000000000000000e+03,2.481000000000000000e+03,2.527000000000000000e+03,2.476000000000000000e+03,2.456000000000000000e+03,2.461000000000000000e+03,2.420000000000000000e+03,2.346000000000000000e+03,2.326000000000000000e+03,2.278000000000000000e+03,2.108000000000000000e+03,1.893000000000000000e+03,1.771000000000000000e+03,1.654000000000000000e+03,1.547000000000000000e+03,1.389000000000000000e+03,1.325000000000000000e+03,1.130000000000000000e+03,1.057000000000000000e+03,9.460000000000000000e+02,9.790000000000000000e+02,8.990000000000000000e+02,8.460000000000000000e+02,8.360000000000000000e+02,8.040000000000000000e+02,8.330000000000000000e+02,7.690000000000000000e+02,7.020000000000000000e+02,7.360000000000000000e+02,6.390000000000000000e+02,6.690000000000000000e+02,6.770000000000000000e+02,6.100000000000000000e+02,5.700000000000000000e+02])

def func(x, a, c, d):
    return a*np.exp(-c*x)+d


#print np.exp(-x)

popt, pcov = curve_fit(func, x, y, p0=(1, 0.01, 1))

yy = func(x, *popt)

matplotlib.pyplot.plot(x, y, 'ko')
matplotlib.pyplot.plot(x, yy)

#gaussian

from sklearn import mixture
import scipy

gmm = mixture.GMM(n_components=5, covariance_type='full')
gmm.fit(y)
pdfs = [p * scipy.stats.norm.pdf(x, mu, sd) for mu, sd, p in zip(gmm.means_, (gmm.covars_)**2, gmm.weights_)]


density = np.sum(np.array(pdfs), axis=0)
#print density
matplotlib.pyplot.plot(x, density)

show()

1 个答案:

答案 0 :(得分:1)

如果您不介意使用最小二乘而不是最大可能性,我建议立即拟合整个模型,包括指数与... scipy curve_fit。如果忽略高斯峰的存在,你永远不会很好地适应指数。我建议使用peak-o-mat(http://lorentz.sf.net),这是一个用python编写的交互式曲线拟合软件。几秒钟内你就可以得到这样的结果:

multi peak fit example