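I have data that I want to fit with two Gaussians while keeping one of the means global, that is, shared across all data sets. I have written a Python program using the scipy, lmfit and numpy libraries. These are the results when each data set is fitted independently (least squares):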
mean1 sd1 A1 mean2 sd2 A2 y0
12.24 10.20 27526 25.50 20.42 30642 499.93
21.43 10.20 27529 25.51 20.39 30616 500.32
25.51 20.40 30599 30.61 10.21 27552 500.16
39.80 10.20 27536 25.52 20.42 30636 499.85
25.51 20.41 30616 48.98 10.21 27559 499.94
My model function:
y0 + sqrt(2/PI)*A1/w1*exp(-2*(x-xc1)^2/w1^2) + sqrt(2/PI)*A2/w2*exp(-2*(x-xc2)^2/w2^2)
Sorry, I don't know how to turn this into a proper math formula.
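For reference, the plain-text expression can be typeset as follows. This is only a transcription of that expression. The mapping xc = mean and w = sd is taken from the signature gauss(x, mean, sd, A) in the script further down, and the remark about the conventional standard deviation is an added observation, not something the question states.

```latex
f(x) = y_0
     + \sqrt{\frac{2}{\pi}}\,\frac{A_1}{w_1}
       \exp\!\left(-\frac{2\,(x - x_{c1})^2}{w_1^2}\right)
     + \sqrt{\frac{2}{\pi}}\,\frac{A_2}{w_2}
       \exp\!\left(-\frac{2\,(x - x_{c2})^2}{w_2^2}\right)
```

Written this way, each A is the area under its peak, and each w equals twice the conventional standard deviation (w = 2σ), which is worth keeping in mind when comparing the sd columns with values from other tools.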
This is a test, so the correct answers must be:
set mean1 sd1 A1 mean2 sd2 A2 y0
1 12 10 27000 25 20 30000 500
2 21 10 27000 25 20 30000 500
3 30 10 27000 25 20 30000 500
4 39 10 27000 25 20 30000 500
5 48 10 27000 25 20 30000 500
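As you can see, the independent fits work well. The problem is that my fitting program sometimes swaps the parameter values of the first and second Gaussian. This means that if I now try to tie mean2 across all data sets, it will go wrong, because in the 3rd and 5th data sets the two Gaussians are swapped, so mean2 will be incorrect (though I am not sure). In this example mean2 must always be 25. The problem is even more severe with real data.
Basically, as I understand it, my function is f = y0 + gauss1 + gauss2 and both Gaussians have the same form, so the fit has no way to tell gauss1 from gauss2 and sometimes mixes them up.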
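Because swapping the two components leaves the fitted curve unchanged, the labels can be fixed after the fit. The sketch below is one possible approach, not part of the original program: the helper order_by_width is a hypothetical name, and it assumes the narrower peak should always be reported as gauss1, which happens to hold for this test data but may not for real data.

```python
def order_by_width(row):
    """Return (mean1, sd1, A1, mean2, sd2, A2, y0) with the narrower peak first."""
    mean1, sd1, A1, mean2, sd2, A2, y0 = row
    if sd1 > sd2:
        # Swap the two components; y0 + gauss1 + gauss2 is unchanged by this.
        return (mean2, sd2, A2, mean1, sd1, A1, y0)
    return tuple(row)


# Independent-fit results copied from the first table in the question.
fits = [
    (12.24, 10.20, 27526, 25.50, 20.42, 30642, 499.93),
    (21.43, 10.20, 27529, 25.51, 20.39, 30616, 500.32),
    (25.51, 20.40, 30599, 30.61, 10.21, 27552, 500.16),
    (39.80, 10.20, 27536, 25.52, 20.42, 30636, 499.85),
    (25.51, 20.41, 30616, 48.98, 10.21, 27559, 499.94),
]

ordered = [order_by_width(r) for r in fits]
for r in ordered:
    print("\t".join(str(v) for v in r))
```

After reordering, the mean2 column holds the wide peak in every row, so the swapped rows 3 and 5 line up with the others.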
Output of the global fit:
mean1 sd1 A1 mean2 sd2 A2 y0
12.28 10.31 28483 25.90 19.77 29169 508.60
21.42 10.42 29148 25.90 20.51 28746 505.21
30.61 9.99 26045 25.90 20.26 32149 499.46
39.84 10.11 26605 25.90 21.44 33000 475.15
48.87 9.49 25000 25.90 23.00 33000 485.45
The test data used (tab separated):
321 759 568 567 567 567
322 877 587 585 585 585
323 1033 610 606 606 606
324 1231 639 632 632 632
325 1471 675 662 662 662
326 1745 721 697 697 697
327 2043 780 737 737 737
328 2346 855 782 782 782
329 2632 954 833 833 833
330 2877 1080 889 889 889
331 3061 1241 951 949 949
332 3168 1440 1017 1014 1014
333 3194 1682 1089 1083 1083
334 3142 1962 1166 1154 1154
335 3025 2275 1250 1226 1226
336 2863 2605 1341 1298 1298
337 2676 2933 1442 1369 1369
338 2485 3236 1558 1437 1437
339 2308 3488 1691 1500 1500
340 2155 3668 1848 1558 1556
341 2031 3759 2031 1608 1605
342 1936 3756 2243 1651 1644
343 1865 3662 2482 1686 1673
344 1812 3490 2739 1715 1691
345 1770 3261 3003 1740 1697
346 1734 2997 3255 1764 1691
347 1697 2722 3473 1794 1673
348 1657 2453 3633 1836 1645
349 1611 2204 3716 1896 1606
350 1560 1983 3710 1983 1560
351 1501 1791 3611 2099 1506
352 1437 1628 3425 2245 1450
353 1369 1490 3168 2418 1393
354 1298 1372 2863 2605 1341
355 1226 1269 2533 2790 1299
356 1154 1177 2202 2953 1274
357 1083 1095 1891 3071 1274
358 1014 1021 1613 3126 1306
359 949 952 1376 3103 1376
360 889 890 1180 3000 1488
361 833 833 1024 2821 1641
362 782 782 903 2582 1831
363 737 737 810 2301 2043
364 697 697 740 2003 2261
365 662 662 686 1711 2461
366 632 632 645 1440 2621
367 606 606 613 1205 2718
368 585 585 588 1011 2739
369 567 567 569 859 2679
My script (uncomment the part marked "UNCOMMENT FOR GLOBAL FIT" to run the global fit):
import numpy as np
import matplotlib.pyplot as plt
from lmfit import minimize, Parameters, report_fit

# python 3.3
# Unofficial Windows Binaries for Python Extension Packages
# http://www.lfd.uci.edu/~gohlke/pythonlibs/

# VARIABLES
show_plot = 1
size_cols = 11
size_rows = 50
nm_start = 320
data_sets = 5
file_name = "5_testas.txt"

# read the whitespace-separated file into intens[row][col] (values kept as strings)
intens = [[[0] for i in range(size_cols)] for j in range(size_rows)]
with open(file_name) as f:
    for row in range(size_rows):
        for col, datab in enumerate(f.readline().split()):
            intens[row][col] = datab

#def gauss(x, amp, cen, sigma):
#    "basic gaussian"
def gauss(x, mean, sd, A):
    "basic gaussian"
    return np.sqrt(2/np.pi)*A/sd*np.exp(-2*np.power(((x-mean)/sd), 2))

def gauss_dataset(params, i, x):
    """calc offset plus both gaussians from params for data set i
    using simple, hardwired naming convention"""
    mean1 = params['mean1_%i' % (i+1)].value
    sd1 = params['sd1_%i' % (i+1)].value
    A1 = params['A1_%i' % (i+1)].value
    mean2 = params['mean2_%i' % (i+1)].value
    sd2 = params['sd2_%i' % (i+1)].value
    A2 = params['A2_%i' % (i+1)].value
    y0 = params['y0_%i' % (i+1)].value
    return y0 + gauss(x, mean1, sd1, A1) + gauss(x, mean2, sd2, A2)

def gauss_dataset_a(params, i, x):
    """calc offset plus the first gaussian only, for data set i
    (used to plot the components separately)"""
    mean1 = params['mean1_%i' % (i+1)].value
    sd1 = params['sd1_%i' % (i+1)].value
    A1 = params['A1_%i' % (i+1)].value
    y0 = params['y0_%i' % (i+1)].value
    return y0 + gauss(x, mean1, sd1, A1)

def gauss_dataset_b(params, i, x):
    """calc offset plus the second gaussian only, for data set i
    (used to plot the components separately)"""
    mean2 = params['mean2_%i' % (i+1)].value
    sd2 = params['sd2_%i' % (i+1)].value
    A2 = params['A2_%i' % (i+1)].value
    y0 = params['y0_%i' % (i+1)].value
    return y0 + gauss(x, mean2, sd2, A2)

def objective(params, x, data):
    """ calculate total residual for fits to several data sets held
    in a 2-D array, and modeled by Gaussian functions"""
    ndata, nx = data.shape
    resid = 0.0*data[:]
    # make residual per data set
    for i in range(ndata):
        resid[i, :] = data[i, :] - gauss_dataset(params, i, x)
    # now flatten this to a 1D array, as minimize() needs
    return resid.flatten()

x = np.linspace(0, 50, 50)
data = []
# dummy data, only used to allocate an array of the right size
for i in np.arange(data_sets):
    dat = gauss(x, 1, 1, 1)
    data.append(dat)
# data has shape (data_sets, len(x))
data = np.array(data)
# Rearrange data: one row per data set, skipping column 0 of the file.
for col in range(0, data_sets):
    for row in range(0, size_rows):
        data[col][row] = intens[row][col+1]

# create 5 sets of parameters, one per data set
fit_params = Parameters()
for iy, y in enumerate(data):
    fit_params.add('mean1_%i' % (iy+1), value=26.0, min=0.0, max=50.0)
    fit_params.add('mean2_%i' % (iy+1), value=26.0, min=0.0, max=50.0)
    fit_params.add('A1_%i' % (iy+1), value=28500.0, min=25000.0, max=33000.0)
    fit_params.add('A2_%i' % (iy+1), value=28500.0, min=25000.0, max=33000.0)
    fit_params.add('sd1_%i' % (iy+1), value=15.0, min=7.0, max=23.0)
    fit_params.add('sd2_%i' % (iy+1), value=15.0, min=7.0, max=23.0)
    fit_params.add('y0_%i' % (iy+1), value=1000.0, min=300.0, max=1500.0)

# UNCOMMENT FOR GLOBAL FIT
#for iy in range(2, data_sets+1):
#    fit_params['mean2_%i' % iy].expr = 'mean2_1'

# run the global fit to all the data sets
result = minimize(objective, fit_params, args=(x, data))
# lmfit 0.9 and later return the fitted values in result.params
# rather than updating fit_params in place
fit_params = result.params

# print the fitted parameters, then plot the data sets and fits
plt.figure()
print('mean1\tsd1\tA1\tmean2\tsd2\tA2\ty0')
row_fmt = '%0.2f\t%0.2f\t%0.0f\t%0.2f\t%0.2f\t%0.0f\t%0.2f'
names = ('mean1', 'sd1', 'A1', 'mean2', 'sd2', 'A2', 'y0')
for i in range(data_sets):
    print(row_fmt % tuple(fit_params['%s_%i' % (n, i+1)].value for n in names))
if show_plot == 1:
    for i in range(data_sets):
        y_fit = gauss_dataset(fit_params, i, x)
        y_fit_a = gauss_dataset_a(fit_params, i, x)
        y_fit_b = gauss_dataset_b(fit_params, i, x)
        plt.plot(x, data[i, :], 'o', x, y_fit, '-')
        plt.plot(x, data[i, :], 'o', x, y_fit_a, '-')
        plt.plot(x, data[i, :], 'o', x, y_fit_b, '-')
    plt.show()

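So, how can I improve my code? Does the global fit really give a wrong mean? It comes out fairly close to 25, but I have no tool to check it. Also, is it normal that my values are a bit "off" from the true ones? For example, I do not get mean2 = 25; for every data set it is ~25.5.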
Answer 0 (score: 0)
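First of all, here is a plot of your data:
[figure: plot of the data sets]
When you start both Gaussian curves from the same parameter values, it is clear that the computer cannot know which curve should correspond to which peak in the data. So, what can you do?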
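The point about identical starting values can be checked numerically. The following is a minimal sketch, not the answerer's code: the helpers model and jacobian are hypothetical names, the x grid is simplified to unit spacing, same_start uses the starting values from the question's script, and the widths 10 and 20 in diff_start are illustrative guesses. It shows that with identical starting values the model responds in exactly the same way to a change in a gauss1 parameter as to a change in the matching gauss2 parameter, so a least-squares step has nothing to tell the two components apart.

```python
import numpy as np


def gauss(x, mean, sd, A):
    # Same peak shape as in the question's script.
    return np.sqrt(2 / np.pi) * A / sd * np.exp(-2 * ((x - mean) / sd) ** 2)


def model(p, x):
    mean1, sd1, A1, mean2, sd2, A2, y0 = p
    return y0 + gauss(x, mean1, sd1, A1) + gauss(x, mean2, sd2, A2)


def jacobian(p, x, rel=1e-6):
    # Forward-difference derivative of the model with respect to each parameter.
    p = np.asarray(p, dtype=float)
    base = model(p, x)
    cols = []
    for k in range(p.size):
        q = p.copy()
        h = rel * max(abs(p[k]), 1.0)
        q[k] += h
        cols.append((model(q, x) - base) / h)
    return np.column_stack(cols)


x = np.arange(50.0)
# Starting values from the script: both components identical.
same_start = [26.0, 15.0, 28500.0, 26.0, 15.0, 28500.0, 1000.0]
# Hypothetical alternative: different starting widths for the two components.
diff_start = [26.0, 10.0, 28500.0, 26.0, 20.0, 28500.0, 1000.0]

J_same = jacobian(same_start, x)
J_diff = jacobian(diff_start, x)

# Columns 0..2 belong to gauss1, columns 3..5 to gauss2.
identical = np.allclose(J_same[:, :3], J_same[:, 3:6], atol=1e-4)
distinct = not np.allclose(J_diff[:, :3], J_diff[:, 3:6], atol=1e-4)
print(identical, distinct)
```

Giving the two components different starting widths, or non-overlapping min/max bounds on sd1 and sd2, is one common way to break this tie. Whether that is appropriate depends on knowing in advance which peak is the wider one.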
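I can also confirm that there is an offset of about 1, at least for column 2. It looks like the x values used in the Gaussian function are not the same as the x values of the data.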
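One possible source of such a mismatch, offered here as an assumption rather than as the answerer's diagnosis, is the grid x = np.linspace(0, 50, 50) in the script: 50 points from 0 to 50 inclusive are spaced 50/49 apart, not 1 apart. The sketch below compares that grid with a unit-spaced one and shows where peaks located at the expected means would appear. The names x_script, x_unit and apparent_means are illustrative.

```python
import numpy as np

# Grid used in the script: 50 points from 0 to 50 inclusive.
x_script = np.linspace(0, 50, 50)
step = x_script[1] - x_script[0]  # 50/49, slightly more than 1

# Alternative grid with unit spacing, one point per data row.
x_unit = np.arange(50)

# Expected means from the question (mean1 of sets 1..5, then the shared mean2).
true_means = np.array([12, 21, 30, 39, 48, 25])

# Position on the script's grid of a peak that sits at row index m.
apparent_means = true_means * step

print(step)
print(np.round(apparent_means, 2))
```

The scaled values agree with the independently fitted means in the question's first table (for example 25.51 instead of 25), which suggests the small discrepancy comes from the x grid and not from the fit itself. If that is the case, building x with unit spacing, or from the first column of the data file, should remove it.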