我有两个数据集,其中两个值在哪里测量。我对差异的价值和标准差之间的差异感兴趣。我做了一个直方图,我想要适合两个正态分布。计算最大值之间的差异。我还想评估在数据集中我对一个值的数据少得多的影响。我已经看过这个链接,但它并不是我真正需要的东西: Python: finding the intersection point of two gaussian curves
for ii in range(2,8):
# Kanal = ii - 1
file = filepath + '\Mappe1.txt'
data = np.loadtxt(file, delimiter='\t', skiprows=1)
data = data[:,ii]
plt.hist(data,bins=100)
plt.xlabel("bins")
plt.ylabel("Counts")
plt.tight_layout()
plt.grid()
plt.figure()
plt.show()
答案 0 :(得分:0)
使用 scipy
:
from scipy.optimize import curve_fit #non linear curve fitting tool
from matplotlib import pyplot as plt
def func2fit(x1,x2,m_1,m_2,std_1,std_2,height1, height2): #define a simple gauss curve
return height1*exp(-(x1-m_1)**2/2/std_1**2)+height2*exp(-(x2-m_2)**2/2/std_2**2)
init_guess=(-.3,.3,.5,.5,3000,3000)
#contains the initial guesses for the parameters (m_1, m_2, std_1, std_2, height1, height2) using your first figure
#do the fitting
fit_pars, pcov =curve_fit(func2fit,xdata,ydata,init_guess)
#fit_pars contains the mean, the heights and the SD values, pcov contains the estimated covariance of these parameters
plt.plot(xdata,func2fit(xdata,*fit_pars),label='fit') #plot the fit
有关进一步参考,请参阅scipy手册页: curve_fit
答案 1 :(得分:0)
假设两个样本是独立的,则不需要使用曲线拟合来处理这个问题。它的基本统计数据。这里有一些代码可以进行所需的计算,并在评论中将来源归结。
## adapted from http://onlinestatbook.com/2/estimation/difference_means.html
from random import gauss
from numpy import sqrt
sample_1 = [ gauss(0,1) for _ in range(10) ]
sample_2 = [ gauss(1,.5) for _ in range(20) ]
n_1 = len(sample_1)
n_2 = len(sample_2)
mean_1 = sum(sample_1)/n_1
mean_2 = sum(sample_2)/n_2
SSE = sum([(_-mean_1)**2 for _ in sample_1]) + sum([(_-mean_2)**2 for _ in sample_2])
df = (n_1-1) + (n_2-1)
MSE = SSE/df
n_h = 2 / ( 1/n_1 + 1/n_2 )
s_mean_diff = sqrt( 2* MSE / n_h )
print ( 'difference between means', abs(n_1-n_2))
print ( 'std dev of this difference', s_mean_diff )