Fitting a curve to a histogram in Python

Time: 2018-06-08 13:34:39

Tags: python pandas distribution curve-fitting

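I created a histogram from a pandas DataFrame and would like to fit a probability distribution to it. I tried it myself, but the curve does not fit well (image: Histogram & Curve). My code so far: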

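import matplotlib.pyplot as plt
import pandas as pd
from scipy import stats

# df_distr is the existing DataFrame; 'Strecke' is the column to analyse
h = sorted(df_distr['Strecke'])
m = df_distr['Strecke'].mean()
std = df_distr['Strecke'].std()

fig = plt.figure(figsize=(16, 9))

# the histogram of the data, with unit-width bins from -1 to 499
binwidth = range(-1, 500)
# density=True replaces the normed argument, which newer Matplotlib releases removed
n, bins, patches = plt.hist(h, bins=binwidth, density=True, facecolor='green', alpha=0.75, histtype='step')
df = pd.DataFrame({'Strecke': bins[:-1]+1, 'Propability': n})

# add a 'best fit' line; stats.norm.pdf replaces mlab.normpdf, which was removed from Matplotlib
y = stats.norm.pdf(bins, m, std)
l = plt.plot(bins, y, 'r--', linewidth=1)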

Is it possible to fit the curve better? And are other distributions available, such as halfnorm, lognorm, or Weibull?
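For what it is worth, scipy.stats does ship halfnorm, lognorm, and weibull_min, and each has a fit method that returns maximum-likelihood parameters. The following is only a minimal sketch: the synthetic `data` array is an assumption standing in for df_distr['Strecke'], and the names `candidates` and `fitted` are made up for illustration.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for df_distr['Strecke'] (assumption: positive, right-skewed values).
rng = np.random.default_rng(0)
data = rng.lognormal(mean=3.0, sigma=0.5, size=2000)

# Candidate distributions named in the question.
candidates = {
    "halfnorm": stats.halfnorm,
    "lognorm": stats.lognorm,
    "weibull_min": stats.weibull_min,
}

# Grid on which the fitted densities are evaluated.
x = np.linspace(data.min(), data.max(), 200)

fitted = {}
for name, dist in candidates.items():
    # fit() returns the shape parameter(s), if any, followed by loc and scale.
    params = dist.fit(data)
    fitted[name] = (params, dist.pdf(x, *params))
    print(name, params)
```

Each stored PDF can then be drawn over a density-normalised histogram with plt.plot(x, pdf, label=name).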

Update: In the end I was able to find the best-fitting distribution for my data set, using the following code:

# the histogram of the data (density=True replaces the removed normed argument)
binwidth = range(-1,500)
n, bins, patches = plt.hist(h, bins=binwidth, density=True, facecolor='cyan', alpha=0.5, label="Histogram")
xt=plt.xticks()[0]
xmin, xmax = 0,max(xt)
lnspc = np.linspace(xmin,xmax,500)

m,s = stats.norm.fit(h)
pdf_g=stats.norm.pdf(lnspc,m,s)
#plt.plot(lnspc,pdf_g, label="Normal")

ag,bg,cg = stats.gamma.fit(h)  
pdf_gamma = stats.gamma.pdf(lnspc, ag, bg,cg)  
#plt.plot(lnspc, pdf_gamma, label="Gamma")

ab,bb,cb,db = stats.beta.fit(h)  
pdf_beta = stats.beta.pdf(lnspc, ab, bb,cb, db)  
#plt.plot(lnspc, pdf_beta, label="Beta")

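# gev is not defined in this snippet; assuming it referred to scipy.stats.genextreme (generalized extreme value)
gevfit = stats.genextreme.fit(h)
pdf_gev = stats.genextreme.pdf(lnspc, *gevfit)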
plt.plot(lnspc, pdf_gev, label="GEV")

logfit = stats.lognorm.fit(h)  
pdf_lognorm = stats.lognorm.pdf(lnspc, *logfit)  
plt.plot(lnspc, pdf_lognorm, label="LogNormal")

weibfit = stats.weibull_min.fit(h)  
pdf_weib = stats.weibull_min.pdf(lnspc, *weibfit)  
#plt.plot(lnspc, pdf_weib, label="Weibull")

exponweibfit = stats.exponweib.fit(h)  
pdf_exponweib = stats.exponweib.pdf(lnspc, *exponweibfit)  
plt.plot(lnspc, pdf_exponweib, label="Exponentiated Weibull")

paretofit = stats.pareto.fit(h)
pdf_pareto = stats.pareto.pdf(lnspc, *paretofit)
plt.plot(lnspc, pdf_pareto, label ="Pareto")

plt.legend()


df = pd.DataFrame({'Strecke': bins[:-1]+1, 'Propability': n})
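# R² of a linear regression between the histogram heights and each fitted PDF.
# Note: the PDFs are evaluated on lnspc, not at the bin positions, so the two
# arrays only correspond point by point if lnspc happens to match the bins.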
slope, intercept, r_value_norm, p_value, std_err = stats.linregress(df['Propability'],pdf_g)
#print ("R-squared Normal Distribution:", r_value_norm**2)

slope, intercept, r_value_gamma, p_value, std_err = stats.linregress(df['Propability'],pdf_gamma)
#print ("R-squared Gamma Distribution:", r_value_gamma**2)

slope, intercept, r_value_beta, p_value, std_err = stats.linregress(df['Propability'],pdf_beta)
#print ("R-squared Beta Distribution:", r_value_beta**2)

slope, intercept, r_value_gev, p_value, std_err = stats.linregress(df['Propability'],pdf_gev)
#print ("R-squared GEV Distribution:", r_value_gev**2)

slope, intercept, r_value_lognorm, p_value, std_err = stats.linregress(df['Propability'],pdf_lognorm)
#print ("R-squared LogNormal Distribution:", r_value_lognorm**2)

slope, intercept, r_value_weibull, p_value, std_err = stats.linregress(df['Propability'],pdf_weib)
#print ("R-squared Weibull Distribution:", r_value_weibull**2)

slope, intercept, r_value_exponweibull, p_value, std_err = stats.linregress(df['Propability'],pdf_exponweib)

slope, intercept, r_value_pareto, p_value, std_err = stats.linregress(df['Propability'],pdf_pareto)
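As a possible alternative to the R² calculation above, the fitted PDFs can be evaluated at the bin centres, so that each histogram height is compared with the density at the same x position. The sketch below also computes the Kolmogorov-Smirnov statistic, which does not depend on binning. It is only an illustration under stated assumptions: the synthetic `data` stands in for the real column, the choice of 50 bins and three candidate distributions is arbitrary, and `results` and `best` are hypothetical names.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for the real data (assumption: gamma-like, right-skewed).
rng = np.random.default_rng(1)
data = rng.gamma(shape=2.0, scale=30.0, size=3000)

# Density-normalised histogram and the centre of each bin.
heights, edges = np.histogram(data, bins=50, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])

results = {}
for name in ("norm", "gamma", "lognorm"):
    dist = getattr(stats, name)
    params = dist.fit(data)                      # maximum-likelihood parameters
    pdf = dist.pdf(centers, *params)             # density at the bin centres
    sse = float(np.sum((heights - pdf) ** 2))    # sum of squared errors vs. histogram
    ks = float(stats.kstest(data, name, args=params).statistic)
    results[name] = {"sse": sse, "ks": ks}

# Smaller KS statistic means the fitted CDF is closer to the empirical CDF.
best = min(results, key=lambda k: results[k]["ks"])
print(best, results[best])
```

Because the parameters are estimated from the same data, the p-value returned by kstest would be optimistic, so only the statistic is used here to rank the candidates.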

As an example, I got these plots (image: Final Plot of fitted curves). Thanks for your help!

0 answers:

No answers