What is the Python equivalent of fitdist and histfit?

Time: 2019-01-23 16:35:09

Tags: python matlab matplotlib scipy

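---Sample---

I have a data set (sample) containing 1000 damage values in a one-dimensional array (see the attached .json file); the values are very small (< 1e-6). The sample appears to follow a lognormal distribution: [Figure: histogram of the data set (sample) and its counts]

---Problem and what I have tried---

I tried the suggestions in the post "Fitting empirical distribution to theoretical ones with Scipy (Python)?" and the post "Scipy: lognormal fitting" to fit my data with a lognormal distribution. Neither worked. :(

I always get very large values on the Y-axis, as shown below:

[Figure: plot of the fitted distribution]

Here is the code I used in Python (the data.json file can be downloaded from here):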

from matplotlib import pyplot as plt
from scipy import stats as scistats
import json
with open("data.json", "r") as f:
  sample = json.load(f) # load data: a 1000 * 1 array with many small values( < 1e-6)
fig, axis = plt.subplots() # initiate a figure
N, nbins, patches = axis.hist(sample, bins = 40) # plot sample by histogram
axis.ticklabel_format(style = 'sci', scilimits = (-3, 4), axis = 'x') # use scientific notation on the X-axis
axis.set_xlabel("Value")
axis.set_ylabel("Count")    
plt.show()

fig, axis = plt.subplots()
param = scistats.lognorm.fit(sample) # fit data by Lognormal distribution
pdf_fitted = scistats.lognorm.pdf(nbins, * param[: -2], loc = param[-2], scale = param[-1]) # evaluate the fitted PDF at the bin edges for plotting
axis.plot(nbins, pdf_fitted) # draw fitted distribution on the same figure
plt.show()

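I tried other distributions as well, but whenever I plot the result, the Y-axis values are far too large to draw together with the histogram. Where did I go wrong?

I also tried the suggestion in another question, "Use scipy lognormal distribution to fit data with small values, then show in matplotlib", but the values in the variable pdf_fitted are always too large.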
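One likely reason for the mismatch in scale is that the histogram shows counts per bin, while lognorm.pdf returns a probability density, which for values around 1e-7 is naturally of the order of 1e7. The sketch below illustrates this numerically. It assumes synthetic lognormal data in place of data.json, and it fits the two-parameter lognormal (location fixed at 0) by fitting a normal distribution to the logarithms, which is the same model that scistats.lognorm.fit(sample, floc=0) targets. The variable names are illustrative only.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for data.json (an assumption, not the original data)
rng = np.random.default_rng(0)
sample = rng.lognormal(mean=-16.5, sigma=0.6, size=1000)

# Two-parameter lognormal fit (location fixed at 0):
# fit a normal distribution to log(sample)
mu = np.log(sample).mean()
sigma = np.log(sample).std()
fitted = stats.lognorm(s=sigma, scale=np.exp(mu))

# Histogram in count units, as drawn by axis.hist(sample, bins=40)
counts, edges = np.histogram(sample, bins=40)
centers = 0.5 * (edges[:-1] + edges[1:])
bin_width = edges[1] - edges[0]

# The density integrates to 1 over x, so it is huge for tiny x values
density = fitted.pdf(centers)

# Multiplying by N * bin_width converts the density to expected counts
expected_counts = density * sample.size * bin_width

print(f"peak density:        {density.max():.3g}")
print(f"peak expected count: {expected_counts.max():.3g}")
print(f"peak observed count: {counts.max()}")
```

The alternative is to leave the density untouched and normalize the histogram instead, for example with axis.hist(sample, bins=40, density=True), so that both curves share density units.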

---Expected result---

Basically, what I want is something like this:

[Figure: screenshot of the Matlab histfit output]

Here is the Matlab code I used for the screenshot above:

fname = 'data.json';
sample = jsondecode(fileread(fname));

% fitting distribution
pd = fitdist(sample, 'lognormal')

% A combined command for plotting histogram and distribution
figure();
histfit(sample,40,"lognormal")

So if you know of equivalent commands to fitdist and histfit in Python / Scipy / Numpy / Matplotlib, please post them!

Thanks a lot!

3 answers:

Answer 0: (score: 3)

Try the distfit (or fitdist) library.

https://erdogant.github.io/distfit/

pip install distfit

import numpy as np

# Example data
X = np.random.normal(10, 3, 2000)
y = [3,4,5,6,10,11,12,18,20]

# From the distfit library import the class distfit
from distfit import distfit

# Initialize
dist = distfit()

# Search for the best theoretical fit on your empirical data
dist.fit_transform(X)

# Plot
dist.plot()

# Summary plot
dist.plot_summary()

So in your case it should be:

dist = distfit(distr='lognorm')
dist.fit_transform(X)
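If an extra dependency is not wanted, a histfit-style overlay can also be sketched with SciPy and Matplotlib alone. The helper below is hypothetical (histfit_lognormal is not part of any library) and only loosely modelled on Matlab's histfit(sample, 40, 'lognormal'): it draws a count histogram and overlays a two-parameter lognormal curve scaled to count units. The fit on log(sample) with ddof=1 is an assumption about which estimator to use, so the parameters may differ slightly from Matlab's.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats


def histfit_lognormal(sample, bins=40, ax=None):
    """Draw a count histogram with a fitted lognormal curve on top.

    Hypothetical helper, not a library function. Returns the axes and
    the fitted (mu, sigma) of log(sample).
    """
    sample = np.asarray(sample, dtype=float)
    if ax is None:
        _, ax = plt.subplots()

    # Histogram in count units
    counts, edges, _ = ax.hist(sample, bins=bins, label="data")

    # Two-parameter lognormal: normal fit on the logarithms (assumes sample > 0)
    log_sample = np.log(sample)
    mu = log_sample.mean()
    sigma = log_sample.std(ddof=1)
    fitted = stats.lognorm(s=sigma, scale=np.exp(mu))

    # Scale the density by N * bin_width so it is comparable to the counts
    x = np.linspace(edges[0], edges[-1], 300)
    bin_width = edges[1] - edges[0]
    ax.plot(x, fitted.pdf(x) * sample.size * bin_width, "r-", label="lognormal fit")

    ax.set_xlabel("Value")
    ax.set_ylabel("Count")
    ax.legend()
    return ax, (mu, sigma)
```

With the list loaded from data.json, calling histfit_lognormal(sample, bins=40) followed by plt.show() should give a figure of the same kind as the Matlab one, under the assumptions stated above.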

Answer 1: (score: 0)

Try seaborn:

print (grade(mark=int(input("Please enter the students mark: "))))


Answer 2: (score: 0)

I tried your data set with the OpenTURNS library.

x is the list given in the json file.

import openturns as ot
from openturns.viewer import View
import matplotlib.pyplot as plt

# first format your list x as a sample of dimension 1
sample = ot.Sample(x,1) 

# use the LogNormalFactory to build a Lognormal distribution according to your sample
distribution = ot.LogNormalFactory().build(sample)

# draw the pdf of the obtained distribution
graph = distribution.drawPDF()
graph.setLegends(["LogNormal"])
View(graph)
plt.show()


If you want the parameters of the distribution:

print(distribution)
>>> LogNormal(muLog = -16.5263, sigmaLog = 0.636928, gamma = 3.01106e-08)
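For readers who want to reuse these numbers with SciPy, the three OpenTURNS parameters can be mapped onto scipy.stats.lognorm. The sketch below assumes the usual shifted-lognormal convention, in which log(X - gamma) is normal with mean muLog and standard deviation sigmaLog, so that s = sigmaLog, loc = gamma and scale = exp(muLog). The variable names are illustrative.

```python
import numpy as np
from scipy import stats

# Values copied from the printed OpenTURNS result
mu_log, sigma_log, gamma = -16.5263, 0.636928, 3.01106e-08

# Assumed mapping: s = sigmaLog, loc = gamma, scale = exp(muLog)
dist = stats.lognorm(s=sigma_log, loc=gamma, scale=np.exp(mu_log))

# Summary values of the shifted lognormal
print("lower bound:", dist.support()[0])
print("median:     ", dist.median())
print("mean:       ", dist.mean())
```

Note that this is a three-parameter fit with a nonzero shift gamma, whereas Matlab's fitdist(sample, 'lognormal') fits only two parameters, so the two sets of numbers are not expected to match exactly.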

You can build a histogram in the same way by calling HistogramFactory, and then add one graph to the other:

graph2 = ot.HistogramFactory().build(sample).drawPDF()
graph2.setColors(['blue'])
graph2.setLegends(["Histogram"])
graph2.add(graph)
view = View(graph2)

And set the axis bounds (if you want to zoom):

axes = view.getAxes()
_ = axes[0].set_xlim(-0.6e-07, 2.8e-07)
plt.show()
