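---Example---
I have a dataset (sample) containing 1000 damage values in a one-dimensional array (see the attached .json file). The values are very small (< 1e-6). The sample appears to follow a lognormal distribution:
---Problem and what I have already tried---
I tried the suggestions from the posts "Fitting empirical distribution to theoretical ones with Scipy (Python)?" and "Scipy: lognormal fitting" to fit my data with a lognormal distribution. Neither worked. :(
I always get very large values on the Y-axis, as shown below:
Here is the Python code I used (the data.json file can be downloaded from here):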
from matplotlib import pyplot as plt
from scipy import stats as scistats
import json
with open("data.json", "r") as f:
    sample = json.load(f)  # load data: a 1000 x 1 array with many small values (< 1e-6)
fig, axis = plt.subplots() # initiate a figure
N, nbins, patches = axis.hist(sample, bins = 40) # plot sample by histogram
axis.ticklabel_format(style='sci', scilimits=(-3, 4), axis='x')  # use scientific notation on the X-axis
axis.set_xlabel("Value")
axis.set_ylabel("Count")
plt.show()
fig, axis = plt.subplots()
param = scistats.lognorm.fit(sample) # fit data by Lognormal distribution
pdf_fitted = scistats.lognorm.pdf(nbins, *param[:-2], loc=param[-2], scale=param[-1])  # evaluate the fitted density at the bin edges for plotting
axis.plot(nbins, pdf_fitted) # draw fitted distribution on the same figure
plt.show()
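I also tried another distribution, but whenever I plot the result, the Y-axis values are always far too large to be drawn together with the histogram. Where did I go wrong?
I also tried the suggestion from another question, "Use scipy lognormal distribution to fit data with small values, then show in matplotlib", but the values in the variable pdf_fitted are always too large.
---Expected result---
Basically, what I want is something like this:
Here is the Matlab code I used for the screenshot above: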
fname = 'data.json';
sample = jsondecode(fileread(fname));
% fitting distribution
pd = fitdist(sample, 'lognormal')
% A combined command for plotting histogram and distribution
figure();
histfit(sample,40,"lognormal")
So if you know of equivalents to the fitdist and histfit commands in Python / Scipy / Numpy / Matplotlib, please post them!
Thanks a lot!
Answer 0 (score: 3)
Try the distfit (or fitdist) library.
https://erdogant.github.io/distfit/
pip install distfit
import numpy as np
# Example data
X = np.random.normal(10, 3, 2000)
y = [3,4,5,6,10,11,12,18,20]
# From the distfit library import the class distfit
from distfit import distfit
# Initialize
dist = distfit()
# Search for the best theoretical fit on your empirical data
dist.fit_transform(X)
# Plot
dist.plot()
# Summary plot
dist.plot_summary()
So in your case it would be:
dist = distfit(distr='lognorm')
dist.fit_transform(X)
Answer 2 (score: 0)
I tried the OpenTURNS library on your dataset.
Here x is the list given in the json file.
import openturns as ot
from openturns.viewer import View
import matplotlib.pyplot as plt
# first format your list x as a sample of dimension 1
sample = ot.Sample(x,1)
# use the LogNormalFactory to build a Lognormal distribution according to your sample
distribution = ot.LogNormalFactory().build(sample)
# draw the pdf of the obtained distribution
graph = distribution.drawPDF()
graph.setLegends(["LogNormal"])
View(graph)
plt.show()
If you want the parameters of the distribution:
print(distribution)
>>> LogNormal(muLog = -16.5263, sigmaLog = 0.636928, gamma = 3.01106e-08)
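To carry these numbers back to SciPy, the sketch below maps them onto `scipy.stats.lognorm`, assuming OpenTURNS uses the usual three-parameter lognormal convention in which log(X - gamma) is normal with mean muLog and standard deviation sigmaLog. Under that assumption the SciPy shape is sigmaLog, `loc` is gamma and `scale` is exp(muLog). The variable names and the test point `x` are illustrative only.

```python
import numpy as np
from scipy import stats

# Values printed by OpenTURNS in the output above
mu_log, sigma_log, gamma = -16.5263, 0.636928, 3.01106e-08

# Assumed mapping: shape s = sigmaLog, loc = gamma, scale = exp(muLog)
dist = stats.lognorm(s=sigma_log, loc=gamma, scale=np.exp(mu_log))

# Cross-check against the three-parameter lognormal density written out by hand
x = 1.0e-07  # arbitrary test point inside the data range
z = (np.log(x - gamma) - mu_log) / sigma_log
pdf_manual = np.exp(-0.5 * z**2) / ((x - gamma) * sigma_log * np.sqrt(2 * np.pi))

print(dist.pdf(x), pdf_manual)   # the two densities should agree
print(dist.median())             # equals gamma + exp(muLog)
```

Note that the density is of order 1e6 to 1e7 here. That is expected for data of order 1e-7, because the density has to integrate to 1 over a very narrow range.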
You can build a histogram in the same way by calling HistogramFactory, and then add one graph to the other:
graph2 = ot.HistogramFactory().build(sample).drawPDF()
graph2.setColors(['blue'])
graph2.setLegends(["Histogram"])
graph2.add(graph)
view = View(graph2)  # keep the View object so its axes can be adjusted afterwards
and set the axis limits if you want to zoom in:
axes = view.getAxes()
_ = axes[0].set_xlim(-0.6e-07, 2.8e-07)
plt.show()
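For readers who would rather stay with SciPy and Matplotlib, as in the question, here is a minimal sketch of a histfit-style plot. It is not the method of any answer above. It assumes a two-parameter lognormal (location fixed at 0), for which the maximum-likelihood estimates are simply the mean and standard deviation of the log-data, and it uses synthetic data as a stand-in for data.json. The key step is multiplying the fitted density by N times the bin width, so the curve is on the same scale as a count histogram. The alternative is to draw the histogram with `density=True` and plot the density unscaled.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for an interactive window
import matplotlib.pyplot as plt
from scipy import stats

# Synthetic stand-in for data.json (assumption): 1000 lognormal values around 1e-7
rng = np.random.default_rng(0)
sample = rng.lognormal(mean=-16.5, sigma=0.6, size=1000)

# Two-parameter lognormal (loc = 0): the MLE is the mean and std of log(sample)
log_sample = np.log(sample)
mu, sigma = log_sample.mean(), log_sample.std()

fig, axis = plt.subplots()
counts, edges, _ = axis.hist(sample, bins=40)  # count histogram, equal-width bins
bin_width = edges[1] - edges[0]

# Fitted density on a fine grid; values are huge (order 1e7) because x is tiny
x = np.linspace(edges[0], edges[-1], 400)
pdf = stats.lognorm.pdf(x, sigma, loc=0, scale=np.exp(mu))

# Rescale the density to counts: expected count per bin is about N * bin_width * pdf
scaled_pdf = pdf * len(sample) * bin_width
axis.plot(x, scaled_pdf)
axis.set_xlabel("Value")
axis.set_ylabel("Count")
# Call plt.show() (interactive backend) or fig.savefig("fit.png") to view the result
```

With real data, replace the synthetic `sample` with the array loaded from the json file. If a shifted (three-parameter) lognormal is wanted instead, the estimates above no longer apply and a numerical fit is needed.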