Question

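---Example---

I have a dataset (a sample) that contains 1000 damage values in a 1-D array (see the attached .json file). The values are very small (< 1e-6). The sample seems to follow a lognormal distribution:

---Problem and what I have tried---

I tried the suggestions in the post "Fitting empirical distribution to theoretical ones with Scipy (Python)?" and in the post "Scipy: lognormal fitting" to fit my data with a lognormal distribution. None of them worked. :(

I always get something very large on the Y-axis, as shown below:

Here is the code I use in Python (the data.json file can be downloaded from here):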

from matplotlib import pyplot as plt
from scipy import stats as scistats
import json
with open("data.json", "r") as f:
    sample = json.load(f)  # load data: a 1000 x 1 array of small values (< 1e-6)
fig, axis = plt.subplots()  # create a figure
N, nbins, patches = axis.hist(sample, bins=40)  # plot the sample as a histogram (counts per bin)
axis.ticklabel_format(style='sci', scilimits=(-3, 4), axis='x')  # use scientific notation on the X-axis
axis.set_xlabel("Value")
axis.set_ylabel("Count")    
plt.show()

fig, axis = plt.subplots()
param = scistats.lognorm.fit(sample)  # fit the data with a lognormal distribution
pdf_fitted = scistats.lognorm.pdf(nbins, *param[:-2], loc=param[-2], scale=param[-1])  # evaluate the fitted density at the bin edges
axis.plot(nbins, pdf_fitted)  # draw the fitted distribution
plt.show()

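I tried other distributions as well, but whenever I plot the result, the Y-axis is always too large to be drawn together with the histogram. Where am I going wrong?

I also tried the suggestion in another question, "Use scipy lognormal distribution to fit data with small values, then show in matplotlib", but the values in the variable pdf_fitted are always too large.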
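
One likely explanation for the scale mismatch, offered as a sketch rather than a diagnosis of the exact data: axis.hist plots counts per bin, while lognorm.pdf returns a probability density. A density integrates to 1 over x, so with x values around 1e-7 its height has to be on the order of 1e7. Multiplying the density by the sample size and the bin width puts it on the count scale (alternatively, hist(..., density=True) moves the histogram to the density scale). The synthetic sample, the seed, and the variable names below are assumptions standing in for data.json; the parameters of the synthetic sample are loosely based on the fit reported in Answer 3.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for data.json (assumption): 1000 lognormal values around 1e-7
rng = np.random.default_rng(0)
sample = rng.lognormal(mean=-16.5, sigma=0.64, size=1000)

# Same binning as the question: 40 equal-width bins, counts per bin
counts, edges = np.histogram(sample, bins=40)
bin_width = edges[1] - edges[0]

# Maximum-likelihood fit of a two-parameter lognormal (assumption: location = 0),
# obtained by fitting a normal distribution to log(sample)
mu, sigma = stats.norm.fit(np.log(sample))
fitted = stats.lognorm(s=sigma, loc=0, scale=np.exp(mu))

# Evaluate the density on a fine grid spanning the histogram range
x = np.linspace(edges[0], edges[-1], 400)
pdf = fitted.pdf(x)

# Rescale the density so that it is comparable with counts per bin
pdf_as_counts = pdf * sample.size * bin_width

print(f"largest bin count      : {counts.max()}")
print(f"largest density value  : {pdf.max():.3g}")
print(f"largest rescaled value : {pdf_as_counts.max():.3g}")
```

The raw density peaks many orders of magnitude above the bin counts, whereas the rescaled curve peaks at roughly the same height as the tallest bar.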

---Expected result---

Basically, what I want is something like this:

This is the Matlab code I used for the screenshot above:

fname = 'data.json';
sample = jsondecode(fileread(fname));

% fitting distribution
pd = fitdist(sample, 'lognormal')

% A combined command for plotting histogram and distribution
figure();
histfit(sample,40,"lognormal")

So, if you know of equivalents of fitdist and histfit in Python / Scipy / Numpy / Matplotlib, please post them!

Thanks a lot!
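
A rough SciPy/Matplotlib analogue of fitdist and histfit can be sketched as follows. This is an illustration under stated assumptions, not a port of the MATLAB functions: the helper names fit_lognormal, histfit_data and histfit are hypothetical, the location parameter is pinned at 0 with floc=0 (a two-parameter lognormal), and the estimates should be close to, but not necessarily identical with, what MATLAB reports. The synthetic sample stands in for data.json.

```python
import numpy as np
from scipy import stats


def fit_lognormal(sample):
    """Return (mu, sigma) of log(sample) for a two-parameter lognormal fit."""
    # floc=0 fixes the location so that only shape and scale are estimated
    shape, _, scale = stats.lognorm.fit(sample, floc=0)
    return np.log(scale), shape


def histfit_data(sample, bins=40, num=400):
    """Compute histogram counts and a fitted curve rescaled to the count axis."""
    sample = np.asarray(sample, dtype=float)
    counts, edges = np.histogram(sample, bins=bins)
    mu, sigma = fit_lognormal(sample)
    x = np.linspace(edges[0], edges[-1], num)
    pdf = stats.lognorm.pdf(x, sigma, loc=0, scale=np.exp(mu))
    # density -> expected count per bin: multiply by N and by the bin width
    curve = pdf * sample.size * (edges[1] - edges[0])
    return counts, edges, x, curve


def histfit(sample, bins=40):
    """Draw the histogram with the fitted curve on the same axes."""
    import matplotlib.pyplot as plt  # imported here so the numeric part runs without it

    counts, edges, x, curve = histfit_data(sample, bins)
    fig, ax = plt.subplots()
    ax.bar(edges[:-1], counts, width=np.diff(edges), align="edge", edgecolor="white")
    ax.plot(x, curve, "r-", linewidth=2)
    ax.set_xlabel("Value")
    ax.set_ylabel("Count")
    return fig, ax


# Synthetic stand-in for data.json (assumption)
rng = np.random.default_rng(0)
sample = rng.lognormal(mean=-16.5, sigma=0.64, size=1000)
mu, sigma = fit_lognormal(sample)
print(f"mu = {mu:.4f}, sigma = {sigma:.4f}")
```

Calling histfit(sample, 40) followed by plt.show() would then display the figure. Pinning the location at 0 is a design choice: letting lognorm.fit estimate loc as well gives a three-parameter fit, which is a different model from the one MATLAB's 'lognormal' option uses.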

Answer 1

Try the distfit (or fitdist) library.

https://erdogant.github.io/distfit/

pip install distfit

import numpy as np

# Example data
X = np.random.normal(10, 3, 2000)
y = [3,4,5,6,10,11,12,18,20]

# From the distfit library import the class distfit
from distfit import distfit

# Initialize
dist = distfit()

# Search for the best theoretical fit on your empirical data
dist.fit_transform(X)

# Plot
dist.plot()

# Summary plot
dist.plot_summary()

So in your case it would be:

dist = distfit(distr='lognorm')
dist.fit_transform(X)

Answer 2

Try seaborn.

Answer 3

I tried the OpenTURNS library on your dataset.

x is the list given in the json file.

import openturns as ot
from openturns.viewer import View
import matplotlib.pyplot as plt

# first format your list x as a sample of dimension 1
sample = ot.Sample(x,1) 

# use the LogNormalFactory to build a Lognormal distribution according to your sample
distribution = ot.LogNormalFactory().build(sample)

# draw the pdf of the obtained distribution
graph = distribution.drawPDF()
graph.setLegends(["LogNormal"])
View(graph)
plt.show()

If you want the distribution parameters:

print(distribution)
>>> LogNormal(muLog = -16.5263, sigmaLog = 0.636928, gamma = 3.01106e-08)
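
For readers who want to reuse these numbers with SciPy, the parameters can be mapped onto scipy.stats.lognorm. The sketch below assumes that LogNormal(muLog, sigmaLog, gamma) means log(X - gamma) is normal with mean muLog and standard deviation sigmaLog; under that assumption s = sigmaLog, loc = gamma and scale = exp(muLog). The evaluation point x is an arbitrary value chosen for illustration.

```python
import numpy as np
from scipy import stats

# Parameters printed by OpenTURNS above
mu_log, sigma_log, gamma = -16.5263, 0.636928, 3.01106e-08

# Assumed mapping: shape s = sigmaLog, loc = gamma, scale = exp(muLog)
dist = stats.lognorm(s=sigma_log, loc=gamma, scale=np.exp(mu_log))

# Cross-check against the shifted lognormal density written out by hand
x = 1.0e-07  # arbitrary evaluation point (assumption)
z = (np.log(x - gamma) - mu_log) / sigma_log
pdf_manual = np.exp(-0.5 * z**2) / ((x - gamma) * sigma_log * np.sqrt(2.0 * np.pi))

print(f"scipy pdf  : {dist.pdf(x):.6g}")
print(f"manual pdf : {pdf_manual:.6g}")
print(f"median     : {dist.median():.6g}")  # equals gamma + exp(mu_log)
```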

You can build a histogram in the same way by calling HistogramFactory, and then add one graph to the other:

graph2 = ot.HistogramFactory().build(sample).drawPDF()
graph2.setColors(['blue'])
graph2.setLegends(["Histogram"])
graph2.add(graph)
view = View(graph2)  # keep the viewer object so its axes can be adjusted afterwards

And set the axis limits if you want to zoom:

axes = view.getAxes()
_ = axes[0].set_xlim(-0.6e-07, 2.8e-07)
plt.show()

What is the equivalent of fitdist and histfit in Python?
