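I have read several threads here, but I am still confused.
I assumed that the values returned by a scipy.stats continuous random variable's stats.rv_name.pdf(x, loc, scale, *params) should sum to 1.
I fitted my scatter-plot data with the code below. The cumulative values do reach 1.0 at the end, but the values in pdf_fitted do not add up to 1.
I still don't understand why that happens, or how to choose the arguments so that the pdf output adds up to one.
There is a related thread here: Why does scipy.norm.pdf sometimes give PDF > 1? How to correct it?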
    import inspect

    import numpy as np
    import pandas as pd
    from scipy import stats

    def py_DistEstimate(arr1, disType, reSults='params', bins=20):
        dist_names = ['gamma', 'beta', 'rayleigh', 'norm', 'pareto']  # examples of disType values
        dist = getattr(stats, disType)
        # fit() returns (shape parameters..., loc, scale)
        param = dist.fit(arr1)
        x = np.linspace(min(arr1), max(arr1), bins)
        pdf_fitted = dist.pdf(x, *param[:-2], loc=param[-2], scale=param[-1])
        cdf_fitted = dist.cdf(x, *param[:-2], loc=param[-2], scale=param[-1])
        if reSults == 'pdf':
            # Empirical fraction of observations in each interval (x[i-1], x[i]]
            digitizeV = np.digitize(arr1, x, right=True)
            bin_counV = np.bincount(digitizeV, minlength=bins)
            bin_probV = bin_counV / len(arr1)
            return pd.DataFrame({'x-axis': x, 'pdf': pdf_fitted, 'original': bin_probV, 'cdf': cdf_fitted})
        elif reSults == 'params':
            # Shape parameter names come from the distribution's _pdf signature
            parameter_names = [p for p in inspect.signature(dist._pdf).parameters if p != 'x'] + ['loc', 'scale']
            return pd.DataFrame({'names': parameter_names, 'values': param})
        elif reSults == 'listparams':
            # Names of all continuous distributions available in scipy.stats
            dist_continu = [d for d in dir(stats) if isinstance(getattr(stats, d), stats.rv_continuous)]
            return dist_continu
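To make the symptom concrete, here is a minimal sketch that compares three quantities on the same grid. The sample data, the choice of `norm`, and all variable names are assumptions for illustration only. It relies on the fact that a PDF is a density (probability per unit of x), so it integrates to 1 rather than summing to 1. Multiplying the density by the bin width, or differencing the CDF at the bin edges, gives per-bin probabilities that can be compared with observed bin fractions.

```python
import numpy as np
from scipy import stats

# Hypothetical sample data; any 1-D array of observations would do.
rng = np.random.default_rng(0)
sample = rng.normal(loc=5.0, scale=0.2, size=1000)

dist = stats.norm
params = dist.fit(sample)  # (shape parameters..., loc, scale); norm has no shapes
shape, loc, scale = params[:-2], params[-2], params[-1]

# 20 equal-width bins spanning the data.
edges = np.linspace(sample.min(), sample.max(), 21)
centers = 0.5 * (edges[:-1] + edges[1:])
width = np.diff(edges)

# Density evaluated at the bin centers.
pdf_vals = dist.pdf(centers, *shape, loc=loc, scale=scale)

# Raw density values are per unit of x, so their sum depends on the grid spacing.
raw_sum = pdf_vals.sum()

# Density times bin width approximates the probability mass in each bin.
approx_mass = pdf_vals * width

# CDF differences give the fitted probability of each bin directly.
cdf_vals = dist.cdf(edges, *shape, loc=loc, scale=scale)
bin_prob = np.diff(cdf_vals)

print(raw_sum, approx_mass.sum(), bin_prob.sum())
```

In this sketch the raw sum is far from 1, while the two per-bin versions sum to slightly less than 1 because the grid only covers the observed range and not the tails of the fitted distribution.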