Question

我想使用CDF匹配来纠正降水的原始模型预测（但应用程序相当通用）。

假设下面的CDF B是观察到的CDF（我信任的CDF），我想计算CDF A和B之间的差异，以便在给定的一天我可以采取降水预测并将其偏移A之间的差异和B使它更能代表B而不是A.

因此，对于每个x值，我需要得到A的y值，然后其中B是我需要获得x值的相同值，给出2个x值来计算差值。

当然，这只给出了我知道校正的离散x值，所以我想我还需要做额外的工作来校正介于2个其他值之间的x值。

以下是我用于生成示例的Python代码：

import numpy.random
import numpy as np
from scipy.interpolate import interp1d
import matplotlib.pyplot as plt

quantiles = [0, 1, 2, 3, 4, 5, 7, 10, 15, 20, 30, 40, 50, 60, 75, 100]

# Generate fake precip data
sample_size = 100000
A = numpy.random.gamma(0.7, scale=50, size=sample_size)
B = numpy.random.gamma(0.5, scale=70, size=sample_size)
ens = (40 - 20) * np.random.random_sample((21)) + 20

# Calculate histograms
A_pdf, edges = np.histogram(A, bins=quantiles)
A_pdf = A_pdf / sample_size
A_cdf = np.cumsum(A_pdf)
B_pdf, edges = np.histogram(B, bins=quantiles)
B_pdf = B_pdf / sample_size
B_cdf = np.cumsum(B_pdf)

# Plot CDFs
plt.figure()
plt.plot(quantiles[1:], A_cdf, 'x-', c='r', lw=3, ms=10, mew=2, label='A')
plt.plot(quantiles[1:], B_cdf, '+-', c='k', lw=3, ms=15, mew=2, label='B')
plt.xticks(quantiles[1:])
plt.legend(loc='upper left')

谢谢大家！

Answer 1

您需要的是一个近似于A的CDF的函数，以及一个近似于B的逆CDF（或PPF）的函数。然后您将简单地计算 q _correction = PPF <子>乙（CDF <子> A （q））

对于您的示例数据，我们只需使用适当参数的.cdf frozen distributions .ppf和scipy.stats.gamma方法：

from scipy import stats

distA = stats.gamma(0.7, scale=50)
distB = stats.gamma(0.5, scale=70)

corrected_quantiles = distB.ppf(distA.cdf(quantiles[1:]))

当然，对于真实数据，您不太可能知道真实基础分布的参数。如果您对其功能形式有很好的了解，可以尝试对数据进行最大似然拟合，以便估算它们：

distA = stats.gamma(*stats.gamma.fit(A))
distB = stats.gamma(*stats.gamma.fit(B))

如果做不到这一点，您可以尝试从经验CDF进行插值/推断，例如使用scipy.interpolate.InterpolatedUnivariateSpline：

from scipy.interpolate import InterpolatedUnivariateSpline

# cubic spline interpolation
itp_A_cdf = InterpolatedUnivariateSpline(quantiles[1:], A_cdf, k=3)
# the PPF is the inverse of the CDF, so we simply reverse the order of the
# x & y arguments to InterpolatedUnivariateSpline
itp_B_ppf = InterpolatedUnivariateSpline(B_cdf, quantiles[1:], k=3)

itp_corrected_quantiles = itp_B_ppf(itp_A_cdf(quantiles[1:]))

fig, ax = plt.subplots(1, 1)
ax.hold(True)
ax.plot(quantiles[1:], A_cdf, '-r', lw=3, label='A')
ax.plot(quantiles[1:], B_cdf, '-k', lw=3, label='B')
ax.plot(corrected_quantiles, A_cdf, '--xr', lw=3, ms=10, mew=2, label='exact')
ax.plot(itp_corrected_quantiles, A_cdf, '--+b', lw=3, ms=10, mew=2,
        label='interpolated')
ax.legend(loc=5)

如何操作样本的CDF以使其与不同样本的CDF相匹配？

1 个答案: