Question

I have numerically estimated a pdf on a set of grid points, and I now want to determine the CDF from it.

Is there a function that numerically integrates a pdf? Is cumsum() sufficient? When I use cumsum(), the pdf sums to 1.05 instead of 1.
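One reason cumsum() alone may not be enough: it adds up density values without the grid spacing, whereas a CDF is an integral. A minimal sketch in base R, assuming a sorted grid `x` and density values `f` on it (both names are hypothetical, and a standard normal stands in for the estimated pdf), applies a cumulative trapezoidal rule:

```r
# Hypothetical stand-in for an estimated pdf: standard normal on a grid.
x <- seq(-6, 6, length.out = 1201)   # sorted evaluation grid (assumed)
f <- dnorm(x)                        # density values on that grid

# Cumulative trapezoidal rule: area of each panel, then a running total.
dx    <- diff(x)
panel <- dx * (head(f, -1) + tail(f, -1)) / 2
cdf   <- c(0, cumsum(panel))

# Optional: rescale so the CDF ends at exactly 1 on this grid.
cdf_norm <- cdf / cdf[length(cdf)]
```

On an evenly spaced grid, `cumsum(f) * dx[1]` gives a similar result; the trapezoidal version also handles uneven spacing.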

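I want to compare two densities on a fine grid using the K-L divergence to see whether they are the same. I observe a vector Y that depends on a matrix X (also observed), and I am interested in g(y), the unconditional distribution of Y. To get g(y), I first find the joint density g(X, Y) and then divide it by the conditional density g(X | Y). Both densities are estimated with the empirical methods in the np package in R (npudens, npudensbw, npcdens, npcdensbw). Thus g(y) = g(X, Y) / g(X | Y). However, when I sum the entries of g(y), I get a number greater than 2. Because I want g(y) evaluated on a fine grid, I run a kernel regression of g(y) (using npreg with exdat = grid) and save the fitted values from this regression. When I sum them, I get a number of about 1.05.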
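For the K-L comparison mentioned above, a discretized version of the integral of p * log(p / q) can be sketched in base R as follows. The names `trapz` and `kl_divergence`, the `eps` floor, and the two normal densities are all hypothetical choices for illustration; they are not part of the np package.

```r
# Hypothetical densities on a common fine grid (stand-ins for the estimates).
grid <- seq(-8, 8, length.out = 1601)
p <- dnorm(grid, mean = 0,   sd = 1)
q <- dnorm(grid, mean = 0.5, sd = 1)

# Trapezoidal integral of values y over a sorted grid x.
trapz <- function(x, y) sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)

# Discretized K-L divergence of q from p: integral of p * log(p / q).
kl_divergence <- function(x, p, q, eps = 1e-12) {
  p <- p / trapz(x, p)   # renormalize so each density integrates to 1
  q <- q / trapz(x, q)
  # Treat p * log(p / q) as 0 where p is negligible; floor q to avoid log(Inf).
  integrand <- ifelse(p > eps, p * log(p / pmax(q, eps)), 0)
  trapz(x, integrand)
}

kl_pq <- kl_divergence(grid, p, q)
```

For two normals with equal variance the closed form is (difference in means)^2 / (2 * variance), here 0.125, which gives a sanity check on the numerical value. Identical inputs should give a divergence of essentially zero.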

Is this approach to finding the unconditional pdf correct? Why does the pdf not sum to 1?
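A point that may bear on the second question: the values of a pdf on a grid generally do not sum to 1. It is the sum weighted by the grid spacing that approximates the integral. A small base-R illustration, assuming an evenly spaced grid and using a standard normal as a hypothetical stand-in for the estimate:

```r
# Evenly spaced grid and a known density standing in for the estimate.
grid <- seq(-5, 5, by = 0.05)
dens <- dnorm(grid)
h    <- grid[2] - grid[1]          # grid spacing

raw_sum <- sum(dens)               # grows as the grid gets finer
riemann <- sum(dens) * h           # approximates the integral of the pdf

# Rescale so that the discretized density integrates to 1 on this grid.
dens_renorm <- dens / riemann
```

Here `raw_sum` is roughly 1 / h rather than 1, while `riemann` is close to 1. Under this reading, a weighted sum of about 1.05 would point to estimation or discretization error rather than a missing spacing factor, and dividing by it is one way to renormalize.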

Approach:

# Compute the joint pdf of Y and X

Z <- data.frame(Y, X)
jbw <- npudensbw(dat = Z, bwmethod = 'normal-reference', 
                 xtrim = trim, ytrim = trim)   
jpdf <- npudens(bws = jbw)

# Determine the conditional pdf of the covariates X with respect to the vector Y 

bw <- npcdensbw(xdat = Y, ydat = X, 
               bwmethod = 'normal-reference', xtrim = trim, ytrim = trim)   
cpdf <- npcdens(bws = bw, xdat = Y, ydat = X)

# Determine the unconditional pdf of Y

ft = jpdf$dens / cpdf$condens
print(sum(ft))

# Nonparametric regression of ft on Y in order
# to get the extended pdf evaluated at the points in grid

sigma = min(sd(Y, na.rm = TRUE), 
        mad(Y, center = median(Y), constant = 1.4826, 
        na.rm = TRUE, low = FALSE, high = FALSE) / 1.4826, 
        IQR(Y, na.rm = TRUE, type = 7)/1.349)
computed_bw = 1.06 * sigma * length(Y)^(-1.0/(2.0*2+1))

extended_pdf <- fitted(npreg(bws = computed_bw, tydat = ft, txdat = Y, exdat = grid))

From a numerical pdf to a numerical CDF

0 Answers: