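I have two sets of data.
I have plotted their two probability density functions. Now I want the area between the two density curves over a given range of x.
I have tried integrating the area, the trapezoidal rule, and so on, following these questions: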
Calculating the area between a curve and a straight line without finding the function
Error calculating the area between two lines using "integrate"
How to measure area between 2 distribution curves in R / ggplot2
None of these worked for me.
Here is a link to the data I am working with.
dens.pre=density(TX/10)
dens.post=density(TX30/10)
plot(dens.pre,col="green")
lines(dens.post,col="red")
locator()
#$x
#[1] 18.36246
#$y
#[1] 0.05632428
abline(v=18.3,col="red")
I want to find the area between the two curves for x > 18.3.
Answer 0 (score: 0)
Using the trapezoidal rule, you could probably compute it like this:
d0 <- dens.pre
d1 <- dens.post
f0 <- approxfun(d0$x, d0$y)
f1 <- approxfun(d1$x, d1$y)
# define the x range of the density overlap, starting at x = 18.3
ovrng <- c(18.3, min(max(d0$x), max(d1$x)))
# divide it into sections (for example n=500)
i <- seq(min(ovrng), max(ovrng), length.out=500)
# calculate the signed distance between the density curves
h1 <- f0(i)-f1(i)
h2 <- f1(i)-f0(i)
# use the formula for the area of a trapezoid and add up the areas,
# keeping only the sections whose right endpoint has a non-negative height
area1 <- sum( (h1[-1]+h1[-length(h1)]) /2 *diff(i) *(h1[-1]>=0)) # for the regions where d0>d1
area2 <- sum( (h2[-1]+h2[-length(h2)]) /2 *diff(i) *(h2[-1]>=0)) # for the regions where d1>d0
area_total <- area1 + area2
area_total
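As a rough cross-check of the trapezoid sum, the same quantity can be computed with base R's integrate() applied to the absolute difference of the two interpolated densities. This is only a sketch: the vectors pre and post below are simulated stand-ins for TX/10 and TX30/10, which are not shown here, and it takes abs() of the difference rather than masking each section by sign, so the results can differ very slightly near crossing points.

```r
# Hypothetical stand-ins for TX/10 and TX30/10 (the real data are not available here)
set.seed(1)
pre  <- rnorm(1000, mean = 15, sd = 3)
post <- rnorm(1000, mean = 17, sd = 3)

d0 <- density(pre)
d1 <- density(post)
f0 <- approxfun(d0$x, d0$y)   # linear interpolation of each density
f1 <- approxfun(d1$x, d1$y)

lower <- 18.3
upper <- min(max(d0$x), max(d1$x))

# Trapezoid rule on |f0 - f1| over an evenly spaced grid
x <- seq(lower, upper, length.out = 500)
h <- abs(f0(x) - f1(x))
area_trap <- sum((h[-1] + h[-length(h)]) / 2 * diff(x))

# Same quantity via adaptive quadrature
area_int <- integrate(function(x) abs(f0(x) - f1(x)),
                      lower, upper,
                      subdivisions = 1000L,
                      stop.on.error = FALSE)$value

area_trap
area_int
```

If the two numbers agree to a few decimal places, the grid of 500 sections is fine enough for this range.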
However, since you are only interested in the region where one curve stays below the other over the whole range, this can be shortened:
d0 <- dens.pre
d1 <- dens.post
f0 <- approxfun(d0$x, d0$y)
f1 <- approxfun(d1$x, d1$y)
# define the x range of the density overlap, starting at x = 18.3
ovrng <- c(18.3, min(max(d0$x), max(d1$x)))
# divide it into sections (for example n=500)
i <- seq(min(ovrng), max(ovrng), length.out=500)
# calculate the signed distance between the density curves
h1 <- f1(i)-f0(i)
# use the formula for the area of a trapezoid and add up the areas where d1>d0
area <- sum( (h1[-1]+h1[-length(h1)]) /2 *diff(i) *(h1[-1]>=0))
area
# We can plot the region using
plot(d0, main="d0=black, d1=green")
lines(d1, col="green")
# draw a vertical segment at every 5th grid point where d1 > d0
jj <- which(h1>0 & seq_along(h1) %% 5==0); j <- i[jj]
segments(j, f1(j), j, f1(j)-h1[jj])
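An alternative to drawing individual segments is to shade the region with polygon(). The sketch below again uses simulated data as a placeholder for TX/10 and TX30/10 (an assumption, since the real data are not shown), and it shades everything between the two curves for x > 18.3, whichever curve happens to be on top.

```r
# Hypothetical stand-ins for TX/10 and TX30/10
set.seed(1)
pre  <- rnorm(1000, mean = 15, sd = 3)
post <- rnorm(1000, mean = 17, sd = 3)

d0 <- density(pre)
d1 <- density(post)
f0 <- approxfun(d0$x, d0$y)
f1 <- approxfun(d1$x, d1$y)

# Grid over the range of interest
x  <- seq(18.3, min(max(d0$x), max(d1$x)), length.out = 500)
lo <- pmin(f0(x), f1(x))   # lower envelope of the two curves
hi <- pmax(f0(x), f1(x))   # upper envelope of the two curves

plot(d0, main = "d0 = black, d1 = green")
lines(d1, col = "green")
# Trace the upper envelope left to right, then the lower envelope back
polygon(c(x, rev(x)), c(hi, rev(lo)),
        col = adjustcolor("green", alpha.f = 0.3), border = NA)
abline(v = 18.3, lty = 2)
```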