I plotted the cumulative distribution of speed using ecdf, but I would also like the cumulative probabilities as output, like this:
Speed Cumulative Probability
40 0.20
45 0.45
55 0.51
60 0.70
70 0.90
80 1.00
For my data, when I use ecdf, I get the following (note that 'cc' is my original data frame):
> ccf <- subset(cc, cc$svel>=55 & cc$Headway>=4)
> cdf<- ecdf(ccf$svel)
> cdf
Empirical CDF
Call: ecdf(ccf$svel)
x[1:356] = 55, 55.01, 55.02, ..., 76.76, 76.8
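How can I get a table like the one in the example above? Note that I tried 'cumsum', but it only gives cumulative frequencies, whereas I need cumulative probabilities.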
Here is my data:
dput(CCF $ svel) c(67.9,67.62,67.37,67.19,67.04,66.93,66.83,66.74,66.65, 66.55,66.46,66.36,66.25,66.12,65.97,61.12,61.2,61.29, 61.39,61.49,61.58,61.66,61.73,61.79,57.98,57.73,57.5, 57.29,57.1,56.92,56.75,56.59,56.45,56.32,56.19,58,58.18, 58.36,58.52,58.69,56.28,56.19,56.08,55.96,55.83,55.68, 55.52,55.34,55.15,58.58,58.89,59.17,59.4,59.58,55.01, 55.14,55.23,55.3,55.36,55.41,55.47,55.53,55.59,55.66, 55.74,55.83,55.92,56.03,56.16,56.3,56.44,56.58,56.71, 56.82,56.91,56.98,57.03,57.06,57.07,57.07,57.06,57.04, 57.02,55.05,55.22,55.39,55.56,55.73,55.92,56.11,56.31, 56.53,56.77,57.02,57.28,57.54,57.79,58,58.18,58.32,58.43, 58.5,58.56,58.6,58.64,58.68,58.73,58.8,58.86,58.92,58.97, 59.01,59.03,59.05,59.05,59.04,59.02,58.99,58.97,58.95, 55.1,55.39,55.68,55.97,56.24,56.48,56.68,56.82,56.9, 56.94,56.96,56.97,56.99,57.02,57.07,57.14,57.22,57.3, 57.37,57.41,57.45,57.48,57.51,57.56,57.62,57.69,57.77, 57.86,57.95,58.06,58.17,58.29,58.42,58.53,58.64,58.74, 58.83,58.91,58.98,55.01,55.08,55.15,55.22,55.3,55.37, 55.45,55.53,55.62,55.73,55.85,55.99,56.14,56.31,56.49, 56.67,56.87,57.05,57.22,57.37,57.51,57.65,57.79,57.95, 58.13,58.3,58.47,58.63,58.78,58.91,59.03,59.14,59.24, 59.34,59.43,59.53,59.62,59.72,59.81,59.9,59.98,60.07, 60.15,60.22,60.31,60.39,60.47,60.56,60.65,60.75,60.86, 60.98,61.11,61.24,61.39,61.54,61.71,61.89,62.09,62.31, 62.56,62.84,63.14,63.46,63.78,64.08,64.81,64.84,64.85, 64.87,64.89,64.92,64.94,64.97,65,65.02,65.04,65.07,65.11, 65.15,65.17,65.18,65.17,65.15,65.13,65.1,65.06,65.01, 64.96,64.9,64.84,64.79,64.76,55.04,55.15,55.25,55,55.23, 55.45,55.68,55.9,56.69,56.74,55,55,55,55,55,55.01, 55.26,55.51,55.77,56.02,56.28,56.56,56.84,57.13,57.42, 57.7,57.98,58.25,58.49,58.73,58.94,59.13,59.29,59.4, 59.48,59.5,59.48,59.42,59.31,59.17,59,58.8,58.6,58.38, 58.17,57.96,57.77,57.59,57.44,57.31,57.21,57.13,57.07, 57.04,57.03,57.04,57.07,57.11,57.18,57.26,57.34,57.43, 57.51,57.59,57.68,57.78,57.88,57.99,58.08,58.16,58.22, 58.27,58.3,58.31,58.31,58.3,58.27,58.25,58.22,58.18, 
58.14,58.08,58.01,57.93,57.84,57.72,57.59,57.43,57.27, 57.1,56.93,56.77,56.63,56.5,56.38,56.28,56.19,56.12, 56.05,55.99,55.94,55.9,55.88,55.86,55.85,55.86,55.87, 55.89,55.9,55.91,55.91,55.88,55.84,55.78,55.71,55.63, 55.56,55.5,55.45,55.4,55.37,55.34,55.32,55.3,55.29,55.27, 55.26,55.26,55.25,55.25,55.26,55.26,55.27,55.28,55.29, 55.31,55.33,55.36,55.39,55.02,55.07,55.12,55.16,55.21, 55.26,55.31,55.04,55.21,55.38,55.54,55.71,55.88,56.05, 56.21,56.38,56.54,56.71,56.88,57.04,57.2,57.35,55.46, 55.59,55.74,55.92,56.11,56.32,56.54,56.77,57.02,57.28, 55.22,55.28,55.35,55.42,55.5,55.58,55.68,55.78,55.88, 56,55.15,55.45,55.72,55.94,56.11,56.22,56.29,56.33,56.36, 56.4,56.45,56.51,56.59,56.69,56.81,56.95,57.11,57.27, 57.44,57.61,57.78,57.95,58.12,58.29,58.46,58.63,58.79, 58.94,59.08,59.21,59.32,59.41,55.13,55.3,55.47,55.65, 55.83,56.02,56.22,56.43,56.66,56.9,55.17,56.02,56.11, 56.21,56.32,56.42,56.52,57.18,57.29,57.42,76.27,76.28, 76.3,76.33,76.37,76.41,76.47,76.54,76.62,76.7,76.76, 76.8,76.8,55.08,55.16,55.24,55.32,55.4,55.48,55.12,55.39, 55.67,55.94,56.21,56.47,56.72,56.97,57.19,57.4,57.58, 57.73,57.87,57.99,58.11)
Answer 0 (score: 1)
Here is a function that does this:
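cumprob <- function(y) {
  # Proportion of observations strictly below x (ecdf() uses <= instead)
  fun <- function(y, x) length(y[y < x]) / length(y)
  # Evaluate that proportion at every observation
  prob <- sapply(y, fun, y = y)
  # One row per distinct value, paired with its distinct proportion
  data.frame(value = unique(y[order(y)]), prob = unique(prob[order(prob)]))
}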
Tested on your data (which I call data here):
cp<-cumprob(data)
head(cp)
value prob
1 55.00 0.00000000
2 55.01 0.01156069
3 55.02 0.01734104
4 55.04 0.01926782
5 55.05 0.02312139
6 55.07 0.02504817
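The smallest value, 55.00, gets probability 0 in the output above because the helper inside cumprob compares with a strict `<`, whereas `ecdf` counts observations at or below each value. The sketch below shows the two conventions side by side. The vector `toy` is made up purely for illustration and is not part of the question's data.

```r
# Toy vector, invented for illustration (not the asker's data)
toy <- c(1, 1, 2, 3)
vals <- sort(unique(toy))

# Proportion strictly below each distinct value (the strict `<` convention)
strict <- sapply(vals, function(x) mean(toy < x))

# Proportion at or below each distinct value (the ecdf convention)
at_or_below <- ecdf(toy)(vals)

comparison <- data.frame(value = vals, strict = strict, ecdf = at_or_below)
print(comparison)
```

If the table should end at 1.00 as in the question's example, the `<=` convention is probably the one wanted.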
To plot it:
plot(cp)
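Since `ecdf()` returns a function, a table can probably also be built from the `ecdf` object itself: `knots()` gives the distinct observed values, and calling the object evaluates the cumulative probability at any speed. This is only a sketch. Here `x` is a small stand-in for ccf$svel (a few values copied from the question's data), and the `speeds` grid is an arbitrary choice for illustration.

```r
# Stand-in for ccf$svel: a handful of values copied from the question's data
x <- c(55, 55, 55.01, 56.28, 57.98, 61.12, 64.81, 67.9, 76.27, 76.8)

cdf <- ecdf(x)  # ecdf() returns a step function

# Table over every distinct observed speed
full <- data.frame(Speed = knots(cdf), CumProb = cdf(knots(cdf)))

# Table over a chosen grid of speeds (the grid is an arbitrary illustration)
speeds <- seq(55, 80, by = 5)
grid <- data.frame(Speed = speeds, CumProb = cdf(speeds))

# Same numbers via cumsum: cumulative frequency divided by the sample size
via_cumsum <- cumsum(table(x)) / length(x)

print(grid)
```

The last step shows how the `cumsum` attempt from the question turns into probabilities: divide the cumulative frequencies by the number of observations.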
Another very convenient shortcut I found is to use the hist function to automatically cut the data and get the midpoints.
Using your data as data:
h <- hist(data)
cum.prob <- data.frame(value=h$mids, prob=cumsum(h$counts)/sum(h$counts))
This gives you:
cum.prob
value prob
1 55 0.2793834
2 57 0.6319846
3 59 0.8285164
4 61 0.8786127
5 63 0.8921002
6 65 0.9479769
7 67 0.9749518
8 69 0.9749518
9 71 0.9749518
10 73 0.9749518
11 75 0.9749518
12 77 1.0000000
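One caveat with the hist shortcut: with the default right-closed bins, each cumulative proportion is reached at the bin's upper edge, not at its midpoint, so labelling rows with `h$breaks[-1]` may be closer to what a cumulative table means. The sketch below also passes `plot = FALSE` to skip drawing. As before, `x` is a small stand-in for the real data (values copied from the question), and the column name `upper` is my own choice.

```r
# Stand-in for the speed data: a few values copied from the question
x <- c(55, 55, 55.01, 56.28, 57.98, 61.12, 64.81, 67.9, 76.27, 76.8)

h <- hist(x, plot = FALSE)  # compute the bins only, without drawing

cum.prob <- data.frame(
  upper = h$breaks[-1],                     # right edge of each bin
  prob  = cumsum(h$counts) / sum(h$counts)  # cumulative proportion up to that edge
)

# With the default right-closed bins this should agree with the ECDF at the edges
agrees <- isTRUE(all.equal(cum.prob$prob, ecdf(x)(cum.prob$upper)))

print(cum.prob)
```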