I have a vector like this:
Vec <- data.frame( Vec = c("70.0600", "8.5100", "5.8600", "399.9800", "9.0600", "78.8200", "71.4600") )
I want to split the values above into a "top 20%" group and a "bottom 80%" group, so that the result looks something like this:
Vec Dec
70.0600 Top_20
. .
. .
5.8600 Bottom_80
I am trying something like this:
Vec$Quartile <- quantile(Vec$Vec, probs = c(0.20, 0.80))
But I end up with exactly a 50-50 split of the data values:
sum( Vec$Quartile>20 )
I'm not sure where I went wrong.
Answer 0: (score: 4)
Like this?
library(dplyr)
Vec <- data.frame(Vec = c(70.0600, 8.5100, 5.8600, 399.9800, 9.0600, 78.8200, 71.4600))
Vec %>%
  mutate(up = quantile(Vec, .8),
         part = ifelse(Vec > up, "Top_20", "Bottom_80"))
Vec up part
1 70.06 77.348 Bottom_80
2 8.51 77.348 Bottom_80
3 5.86 77.348 Bottom_80
4 399.98 77.348 Top_20
5 9.06 77.348 Bottom_80
6 78.82 77.348 Top_20
7 71.46 77.348 Bottom_80
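As for why the original attempt misbehaves: `quantile()` called with two probabilities returns two cut points, not one label per row. Assigned to a data frame column, those two values are likely just recycled down the rows when the row count allows it, which would explain a 50-50 split. Below is a minimal base R sketch of the same fix, assuming the values arrive as strings as in the question and need converting first; the names `cuts` and `Dec` are only illustrative.

```r
# Values stored as strings, as in the question (assumption: they need converting).
Vec <- data.frame(Vec = c("70.0600", "8.5100", "5.8600", "399.9800",
                          "9.0600", "78.8200", "71.4600"),
                  stringsAsFactors = FALSE)
Vec$Vec <- as.numeric(Vec$Vec)

# quantile() with two probabilities returns two cut points, not a per-row label.
cuts <- quantile(Vec$Vec, probs = c(0.20, 0.80))

# Compare every row against the 80% cut point to get one label per row.
Vec$Dec <- ifelse(Vec$Vec > cuts[["80%"]], "Top_20", "Bottom_80")
```

Only the 80% cut point is needed for a two-group split; the 20% one is computed here just to show what the original call returns.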
Answer 1: (score: 3)
A very simple approach that does not require loading any additional libraries:
Vec <- c(70.0600, 8.5100, 5.8600, 399.9800, 9.0600, 78.8200, 71.4600)
# 80th percentile: values above it form the top 20%
q <- quantile(Vec, .8)
Vec <- rbind(
  data.frame(value = subset(Vec, Vec > q), dec = "Top_20"),
  data.frame(value = subset(Vec, Vec <= q), dec = "Bottom_80"))
   value       dec
1 399.98    Top_20
2  78.82    Top_20
3  70.06 Bottom_80
4   8.51 Bottom_80
5   5.86 Bottom_80
6   9.06 Bottom_80
7  71.46 Bottom_80
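Note that the `rbind()` approach moves the top group to the front, so the original row order is lost. If the order matters, one possible alternative is base R's `cut()`, sketched below; the names `values` and `result` are illustrative and not part of the answer above.

```r
values <- c(70.06, 8.51, 5.86, 399.98, 9.06, 78.82, 71.46)

# 80th percentile used as the single break point.
q <- quantile(values, 0.8)

# cut() uses right-closed intervals by default: (-Inf, q] and (q, Inf],
# so values equal to q fall into the lower group.
result <- data.frame(
  value = values,
  dec = cut(values, breaks = c(-Inf, q, Inf),
            labels = c("Bottom_80", "Top_20"))
)
```

Here `dec` comes back as a factor; wrap it in `as.character()` if plain strings are preferred.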