My dataset looks something like this:
Category Value1 Value2
A 19 143
A 12 124
A 21 130
B 23 323
B 24 323
B 23 342
B 24 233
B 27 234
C 28 212
C 29 233
D 11 365
D 12 323
D 13 344
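The dataset has many categories (A, B, C, D, and so on) and two value columns.
How can we find the 90th percentile of these values within each category?
The output should be in the following format: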
Answer 0 (score: 5)
Try
library(dplyr)
df1 %>%
group_by(Category) %>%
# across() replaces the deprecated summarise_each()/funs() (dplyr >= 1.0.0)
summarise(across(everything(), ~ quantile(.x, 0.90)))
# Category Value1 Value2
#1 A 20.6 140.4
#2 B 25.8 334.4
#3 C 28.9 230.9
#4 D 12.8 360.8
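The values above are not observed data points because quantile() interpolates by default (type 7). As a rough sketch of where 20.6 for category A comes from, the arithmetic can be reproduced by hand in base R; the variable names here (x, p, h, lo, frac, manual) are only illustrative:

```r
# Category A, Value1 from the sample data, sorted ascending
x <- sort(c(19, 12, 21))
p <- 0.90

# Type 7 (R's default): position h = (n - 1) * p + 1 in the sorted vector
n <- length(x)
h <- (n - 1) * p + 1
lo <- floor(h)      # index of the order statistic just below h
frac <- h - lo      # fractional distance towards the next one

# Linear interpolation between the two neighbouring order statistics:
# 19 + 0.8 * (21 - 19)
manual <- x[lo] + frac * (x[min(lo + 1, n)] - x[lo])

manual
# Should agree with the built-in result
all.equal(manual, quantile(x, p, names = FALSE))
```

A different `type` argument to quantile() would use a different interpolation rule and could give slightly different numbers.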
Or
library(data.table)
setDT(df1)[, lapply(.SD, quantile, probs = 0.90), by = Category]
Or use aggregate from base R
aggregate(. ~ Category, df1, FUN = quantile, probs = 0.90)
Data
df1 <- structure(list(Category = c("A", "A", "A", "B", "B", "B", "B",
"B", "C", "C", "D", "D", "D"), Value1 = c(19, 12, 21, 23, 24,
23, 24, 27, 28, 29, 11, 12, 13), Value2 = c(143, 124, 130, 323,
323, 342, 233, 234, 212, 233, 365, 323, 344)), .Names = c("Category",
"Value1", "Value2"), row.names = c(NA, -13L), class = "data.frame")
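As a quick sanity check, the base R variant can be run end to end against the sample data and compared with the values printed for the dplyr version. This is only a sketch: it rebuilds the same data with data.frame() so it runs on its own, and the names res and expected are made up for the comparison.

```r
# Same sample data as above, rebuilt compactly so the block is self-contained
df1 <- data.frame(
  Category = rep(c("A", "B", "C", "D"), times = c(3, 5, 2, 3)),
  Value1 = c(19, 12, 21, 23, 24, 23, 24, 27, 28, 29, 11, 12, 13),
  Value2 = c(143, 124, 130, 323, 323, 342, 233, 234, 212, 233, 365, 323, 344)
)

# 90th percentile of every value column within each category
res <- aggregate(. ~ Category, data = df1, FUN = quantile, probs = 0.90)

# Values taken from the dplyr output shown earlier
expected <- data.frame(
  Category = c("A", "B", "C", "D"),
  Value1 = c(20.6, 25.8, 28.9, 12.8),
  Value2 = c(140.4, 334.4, 230.9, 360.8)
)

# as.numeric() drops any names so that only the numbers are compared
all.equal(as.numeric(res$Value1), expected$Value1)
all.equal(as.numeric(res$Value2), expected$Value2)
```

If both comparisons return TRUE, the aggregate call agrees with the grouped dplyr result on this data.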