How to find percentiles in R for many categories in a dataset?

Time: 2015-06-01 05:38:07

Tags: r

My dataset looks somewhat like this:

Category   Value1   Value2
       A       19      143
       A       12      124
       A       21      130  
       B       23      323
       B       24      323
       B       23      342
       B       24      233
       B       27      234
       C       28      212
       C       29      233      
       D       11      365
       D       12      323
       D       13      344

This dataset has many categories (A, B, C, D, etc.) and two value columns.

How can we find the 90th percentile of these values within each category?

The output should be in the following format:

[image: expected output format, not included]

1 Answer:

Answer 0: (score: 5)

Try

library(dplyr)
df1 %>%
   group_by(Category) %>% 
   summarise_each(funs(quantile(., 0.90)))
#    Category Value1 Value2
#1        A   20.6  140.4
#2        B   25.8  334.4
#3        C   28.9  230.9
#4        D   12.8  360.8
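
In newer dplyr releases, summarise_each() and funs() are deprecated. A minimal sketch of the same grouped summary written with across() follows, assuming dplyr 1.0.0 or later is installed; the result name res and the explicit column selection are illustrative choices, not part of the original answer.

```r
library(dplyr)

# Same sample data as in the question
df1 <- data.frame(
  Category = c("A", "A", "A", "B", "B", "B", "B", "B", "C", "C", "D", "D", "D"),
  Value1 = c(19, 12, 21, 23, 24, 23, 24, 27, 28, 29, 11, 12, 13),
  Value2 = c(143, 124, 130, 323, 323, 342, 233, 234, 212, 233, 365, 323, 344)
)

# 90th percentile of each value column within each category.
# names = FALSE drops the "90%" label that quantile() attaches by default.
res <- df1 %>%
  group_by(Category) %>%
  summarise(
    across(c(Value1, Value2), ~ quantile(.x, probs = 0.90, names = FALSE)),
    .groups = "drop"
  )

res
```

This should give the same values as the output shown above. Replacing c(Value1, Value2) with where(is.numeric) would cover every numeric column if the real dataset has more than two.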

Or

library(data.table)
# setDT() converts df1 to a data.table by reference
setDT(df1)[, lapply(.SD, FUN = quantile, probs = 0.90), by = Category]

Or using aggregate from base R

aggregate(. ~ Category, df1, FUN = quantile, probs = 0.90)
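
The fractional results (for example 20.6 for Value1 in category A) come from the linear interpolation that quantile() applies by default (type = 7). The sketch below reproduces that one value by hand; the variable names are illustrative only, and the manual formula assumes p < 1 so that the upper neighbour exists.

```r
# Value1 for category A, taken from the sample data
x <- c(19, 12, 21)
p <- 0.90

xs <- sort(x)                    # 12, 19, 21
h  <- (length(xs) - 1) * p + 1   # fractional position in the sorted vector
lo <- floor(h)                   # index of the lower neighbour

# Interpolate linearly between the two neighbouring order statistics
manual <- xs[lo] + (h - lo) * (xs[lo + 1] - xs[lo])

manual
quantile(x, probs = p, names = FALSE)
```

Both expressions should agree. Passing a different type argument to quantile() selects another of its documented interpolation rules and can give slightly different results on small groups.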

Data

df1 <- structure(list(Category = c("A", "A", "A", "B", "B", "B", "B", 
"B", "C", "C", "D", "D", "D"), Value1 = c(19, 12, 21, 23, 24, 
23, 24, 27, 28, 29, 11, 12, 13), Value2 = c(143, 124, 130, 323, 
323, 342, 233, 234, 212, 233, 365, 323, 344)), .Names = c("Category", 
"Value1", "Value2"), row.names = c(NA, -13L), class = "data.frame")