R: data frame

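Time: 2017-11-16 17:49:24

Tags: r select quantile

I have a data frame with groups and values. First, I compute the 99% quantile for each group. Now I want to remove the values above each group's 99% quantile.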

df<-data.frame(group = rep(c("A", "B"), each = 4),
               value = c(c(6,5,80,4,60)*10,3,5,4))

# data
  group value
1     A    60
2     A    50
3     A   800
4     A    40
5     B   600
6     B     3
7     B     5
8     B     4

Compute the quantiles for each group:

quant<-aggregate(df$value, by = list(df$group), FUN = quantile, probs  = 0.99)

> quant
  Group.1      x
1       A 777.80
2       B 582.15
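The 777.80 for group A can be reproduced by hand. This is a minimal sketch, assuming the default interpolation of quantile (type 7), which places the quantile at position h = (n - 1) * p + 1 in the sorted values and interpolates linearly between the neighbouring values. The names x, p, h, lo and manual_q are illustrative only.

```r
# Values of group A, sorted in ascending order
x <- sort(c(60, 50, 800, 40))
p <- 0.99

# Position of the quantile in the sorted vector (type 7 rule)
h  <- (length(x) - 1) * p + 1
lo <- floor(h)

# Linear interpolation between the two neighbouring order statistics
manual_q <- x[lo] + (h - lo) * (x[lo + 1] - x[lo])

# Compare with the built-in result (unname drops the "99%" label)
builtin_q <- unname(quantile(x, probs = p))
print(c(manual = manual_q, builtin = builtin_q))
```

With only four values per group, the 99% quantile sits just below the group maximum, so filtering on it removes exactly one value per group here.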

I tried applying the quantile vector to select the lower values, but that loses the group information:

df[df$value < quant$x,]

Expected result:

  group value
1     A    60
2     A    50
4     A    40
5     B     3
6     B     5
7     B     4

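How can I apply the quantile vector so that, within each group, only the values below the 99% quantile are kept in the data frame?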
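The comparison df$value < quant$x most likely misses the groups because quant$x holds one threshold per group (length 2), and R recycles it across the 8 rows in alternating order instead of matching by group. A minimal sketch, assuming the df and quant objects from the question, that looks up each row's own threshold with match (the names recycled, thr and kept are illustrative):

```r
df <- data.frame(group = rep(c("A", "B"), each = 4),
                 value = c(c(6, 5, 80, 4, 60) * 10, 3, 5, 4))
quant <- aggregate(df$value, by = list(df$group), FUN = quantile, probs = 0.99)

# Recycling: thresholds alternate A, B, A, B, ... regardless of df$group,
# so the value 600 in group B is compared against group A's threshold
recycled <- df$value < quant$x

# Look up the threshold that belongs to each row's group
thr  <- quant$x[match(df$group, quant$Group.1)]
kept <- df[df$value < thr, ]
print(kept)
```

This keeps the precomputed quant table in use. The grouped approaches avoid the separate lookup step by computing the threshold inside each group.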

1 answer:

Answer 0: (score: 4)

After grouping, we can use filter from dplyr:

library(dplyr)
df %>%
  group_by(group) %>%
  filter(value < quantile(value, probs = 0.99))
# A tibble: 6 x 2
# Groups: group [2]
#   group value
#  <fctr> <dbl>
#1      A    60
#2      A    50
#3      A    40
#4      B     3
#5      B     5
#6      B     4

Or with similar syntax in data.table:

library(data.table)
setDT(df)[, .(value = value[value < quantile(value, probs = 0.99)]), by = group]

Or use ave from base R.
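The code for the ave variant is not shown here, so the following is only a sketch of how it might look, not necessarily the answerer's exact code. It assumes the df from the question. ave returns a vector as long as the input in which every row carries its own group's result, so the comparison lines up row by row. The names thr and res are illustrative.

```r
df <- data.frame(group = rep(c("A", "B"), each = 4),
                 value = c(c(6, 5, 80, 4, 60) * 10, 3, 5, 4))

# Per-row threshold: each row gets the 99% quantile of its own group
thr <- ave(df$value, df$group,
           FUN = function(x) quantile(x, probs = 0.99))

# Keep only the rows strictly below their group's threshold
res <- df[df$value < thr, ]
print(res)
```

Unlike the data.table call, this keeps the original row names, which makes it easy to see which rows were dropped.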