Filter a character vector by quantile

Time: 2017-12-30 18:25:58

Tags: r dplyr

I have a character vector, and I want to keep only the values whose frequency is above the 95th percentile.

If I use the following command, it changes the shape of my data frame (only the n and name columns remain).

  mydf %>% 
  count(name) %>%
  filter(n > quantile(n, 0.95))

If I use this command instead, I get an error.

  mydf %>% 
  group_by(name) %>%
  filter(name > quantile(name, 0.95))

  Error in filter_impl(.data, quo) : Evaluation error: non-numeric argument 
  to binary operator.

Here is a small dput of the data:

structure(list(name = c("Panda Express", "Noodles & Company", 
"Panda Express", "Panda Express", "Panda Express", "Panda Express", 
"Panda Express", "Noodles & Company", "Noodles & Company", "China"
), postal_code = c("85301", "85382", "89122", "89134", "85296", 
"85042", "89012", "15241", "85236", "85018")), .Names = c("name", 
"postal_code"), row.names = c(NA, 10L), class = "data.frame")

1 answer:

Answer 0: (score: 2)

We can use semi_join after the filter, so that the original rows and columns are kept for the names that pass the threshold:

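library(dplyr)

# df is the data frame from the question (called mydf there)
df %>% 
  count(name) %>%                      # one row per name, with its frequency n
  filter(n > quantile(n, 0.95)) %>%    # keep names whose count exceeds the 95th percentile
  semi_join(df, ., by = 'name')        # return the original rows, with all columns, for those names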
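
To make the behavior concrete, here is a minimal sketch that runs this approach on the sample data from the question. The names `mydf`, `counts`, `threshold`, `result`, and `alt` are only illustrative. The second attempt in the question fails because `quantile()` needs numeric input while `name` is a character column, so the names have to be counted first. The last step shows one possible alternative that compares each group's size to a precomputed threshold instead of joining; it is an assumption about what might be convenient, not part of the original answer.

```r
library(dplyr)

# Sample data from the question, rebuilt with data.frame()
mydf <- data.frame(
  name = c("Panda Express", "Noodles & Company", "Panda Express",
           "Panda Express", "Panda Express", "Panda Express",
           "Panda Express", "Noodles & Company", "Noodles & Company",
           "China"),
  postal_code = c("85301", "85382", "89122", "89134", "85296",
                  "85042", "89012", "15241", "85236", "85018"),
  stringsAsFactors = FALSE
)

# Frequency of each name: China 1, Noodles & Company 3, Panda Express 6
counts <- count(mydf, name)

# 95th percentile of the counts (default quantile type), as a plain number
threshold <- quantile(counts$n, 0.95, names = FALSE)

# Approach from the answer: filter the counts, then semi_join back
result <- counts %>%
  filter(n > threshold) %>%
  semi_join(mydf, ., by = "name")

# Hypothetical alternative without a join: keep groups larger than the threshold
alt <- mydf %>%
  group_by(name) %>%
  filter(n() > threshold) %>%
  ungroup()
```

With this sample the threshold is 5.7, so only Panda Express (6 occurrences) passes, and `result` contains its six rows with both `name` and `postal_code`. The `alt` version keeps the same rows but returns a tibble rather than a plain data frame.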