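I have a character vector that I want to filter by the 95th percentile of how often each value occurs.
If I use the following command, it changes my data frame (only the n and name columns remain).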
mydf %>%
  count(name) %>%
  filter(n > quantile(n, 0.95))
If I use this command instead, I get an error.
mydf %>%
  group_by(name) %>%
  filter(name > quantile(name, 0.95))
Error in filter_impl(.data, quo) : Evaluation error: non-numeric argument
to binary operator.
Here is a small dput of the data:
mydf <- structure(list(name = c("Panda Express", "Noodles & Company",
"Panda Express", "Panda Express", "Panda Express", "Panda Express",
"Panda Express", "Noodles & Company", "Noodles & Company", "China"
), postal_code = c("85301", "85382", "89122", "89134", "85296",
"85042", "89012", "15241", "85236", "85018")), .Names = c("name",
"postal_code"), row.names = c(NA, 10L), class = "data.frame")
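Answer 0 (score: 2)
We can add a semi_join after the filter. It keeps the rows of the original data frame, with all of their columns, whose name appears in the filtered counts.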
library(dplyr)
# mydf is the data frame from the question
mydf %>%
  count(name) %>%
  filter(n > quantile(n, 0.95)) %>%
  semi_join(mydf, ., by = 'name')
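The same idea can also be written in separate steps, which may make it easier to see why the original columns survive. This is only a minimal sketch, assuming the sample data from the question. The names name_counts, frequent_names, and result are made up for illustration and do not appear in the original post.

```r
library(dplyr)

# Sample data from the question, rebuilt with data.frame() for readability
mydf <- data.frame(
  name = c("Panda Express", "Noodles & Company", "Panda Express",
           "Panda Express", "Panda Express", "Panda Express",
           "Panda Express", "Noodles & Company", "Noodles & Company",
           "China"),
  postal_code = c("85301", "85382", "89122", "89134", "85296",
                  "85042", "89012", "15241", "85236", "85018"),
  stringsAsFactors = FALSE
)

# One row per name with its frequency n (hypothetical helper name)
name_counts <- count(mydf, name)

# Names whose frequency exceeds the 95th percentile of all frequencies
frequent_names <- name_counts %>%
  filter(n > quantile(n, 0.95)) %>%
  pull(name)

# Filter the original data frame, so postal_code is kept
result <- mydf %>%
  filter(name %in% frequent_names)
```

With the sample data, only "Panda Express" passes the threshold, so result holds its six rows with both name and postal_code. The quantile has to be taken over the numeric counts, which is why calling quantile() on the character column name itself fails.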