Question

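I have a variable that has the same value on every row for a given user: the maximum score reached in a game. I now want to filter the dataset so that only users above the 75% quantile of max_score are kept. I want to preserve the basic record format, so I can't use summarise.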

Here is a sample dataset:

da <- data.frame(user = c(1,1,1,2,2,2,3,3,3), max_score=as.numeric(c(150,150,150,100,100,100,75,75,75)))

da
  user max_score
1    1       150
2    1       150
3    1       150
4    2       100
5    2       100
6    2       100
7    3        75
8    3        75
9    3        75

I tried the following:

da2= da %>% group_by(user) %>% filter(max(max_score) > quantile(max(max_score), .75))

...but it doesn't work.

Answer 1

What is your expected output? Assuming it is:

user max_score
1    150

library(dplyr)

# Take the quantile of the whole column; quantile(max(x), .75) just returns max(x)
da2 <- 
  da %>% 
  filter(max_score >= as.numeric(quantile(max_score, .75))) %>%
  unique()
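
As for why the attempt in the question drops every row: inside each group, max(max_score) is a single number, and the quantile of a single number is that number itself, so the strict comparison can never be true. A small base R sketch of that check on the sample data (the name kept is just for illustration):

```r
da <- data.frame(
  user = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  max_score = c(150, 150, 150, 100, 100, 100, 75, 75, 75)
)

# Evaluate the question's filter condition once per user.
# quantile() of a single value returns that value, so max(s) > max(s) is FALSE.
kept <- sapply(
  split(da$max_score, da$user),
  function(s) max(s) > quantile(max(s), 0.75, names = FALSE)
)

print(kept)
```

Every entry of kept is FALSE, which is why the grouped filter returns an empty result.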

If that's not it, I'm happy to help further.
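
If the goal is instead to keep every row of the qualifying users (the question asks to preserve the record format), one possible sketch is to compute the cutoff from one row per user and then filter the original rows. This assumes dplyr is available, that "above" means strictly greater, and that each user should count once toward the quantile; the names cutoff and da2 are only illustrative.

```r
library(dplyr)

da <- data.frame(
  user = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  max_score = c(150, 150, 150, 100, 100, 100, 75, 75, 75)
)

# One row per user, so each user counts once toward the quantile.
cutoff <- da %>%
  distinct(user, max_score) %>%
  pull(max_score) %>%
  quantile(0.75, names = FALSE)

# Strictly above the cutoff; all rows of a qualifying user are kept.
da2 <- da %>% filter(max_score > cutoff)

print(da2)
```

On the sample data the per-user scores are 75, 100 and 150, so the cutoff falls between 100 and 150 and only the three rows of user 1 remain. If users with different numbers of rows should be weighted by row count, drop the distinct() step.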

Filtering by a quantile in a grouped data structure
