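I made a reproducible dataset.
In this dataset, I am trying to group by "value" and "category" and get the maximum of all values in each "category", but only if one of the values in "value" is greater than 4.
Another way of asking the question: get the greatest "value" for each label, only if the "value" for each "category" is above 4.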
das <- data.frame(val=1:24,
weigh=c(10,10,10,11,11,11,20,20,20,21,21,21,30,30,30,31,31,31,40,40,40,41,41,41),
value=c(4.1,3.2,4.3,1.1,2.2,5.3,2.1,2.2,3.3,3.1,8.2,1.3,3.6,2.1,3.1,3.1,3.1,1.1,7.2,4.5,5.1,3.2,2.5,9.1),
label=c(1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,4,4),
category=c("A","B","C","A","B","C","A","B","C","A","B","C","A","B","C","A","B","C","A","B","C","A","B","C"))
val weigh value label category
1 1 10 4.1 1 A
2 2 10 3.2 1 B
3 3 10 4.3 1 C
4 4 11 1.1 1 A
5 5 11 2.2 1 B
6 6 11 5.3 1 C
7 7 20 2.1 2 A
8 8 20 2.2 2 B
9 9 20 3.3 2 C
10 10 21 3.1 2 A
11 11 21 8.2 2 B
12 12 21 1.3 2 C
13 13 30 3.6 3 A
14 14 30 2.1 3 B
15 15 30 3.1 3 C
16 16 31 3.1 3 A
17 17 31 3.1 3 B
18 18 31 1.1 3 C
19 19 40 7.2 4 A
20 20 40 4.5 4 B
21 21 40 5.1 4 C
22 22 41 3.2 4 A
23 23 41 2.5 4 B
24 24 41 9.1 4 C
This is the expected output:
val weigh value label category
1 1 10 4.1 1 A
5 6 11 5.3 1 C
2 2 10 3.2 1 B
10 10 21 3.1 2 A
3 11 21 8.2 2 B
9 9 20 3.3 2 C
2 19 40 7.2 4 A
4 20 40 4.5 4 B
6 24 41 9.1 4 C
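I tried the following but did not get the expected output. Here I only get the values > 4, rather than the largest number in each category for that label: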
library(dplyr)
# Filtering rows first drops every group whose maximum is <= 4.
das1 <- das[das$value > 4, ]
result <- das1 %>%
  group_by(category, label) %>%
  slice(which.max(value))
val weigh value label category
1 1 10 4.1 1 A
5 6 11 5.3 1 C
3 11 21 8.2 2 B
2 19 40 7.2 4 A
4 20 40 4.5 4 B
6 24 41 9.1 4 C
Answer 0 (score: 3)
We can first group_by label and filter the groups that have any value > 4, then select only the max value within each label and category.
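The code that went with this answer is not shown here, so what follows is only a minimal sketch of the steps described, not the answerer's original code. It assumes dplyr is installed and rebuilds das from the question. Using slice(which.max(value)) for the last step is an assumption carried over from the question's own attempt; it keeps exactly one row per group even if two values tie.

```r
library(dplyr)

# Same data as in the question.
das <- data.frame(
  val = 1:24,
  weigh = c(10, 10, 10, 11, 11, 11, 20, 20, 20, 21, 21, 21,
            30, 30, 30, 31, 31, 31, 40, 40, 40, 41, 41, 41),
  value = c(4.1, 3.2, 4.3, 1.1, 2.2, 5.3, 2.1, 2.2, 3.3, 3.1, 8.2, 1.3,
            3.6, 2.1, 3.1, 3.1, 3.1, 1.1, 7.2, 4.5, 5.1, 3.2, 2.5, 9.1),
  label = rep(1:4, each = 6),
  category = rep(c("A", "B", "C"), times = 8)
)

result <- das %>%
  group_by(label) %>%
  filter(any(value > 4)) %>%        # keep whole labels with at least one value > 4
  group_by(label, category) %>%
  slice(which.max(value)) %>%       # one row per label/category: the largest value
  ungroup()

print(result)
```

The important difference from the attempt in the question is the order of operations: the value > 4 test decides which labels survive, and it is applied to the whole label rather than to individual rows, so rows such as value 3.2 in label 1 are still available when the maximum is taken.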
Answer 1 (score: 3)
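I think your description in words is confusing because you keep saying different things. This matches your expected output, and the interpretation is:
Get the greatest "value" for each "category" within each label, only if some "value" in that "label" is greater than 4 (where the OP says category).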
library(tidyverse)
das <- data.frame(
val = 1:24,
weigh = c(10, 10, 10, 11, 11, 11, 20, 20, 20, 21, 21, 21, 30, 30, 30, 31, 31, 31, 40, 40, 40, 41, 41, 41),
value = c(4.1, 3.2, 4.3, 1.1, 2.2, 5.3, 2.1, 2.2, 3.3, 3.1, 8.2, 1.3, 3.6, 2.1, 3.1, 3.1, 3.1, 1.1, 7.2, 4.5, 5.1, 3.2, 2.5, 9.1),
label = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4),
category = c("A", "B", "C", "A", "B", "C", "A", "B", "C", "A", "B", "C", "A", "B", "C", "A", "B", "C", "A", "B", "C", "A", "B", "C")
)
das %>%
group_by(label) %>%
filter(any(value > 4)) %>%
group_by(label, category) %>%
filter(value == max(value)) %>%
arrange(label, category)
#> # A tibble: 9 x 5
#> # Groups: label, category [9]
#> val weigh value label category
#> <int> <dbl> <dbl> <dbl> <fct>
#> 1 1 10 4.1 1 A
#> 2 2 10 3.2 1 B
#> 3 6 11 5.3 1 C
#> 4 10 21 3.1 2 A
#> 5 11 21 8.2 2 B
#> 6 9 20 3.3 2 C
#> 7 19 40 7.2 4 A
#> 8 20 40 4.5 4 B
#> 9 24 41 9.1 4 C
Created on 2019-03-07 by the reprex package (v0.2.1)
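For anyone who would rather avoid the tidyverse dependency, the same interpretation can probably be written in base R with ave(). This is a sketch under the same reading of the question, not part of the original answer; the helper names label_ok, is_max and res are made up for illustration.

```r
# Same data as in the question.
das <- data.frame(
  val = 1:24,
  weigh = c(10, 10, 10, 11, 11, 11, 20, 20, 20, 21, 21, 21,
            30, 30, 30, 31, 31, 31, 40, 40, 40, 41, 41, 41),
  value = c(4.1, 3.2, 4.3, 1.1, 2.2, 5.3, 2.1, 2.2, 3.3, 3.1, 8.2, 1.3,
            3.6, 2.1, 3.1, 3.1, 3.1, 1.1, 7.2, 4.5, 5.1, 3.2, 2.5, 9.1),
  label = rep(1:4, each = 6),
  category = rep(c("A", "B", "C"), times = 8)
)

# TRUE for every row whose label contains at least one value above 4.
label_ok <- ave(das$value > 4, das$label, FUN = any)

# TRUE for every row that holds the maximum of its label/category pair.
is_max <- das$value == ave(das$value, das$label, das$category, FUN = max)

res <- das[label_ok & is_max, ]
res <- res[order(res$label, res$category), ]
print(res)
```

Like filter(value == max(value)), the equality test keeps every tied row if a group has more than one maximum. That does not happen with this data, but it is worth knowing before reusing the pattern.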