dplyr: how do I get the order of one column within each group?

Time: 2018-08-17 11:23:11

Tags: r dplyr

Example data:

tibbly = tibble(age = c(10,30,50,10,30,50,10,30,50,10,30,50),
              grouping1 = c("A","A","A","A","A","A","B","B","B","B","B","B"),
              grouping2 = c("X", "X", "X","Y","Y","Y","X","X","X","Y","Y","Y"),
              value = c(1,2,3,4,4,6,2,5,3,6,3,2))
> tibbly
# A tibble: 12 x 4
     age grouping1 grouping2 value
   <dbl> <chr>     <chr>     <dbl>
 1    10 A         X             1
 2    30 A         X             2
 3    50 A         X             3
 4    10 A         Y             4
 5    30 A         Y             4
 6    50 A         Y             6
 7    10 B         X             2
 8    30 B         X             5
 9    50 B         X             3
10    10 B         Y             6
11    30 B         Y             3
12    50 B         Y             2

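Question: How can I get the row order within each group of a data frame? I can use dplyr to arrange the data into a form that shows what I am interested in: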

> tibbly %>% 
     group_by(grouping1, grouping2) %>%
     arrange(grouping1, grouping2, desc(value))
# A tibble: 12 x 4
# Groups:   grouping1, grouping2 [4]
     age grouping1 grouping2 value
   <dbl> <chr>     <chr>     <dbl>
 1    50 A         X             3
 2    30 A         X             2
 3    10 A         X             1
 4    50 A         Y             6
 5    10 A         Y             4
 6    30 A         Y             4
 7    30 B         X             5
 8    50 B         X             3
 9    10 B         X             2
10    10 B         Y             6
11    30 B         Y             3
12    50 B         Y             2

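Ultimately, I am interested in the order of the age column within each group, based on the value column. Is this possible with dplyr? I am after something like summarise(), but based on the order of the rows rather than their actual values.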

2 answers:

Answer 0 (score: 2)

library(dplyr)

tibbly = tibble(age = c(10,30,50,10,30,50,10,30,50,10,30,50),
                grouping1 = c("A","A","A","A","A","A","B","B","B","B","B","B"),
                grouping2 = c("X", "X", "X","Y","Y","Y","X","X","X","Y","Y","Y"),
                value = c(1,2,3,4,4,6,2,5,3,6,3,2))


tibbly %>% 
  group_by(grouping1, grouping2) %>%                  # for each group
  arrange(desc(value)) %>%                            # arrange value descending
  summarise(order = paste0(age, collapse = ",")) %>%  # get the order of age as a string
  ungroup()                                           # forget the grouping

# # A tibble: 4 x 3
#   grouping1 grouping2 order   
#   <chr>     <chr>     <chr>   
# 1 A         X         50,30,10
# 2 A         Y         50,10,30
# 3 B         X         30,50,10
# 4 B         Y         10,30,50
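
A variant of the same idea, as a minimal sketch: instead of sorting the whole table first, you can index age by order(desc(value)) inside summarise(). The result name age_order and the .groups = "drop" argument (which needs dplyr 1.0.0 or later) are my own additions and not part of the answer above. Tied values keep their original row order, because order() leaves unresolved ties in place.

```r
library(dplyr)

tibbly <- tibble(age = c(10,30,50,10,30,50,10,30,50,10,30,50),
                 grouping1 = c("A","A","A","A","A","A","B","B","B","B","B","B"),
                 grouping2 = c("X","X","X","Y","Y","Y","X","X","X","Y","Y","Y"),
                 value = c(1,2,3,4,4,6,2,5,3,6,3,2))

# age_order is a hypothetical name chosen for this sketch
age_order <- tibbly %>%
  group_by(grouping1, grouping2) %>%
  # within each group, reorder age by descending value and collapse to one string
  summarise(order = paste(age[order(desc(value))], collapse = ","),
            .groups = "drop")   # assumes dplyr >= 1.0.0

age_order
```

If you would rather have the ages as numbers than as one string, replacing the paste() call with list(age[order(desc(value))]) should give a list column instead.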

Answer 1 (score: 1)

Using data.table
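
The code for this answer is not included here, so what follows is only a sketch of how the same result might be obtained with data.table. It is not the answerer's original code. The names dt and res are placeholders I made up. The rows are sorted in i by the grouping columns and then by descending value, and age is collapsed per group in j.

```r
library(data.table)

# dt and res are hypothetical names for this sketch
dt <- data.table(age = c(10,30,50,10,30,50,10,30,50,10,30,50),
                 grouping1 = c("A","A","A","A","A","A","B","B","B","B","B","B"),
                 grouping2 = c("X","X","X","Y","Y","Y","X","X","X","Y","Y","Y"),
                 value = c(1,2,3,4,4,6,2,5,3,6,3,2))

# sort by group, then by value descending; then paste age together per group
res <- dt[order(grouping1, grouping2, -value),
          .(order = paste(age, collapse = ",")),
          by = .(grouping1, grouping2)]

res
```

Sorting by the grouping columns first is only there so the groups come out in the same order as in the dplyr answer. The ordering of age within each group comes from -value alone.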