Tidy way to arrange data frame rows according to a target sort order

Time: 2019-11-06 10:16:50

Tags: r sorting dataframe dplyr

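Back in 2015 I asked a similar question, but I would like to find a tidy way of doing this.

This is the best I have come up with so far. It works, but having to change column types just to sort somehow feels "wrong". However, resorting to dplyr::*_join() or match() has pitfalls of its own (and is hard to use in a tidy context).

So is there a good/recommended way of doing this in the tidyverse?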

Function definition

library(magrittr)

arrange_by_target <- function(
  x,
  targets
) {
  x %>%
    # Transform arrange-by columns to factors so we can leverage the order of
    # the levels:
    dplyr::mutate_at(
      names(targets),
      function(.x, .targets = targets) {
        .col <- deparse(substitute(.x))
        factor(.x, levels = .targets[[.col]])
      }
    ) %>%
    # Actual arranging:
    dplyr::arrange_at(
      names(targets)
    ) %>%
    # Clean up by recasting factor columns to their original type:
    dplyr::mutate_at(
      .vars = names(targets),
      function(.x, .targets = targets) {
        .col <- deparse(substitute(.x))
        vctrs::vec_cast(.x, to = class(.targets[[.col]]))
      }
    )
}

Testing

x <- tibble::tribble(
  ~group, ~name, ~value,
  "A", "B", 1,
  "A", "C", 2,
  "A", "A", 3,
  "B", "B", 4,
  "B", "A", 5
)

x %>%
  arrange_by_target(
    targets = list(
      group = c("B", "A"),
      name = c("A", "B", "C")
    )
  )
#> # A tibble: 5 x 3
#>   group name  value
#>   <chr> <chr> <dbl>
#> 1 B     A         5
#> 2 B     B         4
#> 3 A     A         3
#> 4 A     B         1
#> 5 A     C         2

x %>%
  arrange_by_target(
    targets = list(
      group = c("B", "A"),
      name = c("A", "B", "C") %>% rev()
    )
  )
#> # A tibble: 5 x 3
#>   group name  value
#>   <chr> <chr> <dbl>
#> 1 B     B         4
#> 2 B     A         5
#> 3 A     C         2
#> 4 A     B         1
#> 5 A     A         3

Created on 2019-11-06 by the reprex package (v0.3.0)

1 answer:

Answer 0: (score: 0)
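For comparison, the match() route mentioned above can be sketched in base R without touching the column types. This is only a minimal sketch: the helper name arrange_by_match is hypothetical, and it assumes every name in targets is a column of x.

```r
# Hypothetical helper: order rows by each value's position in its target vector.
arrange_by_match <- function(x, targets) {
  # One integer key per target column: the position of each value in its target.
  keys <- lapply(names(targets), function(col) match(x[[col]], targets[[col]]))
  # order() sorts by the first key and breaks ties with the later ones;
  # values absent from a target become NA and are sorted last.
  x[do.call(order, unname(keys)), , drop = FALSE]
}

x <- data.frame(
  group = c("A", "A", "A", "B", "B"),
  name  = c("B", "C", "A", "B", "A"),
  value = c(1, 2, 3, 4, 5),
  stringsAsFactors = FALSE
)

result <- arrange_by_match(
  x,
  targets = list(group = c("B", "A"), name = c("A", "B", "C"))
)
print(result)
```

The columns stay character throughout, so no recasting step is needed afterwards.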

The simplest way is to convert the character columns to factors, like this:

library(dplyr)

x %>%
  mutate(
      group = factor(group, c("A", "B")), 
      name = factor(name, c("C", "B", "A"))
  ) %>% 
  arrange(group, name)
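If the type change is the concern, one possible variation is to build the factors inside arrange() itself, since arrange() accepts expressions as well as bare column names. The sketch below assumes the data and the first target order from the question; the columns in the result stay character.

```r
library(dplyr)

x <- tibble::tribble(
  ~group, ~name, ~value,
  "A", "B", 1,
  "A", "C", 2,
  "A", "A", 3,
  "B", "B", 4,
  "B", "A", 5
)

# The factors exist only as sort keys; the stored columns are not modified.
result <- x %>%
  arrange(
    factor(group, levels = c("B", "A")),
    factor(name, levels = c("A", "B", "C"))
  )
print(result)
```

Values that are missing from the levels become NA in the sort key and end up at the bottom.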

Another option I often use is to leverage joins. For example:

x <- tibble::tribble(
  ~group, ~name, ~value,
  "A", "B", 1,
  "A", "C", 2,
  "A", "A", 3,
  "B", "B", 4,
  "B", "A", 5,
  "A", "A", 6,
  "B", "C", 7,
  "A", "B", 8,
  "B", "B", 9
)

custom_sort <- tibble::tribble(
  ~group, ~name,
  "A", "C",
  "A", "B",
  "A", "A",
  "B", "B",
  "B", "A"
)

x %>% right_join(custom_sort, by = c("group", "name"))
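Because the row order of a join result may differ between dplyr versions, a more explicit variant is to store the desired position in the lookup table and sort on it. The following is a sketch under that assumption; the column name sort_rank is hypothetical.

```r
library(dplyr)

x <- tibble::tribble(
  ~group, ~name, ~value,
  "A", "B", 1,
  "A", "C", 2,
  "A", "A", 3,
  "B", "B", 4,
  "B", "A", 5,
  "A", "A", 6,
  "B", "C", 7,
  "A", "B", 8,
  "B", "B", 9
)

# Target order, with the row position stored explicitly as a rank.
custom_sort <- tibble::tribble(
  ~group, ~name,
  "A", "C",
  "A", "B",
  "A", "A",
  "B", "B",
  "B", "A"
) %>%
  mutate(sort_rank = row_number())

result <- x %>%
  left_join(custom_sort, by = c("group", "name")) %>%
  arrange(sort_rank) %>%   # NA ranks are sorted to the end
  select(-sort_rank)       # drop the helper column again
print(result)
```

With left_join(), rows of x that have no entry in custom_sort (here group "B", name "C") are kept and placed at the end instead of being dropped.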