Group by name, then create a new column from the grouped contents

Time: 2020-03-02 21:32:17

Tags: r dplyr tidyverse

I am trying to find a way to group my data and then create a column based on the contents of the grouped rows.

Example df to work with:

df <- tibble::tribble(
              ~name, ~position, ~G,
      "DJ LeMahieu",      "1B", 40,
      "DJ LeMahieu",      "2B", 75,
      "DJ LeMahieu",      "3B", 52,
        "Max Muncy",      "1B", 65,
        "Max Muncy",      "2B", 70,
        "Max Muncy",      "3B", 35,
  "Whit Merrifield",      "2B", 82,
  "Whit Merrifield",      "OF", 61
  )

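I would then like to group this at the name level and create a new column called extra_position. This column is the concatenation of the contents of the position column, separated by "/". Example output below: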

output_df <- tibble::tribble(
              ~name,  ~extra_position,
      "DJ LeMahieu", "1B/2B/3B",
        "Max Muncy", "1B/2B/3B",
  "Whit Merrifield",    "2B/OF"
  )

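I would like to stay within the tidyverse if possible. I am also curious whether the order of the concatenated data can be controlled. For example, could the extra_position content for DJ LeMahieu be displayed as "3B/2B/1B"?

1 answer:

Answer 0: (score: 1)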

We can group by 'name' and combine the elements of the 'position' column into a single string, using either paste or str_c with the collapse argument:

library(dplyr)
library(stringr)
df %>%
    group_by(name) %>% 
    summarise(extra_position = str_c(position, collapse="/"))
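The answer mentions paste as an alternative to str_c but only shows the latter. A minimal, self-contained sketch of the paste variant, re-creating the sample data from the question (the variable name `result` is only illustrative):

```r
library(dplyr)

# Sample data copied from the question
df <- tibble::tribble(
  ~name,             ~position, ~G,
  "DJ LeMahieu",     "1B",      40,
  "DJ LeMahieu",     "2B",      75,
  "DJ LeMahieu",     "3B",      52,
  "Max Muncy",       "1B",      65,
  "Max Muncy",       "2B",      70,
  "Max Muncy",       "3B",      35,
  "Whit Merrifield", "2B",      82,
  "Whit Merrifield", "OF",      61
)

# One row per name; positions are pasted together in their original row order
result <- df %>%
  group_by(name) %>%
  summarise(extra_position = paste(position, collapse = "/"))

result
```

With this data the two functions should give the same strings; they mainly differ in how they treat missing values (str_c propagates NA, while paste turns it into the text "NA").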

If we need to change the order, rev reverses the elements within each group:

df %>% 
    group_by(name) %>% 
    summarise(extra_position = str_c(rev(position), collapse="/"))

Or, if the order should be based on the values themselves:

df %>% 
    group_by(name) %>%
    summarise(extra_position = str_c(gtools::mixedsort(position,
            decreasing = TRUE), collapse="/"))
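The order can also be driven by another column. As a sketch that goes slightly beyond the original answer, the positions below are ordered by games played (G, descending) within each name, or by an explicit custom order; the names `by_games`, `custom`, and `pos_order` are assumptions made up for this illustration:

```r
library(dplyr)

# Sample data copied from the question
df <- tibble::tribble(
  ~name,             ~position, ~G,
  "DJ LeMahieu",     "1B",      40,
  "DJ LeMahieu",     "2B",      75,
  "DJ LeMahieu",     "3B",      52,
  "Max Muncy",       "1B",      65,
  "Max Muncy",       "2B",      70,
  "Max Muncy",       "3B",      35,
  "Whit Merrifield", "2B",      82,
  "Whit Merrifield", "OF",      61
)

# Order positions by games played, most games first, within each name
by_games <- df %>%
  group_by(name) %>%
  summarise(extra_position = paste(position[order(G, decreasing = TRUE)],
                                   collapse = "/"))

# Order positions by a hand-picked sequence (hypothetical preference order)
pos_order <- c("3B", "2B", "1B", "OF")
custom <- df %>%
  group_by(name) %>%
  summarise(extra_position = paste(position[order(match(position, pos_order))],
                                   collapse = "/"))

by_games
custom
```

Indexing position with order() inside summarise keeps everything in one step; calling arrange() before group_by() would be an equivalent alternative.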

Or with data.table:

library(data.table)
setDT(df)[, .(extra_position = paste(position, collapse="/")), .(name)]

In base R, using aggregate:

aggregate(position ~ name, df, paste, collapse="/")
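Note that aggregate keeps the original column name, so the result column is called position rather than extra_position. A small sketch of renaming it afterwards, using a plain data.frame copy of the sample data (the name `out` is illustrative):

```r
# Sample data from the question as a plain data.frame
df <- data.frame(
  name = c(rep("DJ LeMahieu", 3), rep("Max Muncy", 3), rep("Whit Merrifield", 2)),
  position = c("1B", "2B", "3B", "1B", "2B", "3B", "2B", "OF"),
  G = c(40, 75, 52, 65, 70, 35, 82, 61),
  stringsAsFactors = FALSE
)

# Collapse positions per name, then rename the aggregated column
out <- aggregate(position ~ name, df, paste, collapse = "/")
names(out)[names(out) == "position"] <- "extra_position"

out
```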