Can pivot_longer strip a suffix from column names?

Time: 2019-04-22 16:27:17

Tags: r tidyr

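So far I have really enjoyed using pivot_longer. Is there a way to strip the suffix from my column names as part of the pivot_longer call? The function has a names_prefix argument, but it does not seem to let you handle suffixes.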

data <- tibble::tribble(
   ~last_name, ~first_name, ~pitcher, ~ff_avg_spin, ~si_avg_spin, ~fc_avg_spin, ~sl_avg_spin, ~ch_avg_spin, ~cu_avg_spin, ~fs_avg_spin,
      "Bauer",    "Trevor",   545333,         2286,         2276,         2539,         2687,         1441,         2464,           NA,
      "Rodon",    "Carlos",   607074,         2148,         2211,         2153,         2465,         1725,         2457,         2630,
  "Verlander",    "Justin",   434378,         2583,           NA,         2595,         2626,         1870,         2796,           NA
  )


library(tidyr)

data_long <- data %>% 
  pivot_longer(
    cols = contains("spin"), 
    names_to = "pitch_type",
    values_to = "avg_spin",
    values_drop_na = TRUE
  )

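How can I make the pitch_type column contain only the text before _avg_spin, i.e. ff, si, fc, and so on? Ideally I would like that text in uppercase, but I can handle that with a mutate piped after pivot_longer.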

2 answers:

Answer 0: (score: 0)

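If I understand your question correctly, the answer is to use pivot_longer_spec: build the spec first, strip the suffix from its pitch_type column, then pivot with the edited spec.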

library(dplyr)
library(tidyr)

# Build the spec, then strip the whole "_avg_spin" suffix from pitch_type
nspec <- data %>%
  build_longer_spec(
    cols = contains("spin"), 
    names_to = "pitch_type",
    values_to = "avg_spin"
  ) %>%
  mutate(pitch_type = sub("_avg_spin$", "", pitch_type))

# Pivot using the edited spec
data %>%
  pivot_longer_spec(spec = nspec, values_drop_na = TRUE)
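
To see why editing the spec works, it can help to look at what a spec contains. The following is a minimal sketch, assuming tidyr 1.0.0 or later; `toy` and `toy_long` are hypothetical names and the two-column data frame is for illustration only, not the asker's data.

```r
library(tidyr)

# Hypothetical toy data, for illustration only
toy <- tibble::tibble(id = 1, ff_avg_spin = 2286, si_avg_spin = 2276)

# A spec is an ordinary tibble with one row per column being pivoted
spec <- build_longer_spec(
  toy,
  cols = contains("spin"),
  names_to = "pitch_type",
  values_to = "avg_spin"
)
spec$.name       # the original column names
spec$.value      # the name of the output value column
spec$pitch_type  # the values that will fill pitch_type

# Strip the suffix in the spec, then pivot with the edited spec
spec$pitch_type <- sub("_avg_spin$", "", spec$pitch_type)
toy_long <- pivot_longer_spec(toy, spec)
```

Because the spec is just a data frame, any string manipulation can be applied to it before pivoting, which is what makes this route more flexible than names_prefix.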
  

Answer 1: (score: 0)

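You can use names_pattern = "(.*)_avg_spin" to remove the suffix "_avg_spin". The capture group keeps whatever precedes the suffix, and that captured text becomes the value in pitch_type.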

data <- tibble::tribble(
   ~last_name, ~first_name, ~pitcher, ~ff_avg_spin, ~si_avg_spin, ~fc_avg_spin, ~sl_avg_spin, ~ch_avg_spin, ~cu_avg_spin, ~fs_avg_spin,
      "Bauer",    "Trevor",   545333,         2286,         2276,         2539,         2687,         1441,         2464,           NA,
      "Rodon",    "Carlos",   607074,         2148,         2211,         2153,         2465,         1725,         2457,         2630,
  "Verlander",    "Justin",   434378,         2583,           NA,         2595,         2626,         1870,         2796,           NA
  )


library(tidyr)

data %>% 
  pivot_longer(
    cols = contains("spin"), 
    names_to = "pitch_type",
    values_to = "avg_spin",
    values_drop_na = TRUE,
    names_pattern = "(.*)_avg_spin"
  )
#> # A tibble: 18 x 5
#>    last_name first_name pitcher pitch_type avg_spin
#>    <chr>     <chr>        <dbl> <chr>         <dbl>
#>  1 Bauer     Trevor      545333 ff             2286
#>  2 Bauer     Trevor      545333 si             2276
#>  3 Bauer     Trevor      545333 fc             2539
#>  4 Bauer     Trevor      545333 sl             2687
#>  5 Bauer     Trevor      545333 ch             1441
#>  6 Bauer     Trevor      545333 cu             2464
#>  7 Rodon     Carlos      607074 ff             2148
#>  8 Rodon     Carlos      607074 si             2211
#>  9 Rodon     Carlos      607074 fc             2153
#> 10 Rodon     Carlos      607074 sl             2465
#> 11 Rodon     Carlos      607074 ch             1725
#> 12 Rodon     Carlos      607074 cu             2457
#> 13 Rodon     Carlos      607074 fs             2630
#> 14 Verlander Justin      434378 ff             2583
#> 15 Verlander Justin      434378 fc             2595
#> 16 Verlander Justin      434378 sl             2626
#> 17 Verlander Justin      434378 ch             1870
#> 18 Verlander Justin      434378 cu             2796
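
The question also asks for the pitch type in uppercase. One way to get there, as the asker suggested, is a mutate piped after pivot_longer. This is a minimal sketch under the same names_pattern approach; `toy` and `toy_upper` are hypothetical names and the data is a cut-down illustration rather than the full table above.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data, for illustration only
toy <- tibble::tibble(pitcher = 545333, ff_avg_spin = 2286, si_avg_spin = 2276)

toy_upper <- toy %>%
  pivot_longer(
    cols = contains("spin"),
    names_to = "pitch_type",
    names_pattern = "(.*)_avg_spin",  # keep only the text before the suffix
    values_to = "avg_spin"
  ) %>%
  mutate(pitch_type = toupper(pitch_type))  # ff -> FF, si -> SI
```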