pmax of columns ending with a given string

Asked: 2019-04-23 20:28:49

Tags: r dplyr tidyr

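I want to mutate a new column that, for each row, holds the pmax() of the columns whose names end in "_n". I know I can do this by naming the columns explicitly, but I would like it to be the result of a call to ends_with() or something similar.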

I have tried mutate_at() and plain mutate(). My general thinking is that I need to pass vars(ends_with("_n")) to something, but I am missing what that something is.

Thanks.

library(dplyr)
library(tidyr)

mtcars %>%
  group_by(vs, gear) %>% 
  summarize(mean = mean(disp),
            sd = sd(disp),
            n = n()) %>% 
  mutate_if(is.double, round, 1) %>% 
  mutate(mean_sd = paste0(mean, " (", sd, ")")) %>% 
  select(-mean, -sd) %>%
  group_by(vs, gear) %>% 
  nest(n, mean_sd, .key = "summary") %>% 
  spread(key = vs, value = summary) %>% 
  unnest(`0`, `1`, .sep = "_")
   gear `0_n` `0_mean_sd`   `1_n` `1_mean_sd` 
  <dbl> <int> <chr>         <int> <chr>       
1     3    12 357.6 (71.8)      3 201 (72)    
2     4     2 160 (0)          10 115.6 (38.5)
3     5     4 229.3 (113.9)     1 95.1 (NA)   

Edit: Both answers are appreciated. Cheers!

2 answers:

Answer 0: (score: 2)

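Here is one way, using the unquote-splice operator (!!!). We can select() the columns we want to compare and then splice them into pmax() as separate vector arguments: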

library(tidyverse)
tbl <- structure(list(gear = c(3, 4, 5), `0_n` = c(12L, 2L, 4L), `0_mean_sd` = c("357.6 (71.8)", "160 (0)", "229.3 (113.9)"), `1_n` = c(3L, 10L, 1L), `1_mean_sd` = c("201 (72)", "115.6 (38.5)", "95.1 (NA)")), row.names = c(NA, -3L), class = c("tbl_df", "tbl", "data.frame"))
tbl %>%
  mutate(pmax = pmax(!!!select(., ends_with("_n"))))
#> # A tibble: 3 x 6
#>    gear `0_n` `0_mean_sd`   `1_n` `1_mean_sd`   pmax
#>   <dbl> <int> <chr>         <int> <chr>        <int>
#> 1     3    12 357.6 (71.8)      3 201 (72)        12
#> 2     4     2 160 (0)          10 115.6 (38.5)    10
#> 3     5     4 229.3 (113.9)     1 95.1 (NA)        4

Created on 2019-04-23 by the reprex package (v0.2.1)
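As a rough illustration of what the splice does: `!!!` unpacks the selected columns so that the call behaves like `pmax()` written with each `_n` column named by hand. The sketch below checks that equivalence on a trimmed copy of the data (only the count columns are kept; the names `spliced` and `explicit` are illustrative and not part of the original answer).

```r
library(dplyr)

# Trimmed copy of the example data: only gear and the "_n" count columns
tbl <- tibble(
  gear  = c(3, 4, 5),
  `0_n` = c(12L, 2L, 4L),
  `1_n` = c(3L, 10L, 1L)
)

# Splice version, as in the answer: every column ending in "_n"
# becomes a separate argument to pmax()
spliced <- tbl %>%
  mutate(pmax = pmax(!!!select(., ends_with("_n"))))

# Hand-written version naming the columns explicitly
explicit <- tbl %>%
  mutate(pmax = pmax(`0_n`, `1_n`))

# Both approaches should give the same row-wise maxima
identical(spliced$pmax, explicit$pmax)
```

The splice version keeps working if more `_n` columns appear later (for example after spreading on a variable with more levels), whereas the explicit call would need to be edited.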

Answer 1: (score: 1)

A base R version, as an alternative:

tbl <- structure(list(gear = c(3, 4, 5), `0_n` = c(12L, 2L, 4L), `0_mean_sd` = c("357.6 (71.8)", "160 (0)", "229.3 (113.9)"), `1_n` = c(3L, 10L, 1L), `1_mean_sd` = c("201 (72)", "115.6 (38.5)", "95.1 (NA)")), row.names = c(NA, -3L), class = c("tbl_df", "tbl", "data.frame"))
# Select the columns whose names end in "_n" and pass each one to pmax() as a separate argument
tbl$pmax <- do.call(pmax, as.list(tbl[, grepl("_n$", names(tbl))]))
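The `do.call()` approach works because each element of the list is handed to `pmax()` as its own argument. A minimal self-contained sketch of the same idea follows, using a plain data frame that holds only the count columns. The names `n_cols`, `args` and `pmax_reduce` are illustrative assumptions, and the `Reduce()` line is just an alternative way to fold `pmax()` over the columns pairwise, not something from the original answer.

```r
# Plain data frame with the same count columns as the example;
# check.names = FALSE keeps the non-syntactic names `0_n` and `1_n`
tbl <- data.frame(
  gear  = c(3, 4, 5),
  `0_n` = c(12L, 2L, 4L),
  `1_n` = c(3L, 10L, 1L),
  check.names = FALSE
)

# Logical index of the columns whose names end in "_n"
n_cols <- grepl("_n$", names(tbl))

# One list element per selected column
args <- as.list(tbl[n_cols])

# do.call() expands the list into pmax(<0_n column>, <1_n column>)
tbl$pmax <- do.call(pmax, args)

# Alternative: apply pmax() pairwise across the columns
tbl$pmax_reduce <- Reduce(pmax, args)

tbl
```

Computing `n_cols` before adding the new columns matters here: a result column whose own name ended in `_n` would otherwise be picked up on a second run.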