Given this data:
df=data.frame(
x1=c(2,0,0,NA,0,1,1,NA,0,1),
x2=c(3,2,NA,5,3,2,NA,NA,4,5),
x3=c(0,1,0,1,3,0,NA,NA,0,1),
x4=c(1,0,NA,3,0,0,NA,0,0,1),
x5=c(1,1,NA,1,3,4,NA,3,3,1))
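I want to use dplyr to create an additional column, min, holding the row-wise minimum of selected columns. Using the column names directly, this is easy: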
df <- df %>% rowwise() %>% mutate(min = min(x2,x5))
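But I have a large df with varying column names, so I need to match them from a vector of values, mycols. Other threads told me to use the select helper functions, but I must be missing something. Here is matches: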
mycols <- c("x2","x5")
df <- df %>% rowwise() %>%
mutate(min = min(select(matches(mycols))))
Error: is.string(match) is not TRUE
And here is one_of:
mycols <- c("x2","x5")
df <- df %>%
rowwise() %>%
mutate(min = min(select(one_of(mycols))))
Error: no applicable method for 'select' applied to an object of class "c('integer', 'numeric')"
In addition: Warning message:
In one_of(c("x2", "x5")) : Unknown variables: `x2`, `x5`
What am I overlooking? Should select_ work? It doesn't in the following:
df <- df %>%
rowwise() %>%
mutate(min = min(select_(mycols)))
Error: no applicable method for 'select_' applied to an object of class "character"
Likewise:
df <- df %>%
rowwise() %>%
mutate(min = min(select_(matches(mycols))))
Error: is.string(match) is not TRUE
Answer 0 (score: 3)
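Here is another solution, a bit technical, that uses the purrr package from the tidyverse, which is designed for functional programming.
First, the matches helper from dplyr takes a single regular expression string as its argument, not a vector. It is a nice way to select columns, provided you can find a regex that matches all the columns you want.
(In the code below, you can use whichever dplyr select helper you like.)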
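If the column names already live in a character vector, one way to get a single regex for matches is to collapse the vector into an anchored alternation. This is a minimal sketch, not part of the original answer: the pattern variable and the extra x25 column are hypothetical and only there to show why the anchors matter. It also assumes the names contain no regex metacharacters (otherwise they would need escaping).

```r
library(dplyr)

mycols <- c("x2", "x5")

# Collapse the names into one anchored alternation: "^(x2|x5)$".
# The ^ and $ anchors stop "x2" from also matching a column such as "x25".
pattern <- paste0("^(", paste(mycols, collapse = "|"), ")$")

# Small illustrative data frame (x25 is a hypothetical decoy column)
demo <- data.frame(x1 = c(2, 0), x2 = c(3, 2), x5 = c(1, 1), x25 = c(9, 9))

selected <- demo %>% select(matches(pattern))
names(selected)
```

With an unanchored pattern such as "x2|x5", the x25 column would be selected as well.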
Then, purrr functions work well with dplyr once you understand the basic scheme of functional programming.
Solving your problem:
df=data.frame(
x1=c(2,0,0,NA,0,1,1,NA,0,1),
x2=c(3,2,NA,5,3,2,NA,NA,4,5),
x3=c(0,1,0,1,3,0,NA,NA,0,1),
x4=c(1,0,NA,3,0,0,NA,0,0,1),
x5=c(1,1,NA,1,3,4,NA,3,3,1))
# regex matching only the x2 and x5 columns
mycols <- "x[25]"
library(dplyr)
df %>%
mutate(min_x2_x5 =
# select columns that you want in df
select(., matches(mycols)) %>%
# use pmap on this subset to get a vector of the minimum of each row.
# a data frame is a list of columns, and pmap iterates over them in parallel,
# so min is called once per row
purrr::pmap_dbl(min)
)
#> x1 x2 x3 x4 x5 min_x2_x5
#> 1 2 3 0 1 1 1
#> 2 0 2 1 0 1 1
#> 3 0 NA 0 NA NA NA
#> 4 NA 5 1 3 1 1
#> 5 0 3 3 0 3 3
#> 6 1 2 0 0 4 2
#> 7 1 NA NA NA NA NA
#> 8 NA NA NA 0 3 NA
#> 9 0 4 0 0 3 3
#> 10 1 5 1 1 1 1
I won't explain purrr any further here, but it works fine in your case.
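For comparison, the same row-wise minimum can be sketched without purrr, using base R's vectorised pmin. This is a different technique from the pmap_dbl approach above, not the answer's method; it assumes mycols is a character vector of existing column names.

```r
df <- data.frame(
  x1 = c(2, 0, 0, NA, 0, 1, 1, NA, 0, 1),
  x2 = c(3, 2, NA, 5, 3, 2, NA, NA, 4, 5),
  x3 = c(0, 1, 0, 1, 3, 0, NA, NA, 0, 1),
  x4 = c(1, 0, NA, 3, 0, 0, NA, 0, 0, 1),
  x5 = c(1, 1, NA, 1, 3, 4, NA, 3, 3, 1))

mycols <- c("x2", "x5")

# df[mycols] is a list of the selected columns. do.call passes each column
# to pmin as a separate argument, so pmin returns the element-wise minimum,
# which is the row-wise minimum across those columns.
# unname() keeps the column names from being treated as argument names.
df$min_x2_x5 <- do.call(pmin, unname(as.list(df[mycols])))
df
```

As with min, a row containing an NA in any selected column yields NA; pmin accepts na.rm = TRUE if the NAs should be ignored instead.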
Answer 1 (score: 1)
This is a bit tricky. With SE (standard evaluation), you need to pass the operation as a string.
mycols <- '(x2,x5)'
f <- paste0('min',mycols)
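# assign the result back so that df contains the new column
# (note: mutate_ is deprecated as of dplyr 0.7)
df <- df %>% rowwise() %>% mutate_(min = f)
df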
# A tibble: 10 × 6
# x1 x2 x3 x4 x5 min
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 2 3 0 1 1 1
#2 0 2 1 0 1 1
#3 0 NA 0 NA NA NA
#4 NA 5 1 3 1 1
#5 0 3 3 0 3 3
#6 1 2 0 0 4 2
#7 1 NA NA NA NA NA
#8 NA NA NA 0 3 NA
#9 0 4 0 0 3 3
#10 1 5 1 1 1 1
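Since the underscore verbs such as mutate_ are deprecated in current dplyr, here is a hedged sketch of how the same result might be obtained with a character vector of names. It assumes dplyr 1.0.0 or later, where c_across and all_of are available; it is an alternative to the string-building approach above, not the answer's own code.

```r
library(dplyr)

df <- data.frame(
  x1 = c(2, 0, 0, NA, 0, 1, 1, NA, 0, 1),
  x2 = c(3, 2, NA, 5, 3, 2, NA, NA, 4, 5),
  x3 = c(0, 1, 0, 1, 3, 0, NA, NA, 0, 1),
  x4 = c(1, 0, NA, 3, 0, 0, NA, 0, 0, 1),
  x5 = c(1, 1, NA, 1, 3, 4, NA, 3, 3, 1))

mycols <- c("x2", "x5")

result <- df %>%
  rowwise() %>%
  # c_across() gathers the selected columns of the current row into a vector;
  # all_of() selects columns by the names stored in mycols
  mutate(min = min(c_across(all_of(mycols)))) %>%
  # drop the row-wise grouping so later verbs behave normally
  ungroup()

result
```

Here mycols stays a plain character vector, so no regex or string pasting is needed.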