Question

Selecting columns with dplyr is easy with the various helper functions, such as contains(). In the help files for these functions, the argument is described as a "literal string". However, is it possible to use regular expressions instead?

The following example works:

library(dplyr)
iris %>%
  select(contains("Species"))

The following regular expression example does not:

# Select all column names that end with lower case "s"
iris %>%
  select(contains("s$"))
# Not run
data frame with 0 columns and 150 rows

I would like to know whether regular expressions can be used in the dplyr select helper functions and, if so, how to implement them.

If this is not possible, I will accept an answer that uses an alternative method (e.g. base or data.table). For background, my ultimate goal is to use summarise_at() or an equivalent function to sum all columns whose names end in a digit (i.e. columns selected by a regular expression).

Answer 1

The select helper function matches() can be used to match a regular expression:

library(dplyr)

out <- select(iris, matches("s$"))

head(out)
#>   Species
#> 1  setosa
#> 2  setosa
#> 3  setosa
#> 4  setosa
#> 5  setosa
#> 6  setosa
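
The same helper also covers the background goal from the question, summing every column whose name ends in a digit. The following is a minimal sketch only: the data frame df and its column names are hypothetical, made up for illustration, and the pattern [0-9]$ is one assumed way to express "ends in a digit".

```r
library(dplyr)

# Hypothetical data; the column names are assumptions for illustration only
df <- data.frame(
  id   = c("a", "b", "c"),
  x1   = 1:3,
  x2   = 4:6,
  note = c("p", "q", "r")
)

# Sum every column whose name ends in a digit
totals <- df %>%
  summarise_at(vars(matches("[0-9]$")), sum)

totals
#>   x1 x2
#> 1  6 15
```

With dplyr 1.0.0 or later, summarise(across(matches("[0-9]$"), sum)) should express the same selection.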

Answer 2

With dplyr, you can use ends_with():

iris %>%
  select(ends_with("s")) %>%
  head(3)
  Species
1  setosa
2  setosa
3  setosa
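
One caveat, since the question asks specifically for a lower case "s": ends_with() ignores case by default. A short sketch of the difference, using the ignore.case argument of the helper (the variable names loose and strict are only for illustration):

```r
library(dplyr)

# Default behaviour is case-insensitive, so an upper-case "S" still finds "Species"
loose <- names(select(iris, ends_with("S")))

# With ignore.case = FALSE the suffix must match exactly, so nothing is selected
strict <- names(select(iris, ends_with("S", ignore.case = FALSE)))

loose
#> [1] "Species"
strict
#> character(0)
```

matches() takes the same ignore.case argument, so a strict regular expression match can be requested in the same way.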

Using base R and grepl():

head(iris[grepl("s$",names(iris),ignore.case = FALSE)])
  Species
1  setosa
2  setosa
3  setosa
4  setosa
5  setosa
6  setosa

Or using purrr:

iris %>%
  purrr::keep(grepl("s$", names(.))) %>%
  head()
  Species
1  setosa
2  setosa
3  setosa
4  setosa
5  setosa
6  setosa

Answer 3

We can also use endsWith() from base R:

subset(iris, select = endsWith(names(iris), "s"))
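
For the summing goal mentioned in the question, a base R version might look like the sketch below. The data frame df and its column names are hypothetical, and grepl() with the assumed pattern [0-9]$ stands in for endsWith() because endsWith() only accepts a literal suffix, not a regular expression.

```r
# Hypothetical data; the column names are assumptions for illustration only
df <- data.frame(id = c("a", "b", "c"), x1 = 1:3, x2 = 4:6)

# Logical vector marking the columns whose names end in a digit
ends_in_digit <- grepl("[0-9]$", names(df))

# Column sums over the selected columns only
totals <- colSums(df[ends_in_digit])

totals
#> x1 x2 
#>  6 15 
```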

How to use regular expressions in dplyr's select helper functions
