Question

I am trying to filter specific rows of my tibble with dplyr::filter().

Here is part of my tibble, head(raw.tb):

A tibble: 738 x 4
      geno   ind     X     Y
     <chr> <chr> <int> <int>
 1 san1w16    A1   467   383
 2 san1w16    A1   465   378
 3 san1w16    A1   464   378
 4 san1w16    A1   464   377
 5 san1w16    A1   464   376
 6 san1w16    A1   464   375
 7 san1w16    A1   463   375
 8 san1w16    A1   463   374
 9 san1w16    A1   463   373
10 san1w16    A1   463   372
# ... with 728 more rows

When I run: raw.tb %>% dplyr::filter(ind == contains("A"))

I get: Error in filter_impl(.data, quo) : Evaluation error: No tidyselect variables were registered

The unique values of the column, unique(raw.tb$ind), are:

 [1] "A1"  "A10" "A11" "A12" "A2"  "A3"  "A4"  "A5"  "A6"  "A7"  "A8"  "A9"  "B1"
[14] "B10" "B11" "B12" "B2"  "B3"  "B4"  "B5"  "B6"  "B7"  "B8"  "B9"  "C1"  "C10"
[27] "C11" "C12" "C2"  "C3"  "C4"  "C5"  "C6"  "C7"  "C8"  "C9"  "D1"  "D10" "D11"
[40] "D12" "D2"  "D3"  "D4"  "D5"  "D6"  "D7"  "D8"  "D9"  "E1"  "E10" "E11" "E12"
[53] "E2"  "E3"  "E4"  "E5"  "E6"  "E7"  "E8"  "E9"  "F1"  "F10" "F11" "F12" "F2" 
[66] "F3"  "F4"  "F5"  "F6"  "F7"  "F8"  "F9"  "G1"  "G10" "G11" "G2"  "G3"  "G4" 
[79] "G5"  "G6"  "G7"  "G8"  "G9"  "H1"  "H10" "H11"

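I would like to extract only the rows where raw.tb$ind starts with "A", using tidyverse syntax.

(I know how to do this in base R, but my goal is to use the tidyverse.)

Any feedback is much appreciated.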

Answer 1

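filter expects a logical vector with which to filter rows. contains is a select helper (?select_helpers) that selects columns of a dataset based on a pattern, so it cannot be used inside filter. To filter rows, we can use grepl from base R: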

raw.tb %>%
   dplyr::filter(grepl("A", ind))

Or str_detect from stringr (one of the tidyverse packages):

raw.tb %>%
  dplyr::filter(stringr::str_detect(ind, "A"))
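The question asks for values that start with "A", while the patterns above match an "A" anywhere in the string. For the data shown the two give the same rows, but anchoring the pattern may be safer. The following is a minimal sketch, not a definitive solution: the tibble demo.tb and its values are made up for illustration, and it also shows what contains() is meant for (picking columns in select()).

```r
library(dplyr)
library(stringr)

# Hypothetical data: "BA1" contains an "A" but does not start with one
demo.tb <- tibble::tibble(
  ind = c("A1", "A10", "B1", "BA1"),
  X   = c(467L, 465L, 464L, 463L)
)

# contains() is a select helper: it picks columns whose names match
cols <- demo.tb %>% select(contains("in"))

# Unanchored pattern: keeps every row with an "A" anywhere, including "BA1"
anywhere <- demo.tb %>% filter(str_detect(ind, "A"))

# Pattern anchored with ^ : keeps only values that start with "A"
anchored <- demo.tb %>% filter(str_detect(ind, "^A"))

# Base R alternative for a character column
base_start <- demo.tb %>% filter(startsWith(ind, "A"))
```

Both anchored and base_start should contain only the "A1" and "A10" rows, whereas anywhere also keeps "BA1".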

Answer 2

Just writing up akrun's comment; @akrun, feel free to take over this answer.

Create some data,

# Output of dput(raw.tb)
raw.tb <- structure(list(geno = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L), .Label = "san1w16", class = "factor"), ind = structure(c(1L, 
1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 1L), .Label = c("A1", "B1", "C1", 
"D1", "E1"), class = "factor"), X = c(467L, 465L, 464L, 464L, 
464L, 464L, 463L, 463L, 463L, 463L), Y = c(383L, 378L, 378L, 
377L, 376L, 375L, 375L, 374L, 373L, 372L)), .Names = c("geno", 
"ind", "X", "Y"), row.names = c("1", "2", "3", "4", "5", "6", 
"7", "8", "9", "10"), class = c("tbl_df", "tbl", "data.frame"
))

The data,

raw.tb
#> # A tibble: 10 x 4
#>       geno    ind     X     Y
#>  *  <fctr> <fctr> <int> <int>
#>  1 san1w16     A1   467   383
#>  2 san1w16     A1   465   378
#>  3 san1w16     B1   464   378
#>  4 san1w16     B1   464   377
#>  5 san1w16     C1   464   376
#>  6 san1w16     C1   464   375
#>  7 san1w16     D1   463   375
#>  8 san1w16     D1   463   374
#>  9 san1w16     E1   463   373
#> 10 san1w16     A1   463   372

Method #1

raw.tb %>% dplyr::filter(stringr::str_detect(ind, "A"))
#> # A tibble: 3 x 4
#>      geno    ind     X     Y
#>    <fctr> <fctr> <int> <int>
#> 1 san1w16     A1   467   383
#> 2 san1w16     A1   465   378
#> 3 san1w16     A1   463   372

Method #2

raw.tb %>% dplyr::filter(grepl("A", ind))
#> # A tibble: 3 x 4
#>      geno    ind     X     Y
#>    <fctr> <fctr> <int> <int>
#> 1 san1w16     A1   467   383
#> 2 san1w16     A1   465   378
#> 3 san1w16     A1   463   372
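Note that ind is a factor in this sample data, not a character column as in the question. grepl() coerces its input to character, so Method #2 works unchanged, but base R's startsWith() only accepts character vectors. A hedged sketch of a prefix filter on a factor column follows; demo.tb is a hypothetical stand-in for the data above.

```r
library(dplyr)

# Hypothetical factor column, mirroring the structure of the sample data
demo.tb <- tibble::tibble(
  ind = factor(c("A1", "A1", "B1", "C1", "A1")),
  X   = c(467L, 465L, 464L, 464L, 463L)
)

# grepl() coerces the factor to character; ^ anchors the match at the start
via_grepl <- demo.tb %>% filter(grepl("^A", ind))

# startsWith() needs character input, so convert the factor explicitly
via_prefix <- demo.tb %>% filter(startsWith(as.character(ind), "A"))
```

Both calls should keep the same three "A1" rows.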

dplyr::filter: "No tidyselect variables were registered"
