My list of files is listOfCELfiles:
listOfCELfiles <- c(
"GSE20489/GSE20489_RAW//GSM514737.CEL.gz",
"GSE20489/GSE20489_RAW//GSM514738.CEL.gz",
"GSE20489/GSE20489_RAW//GSM514739.CEL.gz",
"GSE20489/GSE20489_RAW//GSM514740.CEL.gz",
"GSE20489/GSE20489_RAW//GSM514741.CEL.gz",
"GSE20489/GSE20489_RAW//GSM514742.CEL.gz",
"GSE20489/GSE20489_RAW//GSM514743.CEL.gz",
"GSE20489/GSE20489_RAW//GSM514744.CEL.gz",
"GSE20489/GSE20489_RAW//GSM514745.CEL.gz"
)
The data frame is timepoint_table:
timepoint_table <- tibble(
  SampleID = c("GSM514737", "GSM514738", "GSM514739", "GSM514740", "GSM514741",
               "GSM514742", "GSM514743", "GSM514744", "GSM514745"),
  SampleName = c("Blood_alcohol_T1_S13", "Blood_alcohol_T2_S13", "Blood_OJalcohol_T3_S13",
                 "Blood_alcohol_T4_S13", "Blood_OJalcohol_T5_S13", "Blood_alcohol_T1_S15",
                 "Blood_alcohol_T2_S15", "Blood_OJalcohol_T3_S15", "Blood_OJalcohol_T4_S15")
)
So timepoint_table looks like this:
# A tibble: 9 x 2
  SampleID  SampleName
  <chr>     <chr>
1 GSM514737 Blood_alcohol_T1_S13
2 GSM514738 Blood_alcohol_T2_S13
3 GSM514739 Blood_OJalcohol_T3_S13
4 GSM514740 Blood_alcohol_T4_S13
5 GSM514741 Blood_OJalcohol_T5_S13
6 GSM514742 Blood_alcohol_T1_S15
7 GSM514743 Blood_alcohol_T2_S15
8 GSM514744 Blood_OJalcohol_T3_S15
9 GSM514745 Blood_OJalcohol_T4_S15
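SampleID is part of the file names in listOfCELfiles. Now I want to remove every sample except the Blood_alcohol ones from listOfCELfiles by matching against the SampleID column of timepoint_table. The following code selects the matching SampleID values from timepoint_table:
timepoint_table %>%
  filter(str_detect(SampleName, "^Blood_alcohol")) %>%
  select(SampleID)
But I can't work out how to filter listOfCELfiles with the matched SampleID values (using grepl or str_detect).
My expected output would be a list containing only the files for the Blood_alcohol samples.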
Answer 0 (score: 0)
You can get the matching IDs and then use grep to keep the file names that contain any of them:
library(tidyverse)
ids <- timepoint_table %>%
filter(str_detect(SampleName, "^Blood_alcohol")) %>%
pull(SampleID)
grep(paste0(ids, collapse = "|"), listOfCELfiles, value = TRUE)
#[1] "GSE20489/GSE20489_RAW//GSM514737.CEL.gz" "GSE20489/GSE20489_RAW//GSM514738.CEL.gz"
#[3] "GSE20489/GSE20489_RAW//GSM514740.CEL.gz" "GSE20489/GSE20489_RAW//GSM514742.CEL.gz"
#[5] "GSE20489/GSE20489_RAW//GSM514743.CEL.gz"
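
The grep call above matches each ID as a substring anywhere in the path. A possible alternative, sketched here in base R only, is to derive the sample ID from each file name and compare it exactly with %in%. This assumes every file name ends in ".CEL.gz" and that the rest of the base name is the sample ID, which holds for the files in the question. The names file_ids and selected_files are illustrative, and a plain data.frame stands in for the tibble so that no packages are needed.

```r
# Data copied from the question.
listOfCELfiles <- c(
  "GSE20489/GSE20489_RAW//GSM514737.CEL.gz",
  "GSE20489/GSE20489_RAW//GSM514738.CEL.gz",
  "GSE20489/GSE20489_RAW//GSM514739.CEL.gz",
  "GSE20489/GSE20489_RAW//GSM514740.CEL.gz",
  "GSE20489/GSE20489_RAW//GSM514741.CEL.gz",
  "GSE20489/GSE20489_RAW//GSM514742.CEL.gz",
  "GSE20489/GSE20489_RAW//GSM514743.CEL.gz",
  "GSE20489/GSE20489_RAW//GSM514744.CEL.gz",
  "GSE20489/GSE20489_RAW//GSM514745.CEL.gz"
)

# A base-R data.frame used here in place of the tibble (assumption for portability).
timepoint_table <- data.frame(
  SampleID = c("GSM514737", "GSM514738", "GSM514739", "GSM514740", "GSM514741",
               "GSM514742", "GSM514743", "GSM514744", "GSM514745"),
  SampleName = c("Blood_alcohol_T1_S13", "Blood_alcohol_T2_S13", "Blood_OJalcohol_T3_S13",
                 "Blood_alcohol_T4_S13", "Blood_OJalcohol_T5_S13", "Blood_alcohol_T1_S15",
                 "Blood_alcohol_T2_S15", "Blood_OJalcohol_T3_S15", "Blood_OJalcohol_T4_S15"),
  stringsAsFactors = FALSE
)

# IDs of the samples whose name starts with "Blood_alcohol".
ids <- timepoint_table$SampleID[startsWith(timepoint_table$SampleName, "Blood_alcohol")]

# Sample ID of each file: drop the directory part, then the ".CEL.gz" suffix.
file_ids <- sub("\\.CEL\\.gz$", "", basename(listOfCELfiles))

# Keep only the files whose ID is among the selected IDs (exact comparison).
selected_files <- listOfCELfiles[file_ids %in% ids]
print(selected_files)
```

Because the comparison is exact, an ID that happens to be a prefix of another ID, or that contains regex metacharacters, would not produce accidental matches.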