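I want to create new column names for my data frame MirAligner, each containing the part of the original column name before the first _. This is what I tried: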
unlist(strsplit(as.character(colnames(MirAligner)),'_',fixed=TRUE))
Column names:
head(colnames(MirAligner))
[1] "na-008_S52_L003_R1_001.mir.fa.gz" "na-014_S99_L005_R1_001.mir.fa.gz" "na015_S114_L005_R1_001.mir.fa.gz"
[4] "na-015_S50_L003_R1_001.mir.fa.gz" "na-018_S147_L007_R1_001.mir.fa.gz" "na020_S162_L007_R1_001.mir.fa.gz"
Expected output:
na-008 na-014 na015
Answer 0 (score: 5)
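We can use sub. The pattern matches the first _ and everything after it, and replaces that with an empty string: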
# Sample data (defined first so the call below runs as written)
str1 <- c("na-014_S99_L005_R1_001.mir.fa.gz",
"na015_S114_L005_R1_001.mir.fa.gz",
"na-015_S50_L003_R1_001.mir.fa.gz",
"na-018_S147_L007_R1_001.mir.fa.gz",
"na020_S162_L007_R1_001.mir.fa.gz")
sub('_.*', '', str1)
#[1] "na-014" "na015" "na-015" "na-018" "na020"
Answer 1 (score: 3)
# try5 is not defined in this answer; presumably it holds the original column names
gsub("^(.*?)_.*", "\\1", try5)
#[1] "na-008" "na-014" "na015"
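To tie the regex approach back to the question, here is a minimal sketch of assigning the shortened names to a data frame. The data frame `df` and its values are hypothetical stand-ins for MirAligner, which is not shown in the question:

```r
# Hypothetical stand-in for MirAligner, with column names like those in the question
df <- data.frame(a = 1:2, b = 3:4)
colnames(df) <- c("na-008_S52_L003_R1_001.mir.fa.gz",
                  "na-014_S99_L005_R1_001.mir.fa.gz")

# Remove the first underscore and everything after it, then assign back
colnames(df) <- sub("_.*", "", colnames(df))
colnames(df)
#[1] "na-008" "na-014"
```

If two original names share the same prefix, the result would contain duplicated column names, so it may be worth checking with anyDuplicated(colnames(df)) afterwards.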
Answer 2 (score: 3)
Using strsplit inside sapply:
#myColNames <- colnames(MirAligner)
myColNames <- c("na-008_S52_L003_R1_001.mir.fa.gz", "na-014_S99_L005_R1_001.mir.fa.gz")
sapply(strsplit(myColNames, "_", fixed = TRUE), "[[", 1)
#output
# [1] "na-008" "na-014"
Or using read.table:
read.table(text = myColNames, sep = "_", stringsAsFactors = FALSE)[, "V1"]
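For context on why the attempt in the question did not give the expected result, the sketch below contrasts it with the strsplit approach above. strsplit returns a list with one character vector per name, and unlist flattens all the pieces into a single vector instead of keeping the first piece of each. The variable names `pieces` and `first` are illustrative only:

```r
myColNames <- c("na-008_S52_L003_R1_001.mir.fa.gz",
                "na-014_S99_L005_R1_001.mir.fa.gz")

# One character vector of pieces per column name
pieces <- strsplit(myColNames, "_", fixed = TRUE)

# unlist() flattens every piece of every name into one vector
length(unlist(pieces))
#[1] 10

# Keep only the first piece of each name; vapply checks the result type
first <- vapply(pieces, function(p) p[1], character(1))
first
#[1] "na-008" "na-014"
```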