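I have a data frame whose column names are stored in mycols. As you can see, mycols contains sets of columns starting with SQN, WES and WGS (^SQN, ^WES and ^WGS). I want to create triplets of the SQN, WES and WGS columns, in that order. I do not want to include the columns ending in (Pile:up) or :AD, that is, those matching (Pile:up)$ and :AD$. Within each set in mycols, the SQN, WES and WGS columns share the same extension. In other words, I want to group together the SQN, WES and WGS columns that have the same extension. I also have a function called myfunc, and I want to apply it to each triplet formed this way.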
mycols <- c("SQN:IDH2:G515T:R172M", "WES:IDH2:G515T:R172M", "WES:IDH2:G515T:R172M:AD:(Pile:up)", "WGS:IDH2:G515T:R172M",
            "SQN:JAK1:A1432T:T478S", "WES:JAK1:A1432T:T478S", "WES:JAK1:A1432T:T478S:AD:(pile:up)", "WGS:JAK1:A1432T:T478S",
            "SQN:JAK1:T1868C:V623A", "WES:JAK1:T1868C:V623A", "WES:JAK1:T1868C:V623A:AD", "WES:JAK1:T1868C:V623A:AD:(Pile:up)", "WGS:JAK1:T1868C:V623A")
Desired result:
triplet1
"SQN:IDH2:G515T:R172M", "WES:IDH2:G515T:R172M", "WGS:IDH2:G515T:R172M"
triplet2
"SQN:JAK1:A1432T:T478S", "WES:JAK1:A1432T:T478S", "WGS:JAK1:A1432T:T478S"
triplet3
"SQN:JAK1:T1868C:V623A","WES:JAK1:T1868C:V623A","WGS:JAK1:T1868C:V623A"
That way I can simply call my function on triplet1, triplet2, triplet3, ...
Answer 0 (score: 1)
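We can use grepl to get a logical index of the strings that do not contain 'Pile:up' (in either case) and do not end in 'AD'. Subset 'mycols' with 'i1' to get 'mycols1'. Then create a grouping variable by using sub to remove the prefix, that is, everything up to and including the first ':', and use it to split 'mycols1'.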
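# TRUE for names that neither contain "pile" (case-insensitive) nor end in "AD"
i1 <- !grepl('(?i)(P)ile|AD$', mycols)
mycols1 <- mycols[i1]
# Drop the prefix up to and including the first ':' and group by what remains
split(mycols1, sub('[^:]+:', '', mycols1))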
#$`IDH2:G515T:R172M`
#[1] "SQN:IDH2:G515T:R172M" "WES:IDH2:G515T:R172M" "WGS:IDH2:G515T:R172M"
#$`JAK1:A1432T:T478S`
#[1] "SQN:JAK1:A1432T:T478S" "WES:JAK1:A1432T:T478S" "WGS:JAK1:A1432T:T478S"
#$`JAK1:T1868C:V623A`
#[1] "SQN:JAK1:T1868C:V623A" "WES:JAK1:T1868C:V623A" "WGS:JAK1:T1868C:V623A"
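The question also asks how to apply myfunc to each triplet, which the split call above does not show. A minimal sketch of that step follows. The real myfunc is not given in the question, so the myfunc defined here is a hypothetical stand-in, and the names keep, triplets and results are illustrative. The filter is written with ignore.case = TRUE, which is assumed to be equivalent to the pattern used above for these column names.

```r
mycols <- c("SQN:IDH2:G515T:R172M", "WES:IDH2:G515T:R172M",
            "WES:IDH2:G515T:R172M:AD:(Pile:up)", "WGS:IDH2:G515T:R172M",
            "SQN:JAK1:A1432T:T478S", "WES:JAK1:A1432T:T478S",
            "WES:JAK1:A1432T:T478S:AD:(pile:up)", "WGS:JAK1:A1432T:T478S",
            "SQN:JAK1:T1868C:V623A", "WES:JAK1:T1868C:V623A",
            "WES:JAK1:T1868C:V623A:AD", "WES:JAK1:T1868C:V623A:AD:(Pile:up)",
            "WGS:JAK1:T1868C:V623A")

# Keep names that neither contain "pile:up" (any case) nor end in ":AD"
keep <- !grepl("pile:up|:AD$", mycols, ignore.case = TRUE)
mycols1 <- mycols[keep]

# Group by the shared extension (everything after the first ':')
triplets <- split(mycols1, sub("^[^:]+:", "", mycols1))

# Hypothetical placeholder for the asker's myfunc: it only joins the
# prefixes of the names it receives, to show what each call is given
myfunc <- function(cols) paste(sub(":.*", "", cols), collapse = "/")

# Apply the function to every triplet; the result is a named list,
# one element per extension
results <- lapply(triplets, myfunc)
```

If myfunc expects the columns themselves rather than their names, and assuming the data frame is called df (a hypothetical name) and was read with check.names = FALSE so the colons survive, the call could be written as lapply(triplets, function(cols) myfunc(df[, cols])).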