How to make triplets of columns and apply a function to them

Date: 2016-02-06 05:09:41

Tags: r

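I have a data frame whose column names are in mycols. As you can see, mycols contains sets of columns starting with SQN, WES and WGS. I want to create triplets of SQN, WES and WGS columns, in that order, without the columns that end in (Pile:up) or :AD. Within each set, the SQN, WES and WGS names share the same suffix, so in other words I want to group the SQN, WES and WGS columns that have the same suffix. I also have a function called myfunc, and I want to apply it to each triplet formed this way.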

mycols <- c("SQN:IDH2:G515T:R172M", "WES:IDH2:G515T:R172M",
            "WES:IDH2:G515T:R172M:AD:(Pile:up)", "WGS:IDH2:G515T:R172M",
            "SQN:JAK1:A1432T:T478S", "WES:JAK1:A1432T:T478S",
            "WES:JAK1:A1432T:T478S:AD:(pile:up)", "WGS:JAK1:A1432T:T478S",
            "SQN:JAK1:T1868C:V623A", "WES:JAK1:T1868C:V623A",
            "WES:JAK1:T1868C:V623A:AD", "WES:JAK1:T1868C:V623A:AD:(Pile:up)",
            "WGS:JAK1:T1868C:V623A")

Desired result:

triplet1
"SQN:IDH2:G515T:R172M", "WES:IDH2:G515T:R172M", "WGS:IDH2:G515T:R172M"
triplet2
"SQN:JAK1:A1432T:T478S", "WES:JAK1:A1432T:T478S", "WGS:JAK1:A1432T:T478S"
triplet3
"SQN:JAK1:T1868C:V623A", "WES:JAK1:T1868C:V623A", "WGS:JAK1:T1868C:V623A"

That way I can simply call my function on triplet1, triplet2, triplet3, ...

1 answer:

Answer 0 (score: 1)

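We can use grepl to get a logical index of the strings that contain neither 'Pile'/'pile' (as in '(Pile:up)') nor a trailing 'AD'. Subset 'mycols' with 'i1'. Then create a grouping variable by using sub to remove the prefix (the leading characters up to and including the first ':'), and split 'mycols1' by that variable.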

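# TRUE for names that contain neither 'pile' (any case) nor a trailing 'AD'
i1 <- !grepl('(?i)(P)ile|AD$', mycols)
mycols1 <- mycols[i1]
# drop the leading 'SQN:' / 'WES:' / 'WGS:' prefix and group by what remains
split(mycols1, sub('[^:]+:', '', mycols1))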
#$`IDH2:G515T:R172M`
#[1] "SQN:IDH2:G515T:R172M" "WES:IDH2:G515T:R172M" "WGS:IDH2:G515T:R172M"

#$`JAK1:A1432T:T478S`
#[1] "SQN:JAK1:A1432T:T478S" "WES:JAK1:A1432T:T478S" "WGS:JAK1:A1432T:T478S"

#$`JAK1:T1868C:V623A`
#[1] "SQN:JAK1:T1868C:V623A" "WES:JAK1:T1868C:V623A" "WGS:JAK1:T1868C:V623A"
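
The question also asks how to apply myfunc to each triplet, which the code above stops short of. A minimal sketch of one way to do it follows, assuming the names in mycols are the column names of a data frame. The data frame `df`, its values, and the body of `myfunc` are hypothetical placeholders, since the question shows neither the data nor the function.

```r
mycols <- c("SQN:IDH2:G515T:R172M", "WES:IDH2:G515T:R172M",
            "WES:IDH2:G515T:R172M:AD:(Pile:up)", "WGS:IDH2:G515T:R172M",
            "SQN:JAK1:A1432T:T478S", "WES:JAK1:A1432T:T478S",
            "WES:JAK1:A1432T:T478S:AD:(pile:up)", "WGS:JAK1:A1432T:T478S",
            "SQN:JAK1:T1868C:V623A", "WES:JAK1:T1868C:V623A",
            "WES:JAK1:T1868C:V623A:AD", "WES:JAK1:T1868C:V623A:AD:(Pile:up)",
            "WGS:JAK1:T1868C:V623A")

# Grouping step taken from the answer above
i1 <- !grepl('(?i)(P)ile|AD$', mycols)
mycols1 <- mycols[i1]
triplets <- split(mycols1, sub('[^:]+:', '', mycols1))

# Hypothetical data frame: 4 rows of placeholder values, one column per name
df <- as.data.frame(matrix(seq_len(4 * length(mycols)), nrow = 4))
names(df) <- mycols

# Hypothetical stand-in for the asker's myfunc; it receives a 3-column data frame
myfunc <- function(d) rowMeans(d)

# Apply myfunc to the columns of each triplet; the result is a named list
res <- lapply(triplets, function(cols) myfunc(df[, cols, drop = FALSE]))
```

Each element of `res` is named after the shared suffix, so an individual result can be looked up with, for example, `res[["IDH2:G515T:R172M"]]`. If myfunc takes three separate vectors instead of a data frame, `do.call(myfunc, unname(df[cols]))` inside the `lapply` would be one alternative.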