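I have a column of colon-separated field names, one of which is AD, but its position differs from row to row. How can I select the "AD" value from X2?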
X1 X2
GT:GQ:GQX:DPI:AD:DP 0/1:909:12:125:93,26:119
GT:GQ:GQX:DPI:AD 0/1:909:12:125:35,24
GT:GQ:GQX:DP:DPF:AD 0/1:57:3:11:130:8,3
GT:AD:DP:GQ:PL 0/1:211,31:242:99:138,0,7251
Output
AD
93,26
35,24
8,3
211,31
Answer 0: (score: 1)
Split both columns at ":" with strsplit, use grep to find the position of "AD" in X1, and use mapply to pick the element at that position from X2.
# "^AD$" matches the AD field exactly, not fields that merely contain "AD" (e.g. ADF)
mapply(`[`, strsplit(d$X2, ":"), sapply(strsplit(d$X1, ":"), grep, pattern = "^AD$"))
# [1] "93,26" "35,24" "8,3" "211,31"
Data:
d <- structure(list(X1 = c("GT:GQ:GQX:DPI:AD:DP", "GT:GQ:GQX:DPI:AD",
"GT:GQ:GQX:DP:DPF:AD", "GT:AD:DP:GQ:PL"), X2 = c("0/1:909:12:125:93,26:119",
"0/1:909:12:125:35,24", "0/1:57:3:11:130:8,3", "0/1:211,31:242:99:138,0,7251"
)), class = "data.frame", row.names = c(NA, -4L))
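The same split-and-index idea can be wrapped in a small helper. This is only a sketch and not part of the original answer: the name extract_field is hypothetical, and it uses match() instead of grep() so that only a field named exactly "AD" is selected and rows lacking the field give NA.

```r
# Hypothetical helper: return the value of `field` from each sample string,
# using the per-row field order given in `format`.
extract_field <- function(format, sample, field = "AD") {
  keys <- strsplit(as.character(format), ":", fixed = TRUE)
  vals <- strsplit(as.character(sample), ":", fixed = TRUE)
  # match() gives the position of the first exact match, or NA if absent
  unname(mapply(function(k, v) v[match(field, k)], keys, vals))
}

d <- data.frame(
  X1 = c("GT:GQ:GQX:DPI:AD:DP", "GT:GQ:GQX:DPI:AD",
         "GT:GQ:GQX:DP:DPF:AD", "GT:AD:DP:GQ:PL"),
  X2 = c("0/1:909:12:125:93,26:119", "0/1:909:12:125:35,24",
         "0/1:57:3:11:130:8,3", "0/1:211,31:242:99:138,0,7251"),
  stringsAsFactors = FALSE
)

extract_field(d$X1, d$X2, "AD")
# [1] "93,26"  "35,24"  "8,3"    "211,31"
extract_field(d$X1, d$X2, "DP")
# [1] "119" NA    "11"  "242"
```

The second call shows the behaviour for a field that is missing in some rows: row 2 has no DP entry, so its result is NA.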
Answer 1: (score: 1)
Using base R, split both strings and extract the "AD" element.
mapply(
function(x, i) x[i],
strsplit(df$X2, ":"),
lapply(strsplit(df$X1, ":"), function(x) which(x == "AD"))
)
# [1] "93,26" "35,24" "8,3" "211,31"
Reproducible data
df <- data.frame(
  X1 = c("GT:GQ:GQX:DPI:AD:DP", "GT:GQ:GQX:DPI:AD", "GT:GQ:GQX:DP:DPF:AD", "GT:AD:DP:GQ:PL"),
  X2 = c("0/1:909:12:125:93,26:119", "0/1:909:12:125:35,24", "0/1:57:3:11:130:8,3", "0/1:211,31:242:99:138,0,7251"),
  stringsAsFactors = FALSE  # keep columns as character (the default before R 4.0 was factor)
)
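Since the question asks for an AD column, the extracted values can also be stored back in the data frame. The following is a sketch under that assumption; it uses vapply with character(1) so that the result is guaranteed to be one string per row, which is a design choice of this example rather than something the answer prescribes.

```r
df <- data.frame(
  X1 = c("GT:GQ:GQX:DPI:AD:DP", "GT:GQ:GQX:DPI:AD",
         "GT:GQ:GQX:DP:DPF:AD", "GT:AD:DP:GQ:PL"),
  X2 = c("0/1:909:12:125:93,26:119", "0/1:909:12:125:35,24",
         "0/1:57:3:11:130:8,3", "0/1:211,31:242:99:138,0,7251"),
  stringsAsFactors = FALSE
)

keys <- strsplit(df$X1, ":", fixed = TRUE)  # field names per row
vals <- strsplit(df$X2, ":", fixed = TRUE)  # field values per row

# For each row, take the value at the position where the key equals "AD"
df$AD <- vapply(
  seq_along(keys),
  function(i) vals[[i]][match("AD", keys[[i]])],
  character(1)
)

df$AD
# [1] "93,26"  "35,24"  "8,3"    "211,31"
```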
Answer 2: (score: 1)
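With base R, you could also try regmatches + regexpr. This returns the first "digits,digits" match in each X2 string and does not look at X1.
unlist(regmatches(df$X2, regexpr("\\d+,\\d+", df$X2)))
# [1] "93,26" "35,24" "8,3" "211,31"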
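One caveat with the regex approach is that it depends on AD being the first comma-separated field in X2. The sketch below uses a made-up row (the field order "GT:PL:AD" is hypothetical and does not appear in the question) to show how the result changes when another comma-separated field comes first, and how a position-based lookup behaves on the same input.

```r
# Hypothetical row in which PL comes before AD
x1 <- "GT:PL:AD"
x2 <- "0/1:138,0,7251:8,3"

# The regex returns the first "digits,digits" run, which here belongs to PL
by_regex <- regmatches(x2, regexpr("\\d+,\\d+", x2))
by_regex
# [1] "138,0"

# Looking up the position of "AD" in X1 selects the intended field
keys <- strsplit(x1, ":", fixed = TRUE)[[1]]
vals <- strsplit(x2, ":", fixed = TRUE)[[1]]
by_position <- vals[match("AD", keys)]
by_position
# [1] "8,3"
```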