How do I split a string into pieces 3 bases long?

Time: 2018-05-18 08:35:18

Tags: r

I have some DNA sequence data in a FASTA file. How do I split the string into pieces 3 bases long? I have code like this

codons <- strsplit(k, "(?<=.{3})", perl=T)

but it results in

Error in .local(pattern, subject, max.mismatch, min.mismatch, with.indels,  : unused argument (perl = TRUE)

How should I adjust it? Thanks.

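1 answer:

Answer 0: (score: 3)

You're almost there. strsplit can't use the lookaround here, but you can use gsub to insert a specific character (e.g. "_") after each codon and then split on that character: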

strsplit(gsub("(.{3})", "\\1_", k, perl=TRUE), "_")
#[[1]]
# [1] "GTA" "ATT" "TTG" "GTT" "TCA" "ATT" "TCA" "ATT" "TCC" "CGA" "CCA" "CTT" "CTC" "AAT" "ATT" "CCA" "ACA" "GAT" "TTC" "ATC" "CAT" "TGC" "CAG"
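
If you'd rather avoid regular expressions altogether, the same split can be sketched with base R's substring. This is only a minimal sketch under some assumptions: split_codons is a hypothetical helper name, the short sequence below is made up for illustration, and k is assumed to be (or to coerce cleanly via as.character to) a single character string.

```r
# Hypothetical helper, not part of the original answer.
# Cuts a single string into consecutive chunks of `width` characters.
split_codons <- function(k, width = 3L) {
  k <- as.character(k)   # assumption: k coerces to one plain character string
  n <- nchar(k)
  if (n == 0L) return(character(0))
  # Start position of every chunk: 1, 4, 7, ...
  starts <- seq(1L, n, by = width)
  # End positions, capped at the end of the string
  ends <- pmin(starts + width - 1L, n)
  substring(k, starts, ends)
}

k <- "GTAATTTTGGTT"   # made-up example sequence
codons <- split_codons(k)
print(codons)
```

Like the gsub approach, this keeps a trailing incomplete codon (1 or 2 bases) as the last element if the length isn't a multiple of 3, so drop it afterwards if you only want full codons.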