Insert a blank line in a FASTA file

Asked: 2016-02-07 01:13:49

Tags: r fasta

I want to read a FASTA file containing several DNA sequences, transcribe them to RNA, and write the result in FASTA format. Here is the code:

library("Biostrings")
dna <- readDNAStringSet("D:/R/smpl.fasta")
rna <- RNAStringSet(complement(dna))
rna

I get:

  A RNAStringSet instance of length 2
    width seq                           names               
[1]   742 GGCGGGGAGACGG...CAAAAGUCUUUAU gi|568815581:4168...
[2]   910 CACGGUCGACACA...UCUCGAGUUUCUC gi|568815581:4168...

I write it to a file with:

write.fasta(sequences = as.list(rna), names = names(rna), nbchar = 60, file.out = "D:/R/rna.fasta")

Result (rna.fasta):

>gi|568815581:41688875-41691646 Homo sapiens chromosome 17, GRCh38.p2 Primary Assembly
GGCGGGGAGACGGGGUC...
>gi|568815581:41688875-41691646 Homo sapiens chromosome 18, GRCh38.p2 Primary Assembly
CACGGUCGACACAACAU...

How can I add a blank line after the end of each sequence? The result should look like this:

>gi|568815581:41688875-41691646 Homo sapiens chromosome 17, GRCh38.p2 Primary Assembly
GGCGGGGAGACGGGGUC...

>gi|568815581:41688875-41691646 Homo sapiens chromosome 18, GRCh38.p2 Primary Assembly
CACGGUCGACACAACAU...

1 answer:

Answer 0 (score: 0)

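I believe this does what you want, except that each "blank" record consists of a line containing only > followed by an empty line. If you use write.fasta, I don't think you can avoid that.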

# Example data from documentation
library("Biostrings")
filepath <- system.file("extdata", "someORF.fa", package="Biostrings")
dna <- readDNAStringSet(filepath)
rna <- RNAStringSet(complement(dna))

# Create an empty list to hold the results
rna_out <- vector("list", length(rna) * 2)
use <- seq_along(rna_out) %% 2 # 1 at odd positions, which receive the data
cnt <- 1 # separate counter into rna

# Fill the list: sequences at odd positions, empty entries at even positions
for (i in seq_along(rna_out)) {
    if (as.logical(use[i])) {
        rna_out[i] <- as.character(rna[cnt])
        names(rna_out)[i] <- names(rna)[cnt]
        cnt <- cnt + 1
    } else {
        rna_out[i] <- ""
        names(rna_out)[i] <- ""
    }
}

library("seqinr") # need the write.fasta from this pkg
write.fasta(rna_out, names = names(rna_out), file.out = "rna.txt")

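If you don't want that, I have a version that writes the same information as plain text, but without the extra >. I guess it depends on why you want these blank lines.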
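A plain-text version along those lines could look roughly like the sketch below. It is not the answerer's actual code. write_fasta_blank is a hypothetical helper (it is not part of Biostrings or seqinr) that uses only base R, and the short sequences and the names seq1 and seq2 are made-up demo data. It assumes the sequences arrive as a character vector, which is what as.character(rna) should give.

```r
# Hypothetical helper (not from Biostrings or seqinr): writes each record as
# a ">" header line, the sequence wrapped to `width` characters per line,
# and then one blank line.
write_fasta_blank <- function(seqs, headers, file, width = 60) {
  stopifnot(length(seqs) == length(headers))
  con <- file(file, open = "w")
  on.exit(close(con))
  for (i in seq_along(seqs)) {
    s <- seqs[[i]]
    # Start position of every line of the wrapped sequence
    starts <- seq(1, max(nchar(s), 1), by = width)
    chunks <- substring(s, starts, starts + width - 1)
    # Header, wrapped sequence lines, then the blank separator line
    writeLines(c(paste0(">", headers[[i]]), chunks, ""), con)
  }
  invisible(file)
}

# Demo with made-up data; with the objects from the question the call would
# be write_fasta_blank(as.character(rna), names(rna), "D:/R/rna.fasta")
seqs <- c("GGCGGGGAGACGGGGUC", "CACGGUCGACACAACAU")
headers <- c("seq1", "seq2")
out <- tempfile(fileext = ".fasta")
write_fasta_blank(seqs, headers, out, width = 10)
cat(readLines(out), sep = "\n")
```

Some FASTA parsers are strict about blank lines between records, so it is worth checking that whatever reads the file accepts them.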