I want to read a FASTA file containing multiple DNA sequences, transcribe them to RNA, and write the result in FASTA format. Here is my code:
library("Biostrings")
dna <- readDNAStringSet("D:/R/smpl.fasta")
rna <- RNAStringSet(complement(dna))
rna
This gives me:
A RNAStringSet instance of length 2
width seq names
[1] 742 GGCGGGGAGACGG...CAAAAGUCUUUAU gi|568815581:4168...
[2] 910 CACGGUCGACACA...UCUCGAGUUUCUC gi|568815581:4168...
I write it to a file with:
write.fasta(sequences = as.list(rna), names = names(rna), nbchar = 60, file.out = "D:/R/rna.fasta")
The result (rna.fasta):
>gi|568815581:41688875-41691646 Homo sapiens chromosome 17, GRCh38.p2 Primary Assembly
GGCGGGGAGACGGGGUC...
>gi|568815581:41688875-41691646 Homo sapiens chromosome 18, GRCh38.p2 Primary Assembly
CACGGUCGACACAACAU...
How can I add a blank line after the end of each sequence? The result should look like this:
>gi|568815581:41688875-41691646 Homo sapiens chromosome 17, GRCh38.p2 Primary Assembly
GGCGGGGAGACGGGGUC...

>gi|568815581:41688875-41691646 Homo sapiens chromosome 18, GRCh38.p2 Primary Assembly
CACGGUCGACACAACAU...
Answer 0 (score: 0)
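I believe this does what you want, except that each "blank" record consists of a line containing only > followed by an empty line. If you use write.fasta, I don't think you can avoid that.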
# Example data from the Biostrings documentation
library("Biostrings")
filepath <- system.file("extdata", "someORF.fa", package = "Biostrings")
dna <- readDNAStringSet(filepath)
rna <- RNAStringSet(complement(dna))
# Create an empty list to hold the results:
# one slot per sequence plus one blank slot after each
rna_out <- vector("list", length(rna) * 2)
use <- seq_along(rna_out) %% 2 # 1 for the odd slots that receive data, 0 for the blank slots
cnt <- 1 # separate counter into rna
# Fill the list
for (i in seq_along(rna_out)) {
  if (as.logical(use[i])) {
    # Odd slot: copy the next sequence and its name
    rna_out[[i]] <- as.character(rna[[cnt]])
    names(rna_out)[i] <- names(rna)[cnt]
    cnt <- cnt + 1
  } else {
    # Even slot: empty record that acts as the separator
    rna_out[[i]] <- ""
    names(rna_out)[i] <- ""
  }
}
library("seqinr") # need the write.fasta from this pkg
write.fasta(rna_out, names = names(rna_out), file.out = "rna.txt")
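If you don't want that, I have a version that writes the same information as plain text, but without the extra >. I guess it depends on why you want these blank lines.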
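For reference, here is a minimal sketch of what a plain-text approach could look like. It is not necessarily the version mentioned above. It uses only base R (writeLines), and the helper name write_fasta_blank, its arguments, and the toy sequences are all made up for illustration. The 60-character line width from the question is passed as the width argument.

```r
# Hypothetical helper (not part of Biostrings or seqinr): writes FASTA records
# with an empty line after each one, using only base R.
write_fasta_blank <- function(seqs, headers, path, width = 60) {
  stopifnot(length(seqs) == length(headers))
  out <- character(0)
  for (i in seq_along(seqs)) {
    s <- seqs[[i]]
    # Split the sequence into chunks of at most `width` characters
    starts <- seq(1, max(nchar(s), 1), by = width)
    chunks <- substring(s, starts, pmin(starts + width - 1, nchar(s)))
    # Header line, sequence lines, then the empty separator line
    out <- c(out, paste0(">", headers[[i]]), chunks, "")
  }
  writeLines(out, path)
  invisible(out)
}

# Toy data, made up for illustration only
seqs <- c(seq1 = "GGCGGGGAGACGG", seq2 = "CACGGUCGACACA")
tmp <- tempfile(fileext = ".fasta")
res <- write_fasta_blank(seqs, names(seqs), tmp, width = 5)
```

With the rna object from the question, the call would presumably be write_fasta_blank(as.character(rna), names(rna), "D:/R/rna.fasta"). Keep in mind that some FASTA parsers may not expect empty lines between records, so it is worth checking whatever tool reads the file afterwards.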