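I want to plot the composition of a list of DNA sequences stored in FASTA format. I have the following code, but it produces an error and I can't figure out what the problem is.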
library(ggplot2)
library(reshape2)
library(gridExtra)
library(seqinr)
#read sequence from file
seqs <- read.fasta("seqs.fasta")
total_seqs <- length(seqs)
proportion <- function(x,t=sum(x)){ x/t}
normalize <- function(x,m=mean(x,...),s=sd(x,...),...){(x-m)/s}
mylist <- list()
for (i in 1:total_seqs){
  mylist[[i]] <- count(seqs[[i]],1)
}
composition.dta <- data.frame(do.call("rbind",mylist))
#get the proportions
composition.dta.prop <- t(data.frame(apply(composition.dta,1,proportion)))
#melt this dataframe
composition.dta.prop.m <- melt(composition.dta.prop,value.name="proportion")
#do the plot
composition.plot <- ggplot(composition.dta.prop.m,aes(Var2,proportion)) +
  geom_point() +
  geom_boxplot() +
  theme_classic()
grid.arrange(composition.plot)
This results in the following errors:
## Error in `rownames<-`(`*tmp*`, value = c("1.5", "1.9", "1.33", "1.21", : length of 'dimnames' [1] not equal to array extent
## Error in eval(expr, envir, enclos): object 'seqs' not found
## Error in eval(expr, envir, enclos): object 'total_seqs' not found
Any idea what could be causing this?
Thanks.
A sample DNA dataset is available here: http://pastebin.com/vdygaqPv