How to create a newCountDataSet with DESeq

Asked: 2013-07-24 21:32:04

Tags: r data-structures bioinformatics

I have a table; here is the beginning of it:

TargetID            SM_H1462  SM_H1463  SM_K1566  SM_X1567  SM_V1568  SM_K1534  SM_K1570  SM_K1571
ENSG00000000419.8   290       270       314       364       240       386       430       329
ENSG00000000457.8   252       230       242       220       106       234       343       321
ENSG00000000460.11  154       158       162       136       64        152       206       432
ENSG00000000938.7   20106     18664     19764     15640     19024     18508     45590     32113

I want to create a newCountDataSet object from this table using the DESeq package.

Here is my code:

#First, define Control & Case so that condition can be defined later
#Here, the colnames are grouped into Control or Case based on their name (the SM_... ones)

my.df <- data.frame(matrix(rep(seq(1,8),3), ncol = 8))
colnames(my.df) <- c('SM_H1462','SM_H1463','SM_K1566','SM_X1567', 'SM_V1568', 'SM_K1534', 'SM_K1570','SM_K1571')
control = my.df[,(substr(colnames(my.df),4,4) == 'H' | substr(colnames(my.df),4,4) == 'K')]
case = my.df[,(substr(colnames(my.df),4,4) == 'X' | substr(colnames(my.df),4,4) == 'V')]

#Define condition
condition= c(control, case) 

cds1 = newCountDataSet(data, condition)

But I get an error, and I don't know how to fix it:

Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?

I think this is because condition has to be a factor, and it is currently a list of 24. So I tried

condition=factor(condition)

But I get the same error message.
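
For reference, the structure of condition can be inspected with base R alone, without DESeq. The following is a minimal sketch that reuses the toy my.df from the code above; the helper names group and res are introduced here only for illustration. It shows that c(control, case) concatenates the columns of two data frames into a plain list, which factor() cannot convert:

```r
# Toy data frame with the same column names as in the question
my.df <- data.frame(matrix(rep(seq(1, 8), 3), ncol = 8))
colnames(my.df) <- c("SM_H1462", "SM_H1463", "SM_K1566", "SM_X1567",
                     "SM_V1568", "SM_K1534", "SM_K1570", "SM_K1571")

# 4th character of each column name (hypothetical helper variable)
group <- substr(colnames(my.df), 4, 4)
control <- my.df[, group %in% c("H", "K")]
case <- my.df[, group %in% c("X", "V")]

# c() on two data frames returns a list of their columns, not a vector
condition <- c(control, case)
print(class(condition))
print(length(condition))

# factor() on a list fails; keep the error object instead of stopping
res <- tryCatch(factor(condition), error = function(e) e)
print(inherits(res, "error"))
```

So condition holds the count columns themselves rather than one label per sample, which is why wrapping it in factor() does not help.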

1 Answer:

Answer 0: (score: 2)

I have never used DESeq, but according to ?newCountDataSet, conditions must be a factor whose length equals the number of columns in countData. The following should work:

condition <- factor(ifelse(substr(colnames(my.df),4,4) == 'H' | substr(colnames(my.df),4,4) == 'K', "control", "case"))
cds1 <- newCountDataSet(my.df, condition)
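
Before handing the factor to DESeq, it may help to check it in base R. This is only a sketch under the same assumptions as the question (toy my.df, 4th character of the sample name encodes the group); group is a hypothetical helper name, and %in% is used as an equivalent, shorter form of the two == comparisons:

```r
# Toy data frame with the same column names as in the question
my.df <- data.frame(matrix(rep(seq(1, 8), 3), ncol = 8))
colnames(my.df) <- c("SM_H1462", "SM_H1463", "SM_K1566", "SM_X1567",
                     "SM_V1568", "SM_K1534", "SM_K1570", "SM_K1571")

# One label per column: H and K samples are controls, everything else is a case
group <- substr(colnames(my.df), 4, 4)
condition <- factor(ifelse(group %in% c("H", "K"), "control", "case"))

# The factor must have exactly one entry per column of the count table
stopifnot(length(condition) == ncol(my.df))
print(table(condition))

# With DESeq installed and loaded, the call from the answer would then be:
# cds1 <- newCountDataSet(my.df, condition)
```

Note that ifelse() labels every sample that is not H or K as "case", so a sample name with an unexpected 4th character would silently end up in the case group.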