Question

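I am trying to use the noiseqbio function from the NOISeq package to filter out low-count features from my RNA-Seq data, and then run the WGCNA package to build a regulatory network. When I try this, I get the error shown below. Can anyone help me solve it?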

# rpkm = matrix with more than 9,000 genes and 7 conditions (2 biological replicates)

rpkm<-read.csv("rpkm_all.csv")

head(rpkm)                 

                  F24h_1      F24h_2       C6h_1        ....
e_gw1.1.1022.1 10.6933092  8.91526912  7.24161321       ....
e_gw1.1.104.1   0.0000000  0.02118639  0.02090429       ....
e_gw1.1.1046.1  0.1131807  0.15213278  0.16165381      ....

myfactors=data.frame(condicao=c("F24h","F24h","C6h","C6h","C12h","C12h","C24h","C24h","B6h","B6h","B12h","B12h","B24h","B24h"),replicas= c("F24h_1","F24h_2","C6h_1","C6h_2","C12h_1","C12h_2","C24h_1","C24h_2","B6h_1","B6h_2","B12h_1","B12h_2","B24h_1","B24h_2"))

head(myfactors)
  condicao replicas
1     F24h   F24h_1
2     F24h   F24h_2
3      C6h    C6h_1
4      C6h    C6h_2
5     C12h   C12h_1
6     C12h   C12h_2

mydata<-readData(data=rpkm, factors=myfactors,length = NULL,biotype = NULL,chromosome = NULL,gc = NULL)

mydata

ExpressionSet (storageMode: lockedEnvironment)
assayData: 9852 features, 14 samples
  element names: exprs
protocolData: none
phenoData
  sampleNames: F24h_1 F24h_2 ... B24h_2 (14
    total)
  varLabels: condicao replicas
  varMetadata: labelDescription
featureData: none
experimentData: use 'experimentData(object)'
Annotation:

mynoiseqbio=noiseqbio(mydata,k=0.5,norm="rpkm",factor=myfactors$condicao, lc=0, r=50, adj=1.5, plot=TRUE, a0per=0.9, random.seed=12345,filter=1)

This is the error:

Error in `[.data.frame`(input@phenoData@data, , factor) :
  undefined columns selected

Answer 1

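The factor= argument of noiseqbio() expects a string value, but what you supplied appears to be a factor. Building a string column with data.frame() can treat the strings as factor levels. To fix this, convert the column values to strings: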

mynoiseqbio <- noiseqbio(mydata, ..., factor=as.character(myfactors$condicao), ...)

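This ensures the factor= argument gets the kind of value it expects.

Also, make sure the values in condicao match the actual column names in the rpkm data frame.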
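
Whether data.frame() turns strings into factors depends on the R version: it was the default before R 4.0.0 and is opt-in since then. The following is a minimal sketch that forces the older behaviour with stringsAsFactors = TRUE to show the conversion; the four-row table is an illustrative subset, not the full design from the question.

```r
# Illustrative subset of the design table from the question
myfactors <- data.frame(
  condicao = c("F24h", "F24h", "C6h", "C6h"),
  stringsAsFactors = TRUE  # default before R 4.0.0; FALSE since then
)

# The column is stored as a factor, not as character strings
is.factor(myfactors$condicao)

# as.character() recovers the plain strings
cond_chr <- as.character(myfactors$condicao)
is.character(cond_chr)
```

Checking class(myfactors$condicao) on your own data is a quick way to see which case applies.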

Answer 2

Alex, I modified my variables according to the script below, written by @komal.rathi, and it worked for me. Thank you both for the support.

rpkm <- matrix(rnorm(137928),9852,14) # replicate data
colnames(rpkm) <- c("F24h_1","F24h_2","C6h_1","C6h_2","C12h_1","C12h_2","C24h_1","C24h_2","B6h_1","B6h_2","B12h_1","B12h_2","B24h_1","B24h_2")

myfactors <- data.frame(condicao = c("F24h","F24h","C6h","C6h","C12h","C12h","C24h","C24h","B6h","B6h","B12h","B12h","B24h","B24h"),
                     replicas = c("F24h_1","F24h_2","C6h_1","C6h_2","C12h_1","C12h_2","C24h_1","C24h_2","B6h_1","B6h_2","B12h_1","B12h_2","B24h_1","B24h_2"))

mydata <- readData(data = rpkm, 
                 factors = myfactors,
                 length = NULL,
                 biotype = NULL,
                 chromosome = NULL,
                 gc = NULL)

mynoiseqbio <- noiseqbio(input = mydata, k = 0.5, norm = "rpkm", 
                      factor = "condicao", conditions = c('F24h','C6h'), 
                      lc = 0, r = 50, adj = 1.5, plot = TRUE, a0per = 0.9, 
                      random.seed = 12345, filter = 1)
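
Judging from the error in the question, which is raised by `[.data.frame`(input@phenoData@data, , factor), the factor argument appears to be used to pick a column of the sample table. That would explain why the column name "condicao" works here. Below is a minimal base R sketch of that indexing. It assumes a hypothetical stand-in, pheno, for the table that readData() builds, and it does not call NOISeq itself.

```r
# Hypothetical stand-in for the sample table stored in the ExpressionSet
pheno <- data.frame(
  condicao = c("F24h", "F24h", "C6h", "C6h"),
  replicas = c("F24h_1", "F24h_2", "C6h_1", "C6h_2")
)

# Indexing by the column name selects the grouping column
by_name <- pheno[, "condicao"]

# Indexing by the column's values asks for columns named "F24h", "C6h", ...
# which do not exist, so the lookup fails; tryCatch() captures the message
by_values <- tryCatch(
  pheno[, as.character(pheno$condicao)],
  error = function(e) conditionMessage(e)
)
print(by_values)
```

If this reading of the error is right, the value passed as factor should be the name of a column in myfactors, with the two levels to compare given separately through conditions.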

Error in noiseqbio - filtering out low counts
