成对比较还是Anova比较?

时间:2019-05-20 07:37:58

标签: r

我有一个小组治疗,其中有2种治疗方法和一个对照,并且我使用EdgeR进行了DE。我的设计矩阵如下,并使用成对和方差分析对这些样品进行对比。尽管其DEg基因的LigFC相同,但其FDR不同。

data
           Ensembl_ID Gene_name    H9_wt_B1  H9_wt_B2   H9_KO_B1     H9_KO_B2   H9_KI_B1       H9_KI_B2 
## 1 ENSG00000000003    TSPAN6     4274     4233          3755          4937       4681          4061  
## 2 ENSG00000000005      TNMD      116      117           106           154       105            86     
## 3 ENSG00000000419      DPM1     2443     2391          2389          2597       3026          2453


design
##                  samples
## H9_wt_B1           H9_wt
## H9_wt_B2           H9_wt
## H9_KO_B1           H9_KO
## H9_KO_B2           H9_KO
## H9_KI_B1           H9_KI
## H9_KI_B2           H9_KI

all(rownames(design) %in% colnames(data))
## [1] TRUE
all(rownames(design) == colnames(data))
## [1] TRUE

design.matrix_H9 <- model.matrix(~0+samples,design)
design.matrix_H9
##               samplesH9_KI samplesH9_KO samplesH9_wt
## H9_wt_B1                 0                 0            1
## H9_wt_B2                 0                 0            1
## H9_KO_B1                 0                 1            0
## H9_KO_B2                 0                 1            0
## H9_KI_B1                 1                 0            0
## H9_KI_B2                 1                 0            0
## attr(,"assign")
## [1] 1 1 1
## attr(,"contrasts")
## attr(,"contrasts")$samples
## [1] "contr.treatment"


H9_list <- DGEList(counts = data[c(3:8)], group=design$samples, genes=data[1:2])
H9_filter <- filterByExpr(H9_list)
H9_keep<-H9_list[H9_filter, , keep.lib.sizes=FALSE]
H9_keep_norm <- calcNormFactors(H9_keep)
H9_dispersion<- estimateDisp(H9_keep_norm, design.matrix_H9, robust=TRUE)
H9_fited <- glmQLFit(H9_dispersion, design.matrix_H9, robust=TRUE)




contrasts_H9 <- makeContrasts(H9.KOvswt = samplesH9_KO - samplesH9_wt, 
                          H9.KIvswt = samplesH9_KI - samplesH9_wt, 
                          H9.KIvsKO = samplesH9_KI - samplesH9_KO, levels=design.matrix_H9)
contrasts_H9
##                    Contrasts
## Levels              H9.KOvswt H9.KIvswt H9.KIvsKO
##   samplesH9_KI         0         1         1
##   samplesH9_KO         1         0        -1
##   samplesH9_wt        -1        -1         0


qlf_H9.contrasts <- glmQLFTest(H9_fited, contrast=contrasts_H9)
topTags(qlf_H9.contrasts)
## Coefficient:  LR test on 2 degrees of freedom 
##            Ensembl_ID Gene_name logFC.H9.KOvswt logFC.H9.KIvswt
 ## 16448 ENSG00000187325     TAF9B     0.047290741      -8.5215156
## 2694  ENSG00000102710   SUPT20H    -0.009879572      -9.8090108
## 6131  ENSG00000129317     PUS7L     0.071584025     -13.0416646
## 7625  ENSG00000138161     CUZD1     1.072742182      -3.5320786
## 17959 ENSG00000198934    MAGEE1   -10.949547315       0.9180734

##       logFC.H9.KIvsKO   logCPM        F       PValue          FDR
## 16448       -8.568806 5.044505 955.5740 1.059827e-11 1.005574e-07
## 2694        -9.799131 5.032961 949.3324 1.093252e-11 1.005574e-07
## 6131       -13.113249 4.246053 834.5035 1.670160e-10 1.024142e-06
## 7625        -4.604821 3.376219 261.1062 4.800669e-09 2.207828e-05
## 17959       11.867621 2.659453 347.6069 6.771092e-09 2.491220e-05


 qlf_H9.KOvswt <- glmQLFTest(H9_fited, contrast=contrasts_H9[,"H9.KOvswt"])
 topTags(qlf_H9.KOvswt)
## Coefficient:  1*samplesH9_KO -1*samplesH9_wt 
##            Ensembl_ID  Gene_name      logFC    logCPM        F
## 50201 ENSG00000268658  LINC00664  -8.590411 2.3669917 404.3259
## 17959 ENSG00000198934     MAGEE1 -10.949547 2.6594526 516.2982
## 5095  ENSG00000120738       EGR1   4.377344 4.6205615 218.4309
## 15758 ENSG00000184515       BEX5  -4.938635 0.6839495 188.3181
## 27893 ENSG00000228065  LINC01515  -4.375104 0.9611039 187.7014

 ##             PValue          FDR
## 50201 4.027064e-09 5.809191e-05
## 17959 6.315711e-09 5.809191e-05
## 5095  8.535940e-08 5.107783e-04
## 15758 1.367709e-07 5.107783e-04
## 27893 1.388286e-07 5.107783e-04
  • 我想知道为什么我在qlf_H9.contrasts中得到的结果与qlf_H9.KOvswt不同吗?例如MAGEE1qlf_H9.contrastsqlf_H9.KOvswt基因的LogFC是相同的,但是其FDR是不同的,并且这些FDR的差异导致我丢失了许多重要基因并拥有不同的列表考虑qlf_H9.contrastsqlf_H9.KOvswt时的基因数量?

  • 比较(Anova或成对)哪个更好?

是因为我的design.matrix_H9姓氏还是contrast_H9的级别名与数据的姓氏不同!

事实上,我喜欢在H9.ki vs wt和H9.ko vs wt中找到差异表达基因,然后找到它们之间常见的上下调节基因。我想知道哪个比较好?

0 个答案:

没有答案