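I have an experiment with two treatments and one control, and I ran a differential expression (DE) analysis with edgeR. My design matrix is shown below. I set up contrasts for these samples both pairwise and as a single ANOVA-like test. The logFC values of the DE genes are the same in both, but their FDRs differ.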
data
## Ensembl_ID Gene_name H9_wt_B1 H9_wt_B2 H9_KO_B1 H9_KO_B2 H9_KI_B1 H9_KI_B2
## 1 ENSG00000000003 TSPAN6 4274 4233 3755 4937 4681 4061
## 2 ENSG00000000005 TNMD 116 117 106 154 105 86
## 3 ENSG00000000419 DPM1 2443 2391 2389 2597 3026 2453
design
## samples
## H9_wt_B1 H9_wt
## H9_wt_B2 H9_wt
## H9_KO_B1 H9_KO
## H9_KO_B2 H9_KO
## H9_KI_B1 H9_KI
## H9_KI_B2 H9_KI
all(rownames(design) %in% colnames(data))
## [1] TRUE
all(rownames(design) == colnames(data))
## [1] TRUE
design.matrix_H9 <- model.matrix(~0+samples,design)
design.matrix_H9
## samplesH9_KI samplesH9_KO samplesH9_wt
## H9_wt_B1 0 0 1
## H9_wt_B2 0 0 1
## H9_KO_B1 0 1 0
## H9_KO_B2 0 1 0
## H9_KI_B1 1 0 0
## H9_KI_B2 1 0 0
## attr(,"assign")
## [1] 1 1 1
## attr(,"contrasts")
## attr(,"contrasts")$samples
## [1] "contr.treatment"
H9_list <- DGEList(counts = data[c(3:8)], group=design$samples, genes=data[1:2])
H9_filter <- filterByExpr(H9_list)
H9_keep<-H9_list[H9_filter, , keep.lib.sizes=FALSE]
H9_keep_norm <- calcNormFactors(H9_keep)
H9_dispersion<- estimateDisp(H9_keep_norm, design.matrix_H9, robust=TRUE)
H9_fited <- glmQLFit(H9_dispersion, design.matrix_H9, robust=TRUE)
contrasts_H9 <- makeContrasts(H9.KOvswt = samplesH9_KO - samplesH9_wt,
H9.KIvswt = samplesH9_KI - samplesH9_wt,
H9.KIvsKO = samplesH9_KI - samplesH9_KO, levels=design.matrix_H9)
contrasts_H9
## Contrasts
## Levels H9.KOvswt H9.KIvswt H9.KIvsKO
## samplesH9_KI 0 1 1
## samplesH9_KO 1 0 -1
## samplesH9_wt -1 -1 0
qlf_H9.contrasts <- glmQLFTest(H9_fited, contrast=contrasts_H9)
topTags(qlf_H9.contrasts)
## Coefficient: LR test on 2 degrees of freedom
## Ensembl_ID Gene_name logFC.H9.KOvswt logFC.H9.KIvswt
## 16448 ENSG00000187325 TAF9B 0.047290741 -8.5215156
## 2694 ENSG00000102710 SUPT20H -0.009879572 -9.8090108
## 6131 ENSG00000129317 PUS7L 0.071584025 -13.0416646
## 7625 ENSG00000138161 CUZD1 1.072742182 -3.5320786
## 17959 ENSG00000198934 MAGEE1 -10.949547315 0.9180734
## logFC.H9.KIvsKO logCPM F PValue FDR
## 16448 -8.568806 5.044505 955.5740 1.059827e-11 1.005574e-07
## 2694 -9.799131 5.032961 949.3324 1.093252e-11 1.005574e-07
## 6131 -13.113249 4.246053 834.5035 1.670160e-10 1.024142e-06
## 7625 -4.604821 3.376219 261.1062 4.800669e-09 2.207828e-05
## 17959 11.867621 2.659453 347.6069 6.771092e-09 2.491220e-05
qlf_H9.KOvswt <- glmQLFTest(H9_fited, contrast=contrasts_H9[,"H9.KOvswt"])
topTags(qlf_H9.KOvswt)
## Coefficient: 1*samplesH9_KO -1*samplesH9_wt
## Ensembl_ID Gene_name logFC logCPM F
## 50201 ENSG00000268658 LINC00664 -8.590411 2.3669917 404.3259
## 17959 ENSG00000198934 MAGEE1 -10.949547 2.6594526 516.2982
## 5095 ENSG00000120738 EGR1 4.377344 4.6205615 218.4309
## 15758 ENSG00000184515 BEX5 -4.938635 0.6839495 188.3181
## 27893 ENSG00000228065 LINC01515 -4.375104 0.9611039 187.7014
## PValue FDR
## 50201 4.027064e-09 5.809191e-05
## 17959 6.315711e-09 5.809191e-05
## 5095 8.535940e-08 5.107783e-04
## 15758 1.367709e-07 5.107783e-04
## 27893 1.388286e-07 5.107783e-04
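Why are the results in qlf_H9.contrasts different from those in qlf_H9.KOvswt? For example, the KO vs wt logFC of MAGEE1 is the same in qlf_H9.contrasts and in qlf_H9.KOvswt (-10.9495), but its FDR is different (2.49e-05 vs 5.81e-05). Because of these FDR differences I lose many significant genes, and I end up with gene lists of different sizes depending on whether I use qlf_H9.contrasts or qlf_H9.KOvswt.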
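To make the FDR difference concrete, here is a rough sketch in base R that recomputes the Benjamini-Hochberg FDRs from the p-values printed above. The value of `n_genes` is an assumption: it is not stated anywhere in this post and was inferred by back-calculating from the printed FDR column. The sketch only covers the five top-ranked rows of each table, which is why passing `n` explicitly to `p.adjust` is enough here. It suggests that MAGEE1 gets a different FDR mainly because the two tests answer different questions (all contrasts at once on 2 degrees of freedom versus KO vs wt only), so the gene has a different p-value and a different rank in each table.

```r
# P-values copied from the two topTags() outputs above (top 5 rows each).
p_anova <- c(TAF9B = 1.059827e-11, SUPT20H = 1.093252e-11,
             PUS7L = 1.670160e-10, CUZD1 = 4.800669e-09,
             MAGEE1 = 6.771092e-09)
p_pairwise <- c(LINC00664 = 4.027064e-09, MAGEE1 = 6.315711e-09,
                EGR1 = 8.535940e-08, BEX5 = 1.367709e-07,
                LINC01515 = 1.388286e-07)

# ASSUMPTION: number of genes kept after filterByExpr(); not given in the
# post, inferred from the printed FDR values.
n_genes <- 18396

# Benjamini-Hochberg: roughly p * n / rank, made monotone across ranks.
fdr_anova    <- p.adjust(p_anova,    method = "BH", n = n_genes)
fdr_pairwise <- p.adjust(p_pairwise, method = "BH", n = n_genes)

# MAGEE1 is ranked 5th in the ANOVA-like table but 2nd in the pairwise one,
# so a similar p-value is scaled by a different factor.
print(signif(fdr_anova["MAGEE1"], 4))
print(signif(fdr_pairwise["MAGEE1"], 4))
```

Under that assumption the two printed values should be close to the 2.49e-05 and 5.81e-05 shown in the tables above.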
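Which comparison is better, ANOVA-like or pairwise?
Is it because the column names of design.matrix_H9, or the level names of contrasts_H9, are not the same as the column names of the data?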
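What I actually want is to find the DE genes in H9.KI vs wt and in H9.KO vs wt, and then the genes that are up-regulated or down-regulated in both. Which comparison is better for that?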
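For the overlap step, this is a minimal sketch of what I have in mind, written in base R. The helper `common_de`, the 0.05 FDR cutoff, and the toy tables with gene IDs `g1` to `g4` are all hypothetical and made up for illustration. In the real analysis the two tables would presumably come from the single-contrast tests, for example `topTags(glmQLFTest(H9_fited, contrast = contrasts_H9[, "H9.KIvswt"]), n = Inf)$table`.

```r
# Hypothetical helper: genes that pass the FDR cutoff in both tables and
# change in the same direction (same sign of logFC).
common_de <- function(tab_a, tab_b, id = "Ensembl_ID", fdr = 0.05) {
  sig_a <- tab_a[tab_a$FDR < fdr, ]
  sig_b <- tab_b[tab_b$FDR < fdr, ]
  list(
    up   = intersect(sig_a[[id]][sig_a$logFC > 0], sig_b[[id]][sig_b$logFC > 0]),
    down = intersect(sig_a[[id]][sig_a$logFC < 0], sig_b[[id]][sig_b$logFC < 0])
  )
}

# Toy tables with made-up values; only the columns used above are included.
ki_vs_wt <- data.frame(Ensembl_ID = c("g1", "g2", "g3", "g4"),
                       logFC = c(2.1, -3.0, 1.5, -0.2),
                       FDR   = c(0.001, 0.01, 0.20, 0.03),
                       stringsAsFactors = FALSE)
ko_vs_wt <- data.frame(Ensembl_ID = c("g1", "g2", "g3", "g4"),
                       logFC = c(1.8, -2.5, 1.2, 0.9),
                       FDR   = c(0.002, 0.04, 0.01, 0.01),
                       stringsAsFactors = FALSE)

common <- common_de(ki_vs_wt, ko_vs_wt)
print(common)
```

In the toy data `g3` drops out because it is not significant in the first table, and `g4` drops out because it is significant in both tables but changes in opposite directions.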