makeContrasts between two different sets of data

Time: 2014-04-17 14:41:40

Tags: r bioconductor contrast

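I need to find differentially expressed genes across 35 lines (microarrays). The names of 30 lines start with RAL and the names of the other 5 start with ZI. I want to make contrasts between the 30 RAL lines and the 5 ZI lines. Since I don't want to type all 150 of them by hand, I'd like to use makeContrasts.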

My data are:

dput(sampletype)

structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 
5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L, 9L, 9L, 10L, 
10L, 10L, 11L, 11L, 11L, 12L, 12L, 12L, 13L, 13L, 13L, 14L, 14L, 
14L, 15L, 15L, 15L, 16L, 16L, 16L, 17L, 17L, 17L, 18L, 18L, 18L, 
19L, 19L, 19L, 20L, 20L, 20L, 21L, 21L, 21L, 22L, 22L, 22L, 23L, 
23L, 23L, 24L, 24L, 24L, 25L, 25L, 25L, 26L, 26L, 26L, 27L, 27L, 
27L, 28L, 28L, 28L, 29L, 29L, 29L, 30L, 30L, 30L, 31L, 31L, 32L, 
32L, 32L, 33L, 33L, 33L, 34L, 34L, 34L, 35L, 35L, 35L), .Label = c("RAL307", 
"RAL820", "RAL705", "RAL765", "RAL852", "RAL799", "RAL301", "RAL427", 
"RAL437", "RAL315", "RAL357", "RAL304", "RAL391", "RAL313", "RAL486", 
"RAL380", "RAL859", "RAL786", "RAL399", "RAL358", "RAL360", "RAL517", 
"RAL639", "RAL732", "RAL379", "RAL555", "RAL324", "RAL774", "RAL42", 
"RAL181", "ZI50N", "ZI186N", "ZI357N", "ZI31N", "ZI197N"), class = "factor")

design.matrix <- model.matrix(~ 0 + sampletype)

How can I get contrasts such as "RAL517-ZI50", "RAL852-ZI50", "RAL517-ZI42", "RAL852-ZI42"?

Is there any way I can do this?

This is my sessionInfo():

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] gplots_2.12.1      reshape2_1.2.2     ggplot2_0.9.3.1    affy_1.38.1        vsn_3.28.0         Biobase_2.20.1    
[7] BiocGenerics_0.6.0 limma_3.16.8      

loaded via a namespace (and not attached):
 [1] BiocInstaller_1.10.4  KernSmooth_2.23-10    MASS_7.3-29           RColorBrewer_1.0-5    affyio_1.28.0        
 [6] bitops_1.0-6          caTools_1.14          colorspace_1.2-4      dichromat_2.0-0       digest_0.6.3         
[11] gdata_2.13.2          grid_3.0.2            gtable_0.1.2          gtools_3.1.0          labeling_0.2         
[16] lattice_0.20-23       munsell_0.4.2         plyr_1.8              preprocessCore_1.22.0 proto_0.3-10         
[21] scales_0.2.3          stringr_0.6.2         tools_3.0.2           zlibbioc_1.6.0       

Thanks

1 answer:

Answer 0 (score: 1)

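Since you have a class comparison problem between two classes, I suggest you read the user's guide of Bioconductor's limma package, a popular package for identifying differentially expressed genes (http://www.bioconductor.org/packages/release/bioc/vignettes/limma/inst/doc/usersguide.pdf). If you are working with single-channel microarrays, you can focus on section 9.2.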

By the way, you have to create a two-level factor to make the comparison:

# build the design matrix

library(limma)

yourfactor <- c(rep("RAL", 30),rep("ZI", 5))
design <- model.matrix(~ 0 + yourfactor)
colnames(design) <- gsub("yourfactor", "", colnames(design)) # to simplify the colnames of design

# perform the comparison


fit <- lmFit(data, design)    # data is your gene expression matrix
contrast.matrix <- makeContrasts(RAL-ZI, levels=design)
fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2)

# summarize the results of the linear model
results <- topTable(fit2, number=nrow(data), adjust.method="BH")
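
If the line-by-line contrasts from the question are what is wanted instead of a single RAL-ZI contrast, the 150 contrast strings do not have to be typed by hand. The following is only a sketch under some assumptions: the level names and replicate counts are copied from the dput() output in the question, `line.names`, `reps` and `contrast.names` are names made up for illustration, and the design columns are renamed to the bare line names so that they can be used in the contrast strings. The strings are then handed to makeContrasts() through its `contrasts` argument.

```r
# Level names copied from the dput() output in the question.
line.names <- c("RAL307", "RAL820", "RAL705", "RAL765", "RAL852", "RAL799",
                "RAL301", "RAL427", "RAL437", "RAL315", "RAL357", "RAL304",
                "RAL391", "RAL313", "RAL486", "RAL380", "RAL859", "RAL786",
                "RAL399", "RAL358", "RAL360", "RAL517", "RAL639", "RAL732",
                "RAL379", "RAL555", "RAL324", "RAL774", "RAL42", "RAL181",
                "ZI50N", "ZI186N", "ZI357N", "ZI31N", "ZI197N")

# Three arrays per line, except ZI50N, which has two in the question's data.
reps <- c(rep(3, 30), 2, rep(3, 4))
sampletype <- factor(rep(line.names, times = reps), levels = line.names)

# One design column per line, renamed to the bare line name.
design <- model.matrix(~ 0 + sampletype)
colnames(design) <- levels(sampletype)

# Every RAL line against every ZI line: 30 * 5 = 150 contrast strings.
ral <- grep("^RAL", levels(sampletype), value = TRUE)
zi  <- grep("^ZI",  levels(sampletype), value = TRUE)
contrast.names <- as.vector(outer(ral, zi, paste, sep = "-"))

# Build the contrast matrix only if limma is installed.
if (requireNamespace("limma", quietly = TRUE)) {
  contrast.matrix <- limma::makeContrasts(contrasts = contrast.names,
                                          levels = design)
}
```

The resulting contrast.matrix can be passed to contrasts.fit() in the same way as above, provided the model was fitted with this per-line design rather than the two-level one.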

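Make sure that the samples in the expression matrix and the sample labels in the factor are in the same order. To avoid this kind of problem, I suggest you create an ExpressionSet object (http://www.bioconductor.org/packages/release/bioc/html/Biobase.html), which is very useful for handling gene expression data.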
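
One way to keep the ordering consistent without extra bookkeeping is to derive the factor from the column names of the expression matrix itself. This is a minimal sketch, assuming the columns are named after the lines; the toy matrix `expr` and the name `group` are hypothetical and only stand in for the real data.

```r
# Hypothetical toy expression matrix: 4 genes x 6 arrays,
# with columns named after the line each array came from.
set.seed(1)
expr <- matrix(rnorm(24), nrow = 4,
               dimnames = list(paste0("gene", 1:4),
                               c("RAL307", "ZI50N", "RAL820",
                                 "ZI186N", "RAL705", "RAL765")))

# Two-level factor derived from the column names, so it follows
# the column order of the matrix by construction.
group <- factor(ifelse(grepl("^RAL", colnames(expr)), "RAL", "ZI"),
                levels = c("RAL", "ZI"))

design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Sanity check: one design row per array.
stopifnot(nrow(design) == ncol(expr))
```

The same grepl() call can be applied to the `sampletype` factor from the question, which also covers the replicate arrays, as long as `sampletype` is in the same order as the columns of the expression matrix.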

I hope this helps,

Best,

利玛