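I need to find differentially expressed genes across 35 lines (microarrays). The names of 30 lines start with RAL and the names of the other 5 start with ZI. I want to make contrasts between the 30 RAL lines and the 5 ZI lines. Since I don't want to type in all 150 contrasts by hand, I would like to use makeContrasts.
My data is: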
dput(sampletype)
structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L,
5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L, 9L, 9L, 10L,
10L, 10L, 11L, 11L, 11L, 12L, 12L, 12L, 13L, 13L, 13L, 14L, 14L,
14L, 15L, 15L, 15L, 16L, 16L, 16L, 17L, 17L, 17L, 18L, 18L, 18L,
19L, 19L, 19L, 20L, 20L, 20L, 21L, 21L, 21L, 22L, 22L, 22L, 23L,
23L, 23L, 24L, 24L, 24L, 25L, 25L, 25L, 26L, 26L, 26L, 27L, 27L,
27L, 28L, 28L, 28L, 29L, 29L, 29L, 30L, 30L, 30L, 31L, 31L, 32L,
32L, 32L, 33L, 33L, 33L, 34L, 34L, 34L, 35L, 35L, 35L), .Label = c("RAL307",
"RAL820", "RAL705", "RAL765", "RAL852", "RAL799", "RAL301", "RAL427",
"RAL437", "RAL315", "RAL357", "RAL304", "RAL391", "RAL313", "RAL486",
"RAL380", "RAL859", "RAL786", "RAL399", "RAL358", "RAL360", "RAL517",
"RAL639", "RAL732", "RAL379", "RAL555", "RAL324", "RAL774", "RAL42",
"RAL181", "ZI50N", "ZI186N", "ZI357N", "ZI31N", "ZI197N"), class = "factor")
design.matrix <- model.matrix(~ 0 + sampletype)
How can I get contrasts such as "RAL517-ZI50", "RAL852-ZI50", "RAL517-ZI42", "RAL852-ZI42"?
Is there any way I can do this?
This is from my sessionInfo():
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] gplots_2.12.1 reshape2_1.2.2 ggplot2_0.9.3.1 affy_1.38.1 vsn_3.28.0 Biobase_2.20.1
[7] BiocGenerics_0.6.0 limma_3.16.8
loaded via a namespace (and not attached):
[1] BiocInstaller_1.10.4 KernSmooth_2.23-10 MASS_7.3-29 RColorBrewer_1.0-5 affyio_1.28.0
[6] bitops_1.0-6 caTools_1.14 colorspace_1.2-4 dichromat_2.0-0 digest_0.6.3
[11] gdata_2.13.2 grid_3.0.2 gtable_0.1.2 gtools_3.1.0 labeling_0.2
[16] lattice_0.20-23 munsell_0.4.2 plyr_1.8 preprocessCore_1.22.0 proto_0.3-10
[21] scales_0.2.3 stringr_0.6.2 tools_3.0.2 zlibbioc_1.6.0
Thanks
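Answer 0 (score: 1)
Since you have a class comparison problem between two classes, I suggest you read the user's guide of Bioconductor's limma package, a popular package for identifying differentially expressed genes (http://www.bioconductor.org/packages/release/bioc/vignettes/limma/inst/doc/usersguide.pdf). If you are using single-channel microarrays, you can focus on section 9.2.
In any case, you have to create a two-level factor to perform the comparison: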
# build the design matrix
library(limma)
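# one label per array, derived from sampletype (the lines have replicate arrays, so 35 labels would not match)
yourfactor <- factor(ifelse(grepl("^RAL", sampletype), "RAL", "ZI"))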
design <- model.matrix(~ 0 + yourfactor)
colnames(design) <- gsub("yourfactor", "", colnames(design)) # to simplify the colnames of design
# perform the comparison
fit <- lmFit(data, design) # data is your gene expression matrix
contrast.matrix <- makeContrasts(RAL-ZI, levels=design)
fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2)
# summarize the results of the linear model
results <- topTable(fit2, number=nrow(data), adjust.method="BH")
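If you really do want every RAL line against every ZI line, as asked in the question, the contrast strings can be generated instead of typed. The following is only a minimal sketch: it uses a small stand-in for `sampletype` (two RAL and two ZI lines, three arrays each), and it relies on the `contrasts` argument of `makeContrasts`, which accepts a character vector. The names `ral`, `zi` and `contrast.names` are made up for the illustration.

```r
# Stand-in for the per-array factor from the question (assumption: 3 arrays per line)
sampletype <- factor(rep(c("RAL307", "RAL820", "ZI50N", "ZI186N"), each = 3),
                     levels = c("RAL307", "RAL820", "ZI50N", "ZI186N"))

# One design column per line, named after the factor levels
design <- model.matrix(~ 0 + sampletype)
colnames(design) <- levels(sampletype)

# Split the column names into the two groups
ral <- grep("^RAL", colnames(design), value = TRUE)
zi  <- grep("^ZI",  colnames(design), value = TRUE)

# Every RAL line against every ZI line, as strings like "RAL307-ZI50N"
contrast.names <- as.vector(outer(ral, zi, paste, sep = "-"))

# Build the contrast matrix only if limma is installed
if (requireNamespace("limma", quietly = TRUE)) {
  contrast.matrix <- limma::makeContrasts(contrasts = contrast.names,
                                          levels = design)
}
```

With the full factor from the question this should give 30 x 5 = 150 contrasts; the resulting matrix can then go through contrasts.fit and eBayes in the same way as above.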
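Note that the samples in the expression matrix and the sample labels in the factor must be in the same order. To avoid this kind of problem, I suggest you create an ExpressionSet object (http://www.bioconductor.org/packages/release/bioc/html/Biobase.html), which is very useful for manipulating gene expression data.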
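Even without an ExpressionSet, the ordering can be checked in base R before fitting. This is a rough sketch with made-up objects: `expr` stands in for the expression matrix and `pheno` for a sample table with one row per array; neither name comes from the question.

```r
# Hypothetical stand-ins: expr = genes x arrays, pheno = one row per array (shuffled)
expr <- matrix(rnorm(12), nrow = 2,
               dimnames = list(c("gene1", "gene2"),
                               c("a1", "a2", "a3", "a4", "a5", "a6")))
pheno <- data.frame(array = c("a4", "a1", "a6", "a2", "a5", "a3"),
                    line  = c("ZI50N", "RAL307", "ZI50N", "RAL307", "ZI50N", "RAL820"),
                    stringsAsFactors = FALSE)

# Reorder the sample table so its rows follow the columns of the expression matrix
pheno <- pheno[match(colnames(expr), pheno$array), ]

# Stop early if the two still disagree (e.g. an array is missing from pheno)
stopifnot(identical(pheno$array, colnames(expr)))

# Two-level factor, now guaranteed to be in the same order as the arrays
group <- factor(ifelse(grepl("^RAL", pheno$line), "RAL", "ZI"))
```

An ExpressionSet keeps the sample table and the matrix together, so this kind of manual matching is needed only once, when the object is built.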
I hope this helps,
Best.
Lima