Annotate a karyogram with a GRanges track

Time: 2017-05-16 13:36:41

Tags: r ggplot2

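I would like to create a karyogram/ideogram of human hg19 that additionally contains a track with numeric data, e.g. p-values, drawn as a point plot.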

    CHROM    POS  fisher_het
 1:    10 134775 0.299587633
 2:    10 135237 1.000000000
 3:    10 135277 0.483198279
 4:    10 135331 0.224587437
 5:    10 135334 0.068035761
 6:    10 135656 0.468998144
 7:    10 135708 0.746611845
 8:    10 135801 0.242257762
 9:    10 135853 0.001234701
10:    10 137186 0.774670848

This is how far I have got: I can plot the p-values in a default ggplot plot, like this.

ggplot(data = my.data.frame,
       aes(POS, -1 * log(fisher_het))
       ) + 
    geom_point(alpha = 0.5, size = 0.8) + 
    facet_wrap(~ CHROM) + 
    labs(x = "Position on chromosome", y = "-log(raw p-Value)") + 
    theme(axis.text.x = element_text(angle = 45, hjust = 1)) + 
    geom_hline(yintercept = 8, size = 0.3) 

[Figure: p-values along chromosome 22]
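To make the plot above reproducible, the ten example rows can be typed in by hand. The sketch below does that and also spells out what the horizontal line means: `log()` in R is the natural logarithm, so a line at y = 8 corresponds to p = exp(-8). The column name `neg_log_p` and the variables `p_at_line` and `above_line` are hypothetical names introduced only for this illustration.

```r
# The ten example rows from the question, typed in by hand.
my.data.frame <- data.frame(
  CHROM = rep(10, 10),
  POS = c(134775, 135237, 135277, 135331, 135334,
          135656, 135708, 135801, 135853, 137186),
  fisher_het = c(0.299587633, 1.000000000, 0.483198279, 0.224587437,
                 0.068035761, 0.468998144, 0.746611845, 0.242257762,
                 0.001234701, 0.774670848)
)

# Same transformation as the y aesthetic in the ggplot call (natural log).
# "neg_log_p" is a hypothetical column name.
my.data.frame$neg_log_p <- -1 * log(my.data.frame$fisher_het)

# p-value that corresponds to geom_hline(yintercept = 8)
p_at_line <- exp(-8)

# Rows that would lie above the line; this small sample has none.
above_line <- my.data.frame[my.data.frame$neg_log_p > 8, ]
```

If a base-10 scale is preferred, `-log10(fisher_het)` could be used instead, in which case the axis label and the threshold would need to change accordingly.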

Now it would be nice to plot these p-values on the karyogram/ideogram, as a separate track above each chromosome's karyogram.

[Figure: human karyogram]

To do this, I largely followed Alexander Skates's question (https://www.biostars.org/p/152969/).

library("ggbio")
library("GenomicRanges")

# load banding data
data(hg19IdeogramCyto, package = "biovizBase")
hg19 <- keepSeqlevels(hg19IdeogramCyto, paste0("chr", c(1:22, "X", "Y")))

# create a test GRanges object
# from the test data given above
test.granges <- GRanges(seqnames = paste0("chr", df.test.data$CHROM),
                        ranges=IRanges(start = df.test.data$POS,
                                       end = df.test.data$POS),
                        strand = "*",
                        fisher_het = df.test.data$fisher_het)

# attach chromosome lengths
data(hg19Ideogram, package = "biovizBase")
seqlengths(test.granges) <- seqlengths(hg19Ideogram)[names(seqlengths(test.granges))]

# plotting trials
# this works - a set of chromosomes with banding
ggplot(hg19) + layout_karyogram(cytoband = TRUE)

# now i want to add my p-values following the pattern of Alexander Skates
# + layout_karyogram(avs.granges, geom = "rect", ylim = c(11, 21), color = "red")
# using geom = "point"
ggplot(hg19) + 
     layout_karyogram(cytoband = TRUE) + 
     layout_karyogram(data = test.granges, 
                      geom = "point", 
                      aes(x=???, y=fisher_het)
                      )

I am unable to supply valid x and y coordinates, either from the GRanges object or from any other object that ggplot / ggbio accepts.

# also tried to no avail
ggplot(hg19) + 
    layout_karyogram(cytoband = TRUE) + 
     plotRangesLinkedToData(data = test.granges)
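Independently of the plotting call, it may be worth confirming that the chromosome labels of the track match the `chr`-prefixed names kept in `hg19`. The following base R sketch shows one way to check this; the `chrom` vector is made-up illustration data (with `"MT"` deliberately unmatched), and `karyogram.levels`, `track.levels`, `unmatched` and `keep` are hypothetical names.

```r
# Names kept in the karyogram, built the same way as in keepSeqlevels() above.
karyogram.levels <- paste0("chr", c(1:22, "X", "Y"))

# Hypothetical CHROM values; "MT" has no counterpart in the karyogram.
chrom <- c(10, 10, 22, "X", "MT")

# Same prefixing as used for seqnames in the GRanges constructor.
track.levels <- paste0("chr", chrom)

# Labels that would not land on any chromosome of the karyogram.
unmatched <- setdiff(track.levels, karyogram.levels)

# Logical index of rows that can be kept for the track.
keep <- track.levels %in% karyogram.levels
```

Rows flagged `FALSE` in `keep` could be dropped from the data frame before the GRanges object is built.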

1 answer:

Answer 0 (score: 0)

In the end it turned out not to be very difficult, once the right URL was found.

http://www.tengfei.name/ggbio/docs/man/layout_karyogram-method.html points to an example with a line plot above each chromosome's karyogram.

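  • The x coordinate to use from the GRanges object is start

  • ylim defines the size of the point subplot

  • size needs to be reduced a little, because the default point size produces an almost unreadable plot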
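The first point is easier to see once the GRanges object is flattened to a table: `as.data.frame()` on a GRanges yields `seqnames`, `start`, `end`, `width` and `strand` next to the metadata columns, which is presumably why `x = start` can be resolved. A minimal sketch, assuming GenomicRanges is installed (the GRanges part is skipped otherwise); it uses three rows copied from the question, and `neg_log10_p` and `track.columns` are hypothetical names added for illustration.

```r
# Three rows copied from the example data in the question.
df.test.data <- data.frame(
  CHROM = c(10, 10, 10),
  POS = c(134775, 135237, 135853),
  fisher_het = c(0.299587633, 1.000000000, 0.001234701)
)

# Hypothetical extra column: small p-values become tall points.
df.test.data$neg_log10_p <- -log10(df.test.data$fisher_het)

track.columns <- NULL
if (requireNamespace("GenomicRanges", quietly = TRUE)) {
  suppressPackageStartupMessages(library("GenomicRanges"))

  # Same construction as in the question, plus the extra metadata column.
  test.granges <- GRanges(
    seqnames = paste0("chr", df.test.data$CHROM),
    ranges = IRanges(start = df.test.data$POS, end = df.test.data$POS),
    strand = "*",
    fisher_het = df.test.data$fisher_het,
    neg_log10_p = df.test.data$neg_log10_p
  )

  # Column names available after flattening; these include "start"
  # and the two metadata columns.
  track.columns <- colnames(as.data.frame(test.granges))
  print(track.columns)
}
```

With such a column in place, `aes(x = start, y = neg_log10_p)` could be tried in `layout_karyogram()` in the same way as `fisher_het`; this variant is untested here.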

ggplot(hg19) + 
     layout_karyogram(cytoband = TRUE) + 
     layout_karyogram(data = test.granges, 
                      geom = "point", 
                      aes(x=start, y=fisher_het),
                      ylim = c(10,50),
                      color = "black",
                      size = 0.4
                      )

[Figure: ideogram with additional geom_point track]