Making a step function use 90-degree transitions

Time: 2016-04-19 20:21:36

Tags: r ggplot2

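I suspect this may be a very silly question, but here goes! (Also, apologies if this is better suited to CrossValidated; I'm not yet sure whether this is a programming problem or something closer to statistics...)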

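I created a step function (a.k.a. staircase function, a.k.a. piecewise constant function) with cumSeg and fitted it to some non-continuous (x-axis) data, as shown in the code/plot below.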

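This all works and I'm happy with it, but I'd like to know whether I can make the step function (the red line) have a vertical transition (i.e. make both 'shoulders' of the function 90 degrees). For that to happen, the x-axis value of the drop would have to lie between the two current jump points. Is this possible?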

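If so, that brings me to another question: how might one represent the st. dev. of that line on this chart, if it has these 90-degree transitions and a vertical drop?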

# Plot a fitted step function on top of GC-operon data.

library(ggplot2)
library(reshape2)
library(scales)
library(RColorBrewer)
library(cumSeg)

df <- structure(list(PVC1 = 0.4019026, PVC2 = 0.4479259, PVC3 = 0.4494118, PVC4 = 0.4729437,
                 PVC5 = 0.4800556, PVC6 = 0.449229, PVC7 = 0.4905295, PVC8 = 0.4457566,
                 PVC9 = 0.4271259, PVC10 = 0.4850341, PVC11 = 0.4369965, PVC12 = 0.4064052,
                 PVC13 = 0.3743776, PVC14 = 0.3603853, PVC15 = 0.3965469, PVC16 = 0.365461),
                .Names = c("PVC1","PVC2","PVC3","PVC4","PVC5","PVC6","PVC7","PVC8",
                           "PVC9","PVC10","PVC11","PVC12","PVC13","PVC14","PVC15","PVC16"),
                class = "data.frame",
                row.names = c(NA, -1L)
            )

melted_df <- melt(df, variable.name = "Locus", value.name = "GC")
st_dev <- c(0.023031363,    0.024919217,    0.017371129,
        0.019008759,    0.026650605,    0.026904926,
        0.024227542,    0.017767553,    0.026152478,
        0.039770898,    0.023929714,    0.028845442,
        0.015572219,    0.024967336,    0.014955416,    0.024569096)

operon_gc <- 0.408891366
opgc_stdev <- 0.015712091
genome_gc <- 0.425031611
gengc_stdev <- 0.007587437

stepfunc <- jumpoints(y=melted_df$GC*100, k=1, output="1")

gc_chart <- ggplot(melted_df, aes(Locus, GC*100, fill=Locus)) +
                geom_bar(width=0.6, stat = "identity")
gc_chart <- gc_chart + ylab("GC Content (%)")
gc_chart <- gc_chart + theme(axis.text.x = element_text(angle=45, hjust=1))
gc_chart <- gc_chart + geom_abline(intercept=operon_gc*100,
                               slope=0,
                               colour="gray",
                               linetype=3,
                               show.legend =TRUE)
gc_chart <- gc_chart + geom_text(aes(15.7, 41.7, label="Operon GC"),
                             size=5,
                             color="gray")
gc_chart <- gc_chart + geom_abline(intercept=genome_gc*100,
                               slope=0,
                               colour="black",
                               linetype=3,
                               show.legend = TRUE)
gc_chart <- gc_chart + geom_text(aes(15.7, 43.3, label="Genome GC"),
                             size=5,
                             color="black")
gc_chart <- gc_chart + coord_cartesian(ylim=c(30,55))
# Error bars: GC +/- one standard deviation per locus (st_dev is in the row order of melted_df).
gc_chart <- gc_chart + geom_errorbar(width=.2, size=0.4, color="azure4", aes(Locus,
                ymin = (GC - st_dev)*100,
                ymax = (GC + st_dev)*100))
gc_chart <- gc_chart + geom_line(linetype=2,aes(x=as.numeric(Locus), y=stepfunc$fitted.values, colour="red", group=1))

gc_chart

[Image: bar chart of GC content per locus with the fitted step function drawn as a dashed red line]

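Edit: @Gregor, geom_step achieved the desired effect, thank you. Substituting step for line in the code produces:

[Image: the same chart with the fitted function drawn by geom_step]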

However, going by that chart, the breakpoint would be PVC12. Yet extracting the breakpoint value from the function gives...

> stepfunc$psi
 V 
11 

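In that case the chart is misleading, and perhaps I'm better off with the earlier version, which only shows that there is a break somewhere between 11 and 12.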
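The mismatch between the reported breakpoint (11) and the drawn drop (PVC12) can be reproduced without any plotting. The following is a minimal base-R sketch, assuming the fitted values hold one level for the first 11 loci and another for the remaining 5; the vector `fitted` and its rounded levels are illustrative stand-ins, not output from `jumpoints`.

```r
# Illustrative fitted values, assumed for this sketch: 11 loci at the first
# level, then 5 loci at the second (levels are rounded stand-ins).
fitted <- c(rep(45.2, 11), rep(37.5, 5))

# Index of the last point before the level changes; under the assumption
# above this is what a breakpoint of 11 refers to.
last_before_jump <- which(diff(fitted) != 0)

# geom_step() with its default direction = "hv" runs horizontally to the next
# x value and only then moves vertically, so the drop is drawn one index later.
drawn_at <- last_before_jump + 1

print(c(breakpoint = last_before_jump, drawn_at = drawn_at))
```

So the step geometry is not wrong as such; it simply places the vertical segment at the first point of the new level rather than between the two points.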

1 answer:

Answer 0 (score: 1)

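ggplot is very good at plotting exactly the data you give it. Your x values are 1:12, and your y values are 45, then 37. If you don't want the x values (and the change in y) to be defined at integer positions, change your values!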

step_df = data.frame(x = c(1, 11.5, 16), y = c(45.24568, 37.5, 37.5))

gc_chart + geom_step(data = step_df, linetype=2,
                     aes(x = x, y = y, colour="red", group=1),
                     inherit.aes = FALSE)

I'll leave it to you to define a suitable step_df programmatically.
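One way such a data frame might be built is sketched below. `make_step_df` is a hypothetical helper, not part of cumSeg or ggplot2, and it assumes each breakpoint is the index of the last point before a jump, so the vertical drop is placed half a unit after it. The levels could plausibly be taken from something like `unique(stepfunc$fitted.values)`, but that is an assumption; here they are passed in by hand, using the values from the answer.

```r
# Hypothetical helper (an illustration, not a cumSeg or ggplot2 function):
# build the data frame for geom_step() from breakpoints and segment levels.
make_step_df <- function(psi, levels, x_min, x_max) {
  # k breakpoints separate k + 1 constant segments.
  stopifnot(length(levels) == length(psi) + 1)
  # Assumption: psi is the last index before each jump, so the vertical
  # transition is drawn halfway between psi and psi + 1.
  x <- c(x_min, psi + 0.5, x_max)
  # Repeat the final level so the last horizontal run reaches x_max.
  y <- c(levels, levels[length(levels)])
  data.frame(x = x, y = y)
}

# Values from the question and answer: breakpoint 11, levels 45.24568 and 37.5.
step_df <- make_step_df(psi = 11, levels = c(45.24568, 37.5),
                        x_min = 1, x_max = 16)
print(step_df)
```

The resulting `step_df` has the same three rows as the hand-written one, so it can be passed to the `geom_step` call in the same way.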