How to add labels only for the first and last value of a series in R and ggplot2

Time: 2016-04-10 05:59:10

Tags: r ggplot2

I have data in CSV format

Date,Timestamp,Bugs overall,Open overall,Enhancements,Actual bugs,Needinfo,Workable bugs,Bugs with patch
2016-03-13 22:40,1457905248276,3626,454,91,363,113,250,6
2016-03-29 21:31,1459279868888,3642,453,91,362,118,244,7
2016-04-06 21:55,1459972517328,3652,447,92,355,116,239,7
2016-04-10 07:30,1460266206486,3655,446,93,353,116,237,9
...

I load the data, reshape it with melt(), and then plot it as follows

library(parsedate)
data <- read.csv("stats.csv", stringsAsFactors=FALSE, sep=",", quote="")

library(reshape2)
datam <- melt(data, id.vars="Date", measure.vars=
    # when you change the order here, also adjust it below for the legend
    c("Open.overall","Actual.bugs","Workable.bugs","Needinfo","Enhancements","Bugs.with.patch"))

library(ggplot2)

ggplot(datam, aes(x=parse_date(Date, approx = TRUE), y=value, colour=variable)) +
    # draw lines with dots and inner white-circles to get a "subway map"-like effect
    geom_line(size = 2) +
    geom_point(size = 3) +
    geom_point(size = 1.50, color = "white") +
    # add fitted regression lines
    geom_smooth(data=subset(datam, parse_date(Date, approx = TRUE) >= parse_date("2015-08-10 10:21")), 
                            method="lm", level=0.99, linetype="dashed") +
    # pin the y-axis at zero
    expand_limits(y=0) +
    # start after the change in how we compute the values, it's actually at 2015-08-10 10:21
    #xlim(parse_date("2015-08-19 00:00"), max(parse_date(datam$Date, approx = TRUE))) +
    # specify the label for both axes
    xlab("Date") +
    ylab("Number of issues") +
    # add more ticks on the y-axis
    scale_y_continuous(breaks=seq(0,1000,by=100)) +
    # set a title for the graph
    ggtitle("Open bugs in Apache POI") + 
    # set the default black/white theme
    theme_bw() +
    # legend styling
    theme(legend.position="right",
          legend.key = element_rect(fill = "white")) +
    scale_colour_discrete(labels=
        # this list needs to match the order used above!
        c("Open","Bugs","Workable Bugs","Needinfo","Enhancements","Bugs with Patch")) +
    guides(colour=guide_legend(title=NULL))

This produces a nice chart


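Now I would like to add data labels, but only for the first and last item in each series, so that the chart is not cluttered with numbers yet still shows the starting point and the current state.

I see that there are geom_text() and geom_label(), but I could not find out how to apply them only to the first and last item in each series.

I would like to avoid hard-coding dates, because the underlying data is updated with newer values from time to time.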

1 answer:

Answer 0: (score: 0)

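Create a data.frame with exactly the same structure as the one you use in the plot, but containing only the first and last elements, i.e. the ones you want shown as labels. Pass this data.frame to geom_label via its data argument and map the label aesthetic to value.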
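
One way this could look is sketched below. It is a minimal example rather than a drop-in patch for the plot in the question: it assumes the melted data frame already has Date parsed to a date-time, it uses only two series with the four sample rows from the CSV, and labels_df is a name made up for illustration. The first and last rows are found by taking the minimum and maximum date per series, so no dates are hard-coded and the labels move along when new rows are added.

```r
library(ggplot2)

# Stand-in for the melted data (values copied from the CSV sample in the question).
# Assumption: Date is already a POSIXct column rather than a character string.
datam <- data.frame(
  Date = rep(as.POSIXct(c("2016-03-13 22:40", "2016-03-29 21:31",
                          "2016-04-06 21:55", "2016-04-10 07:30"),
                        tz = "UTC"), times = 2),
  variable = rep(c("Open.overall", "Actual.bugs"), each = 4),
  value = c(454, 453, 447, 446, 363, 362, 355, 353)
)

# Work on numeric timestamps so min/max can be computed per series with ave()
t <- as.numeric(datam$Date)
is_endpoint <- t == ave(t, datam$variable, FUN = min) |
               t == ave(t, datam$variable, FUN = max)

# labels_df (hypothetical name) has the same columns as datam,
# but only the first and last row of each series
labels_df <- datam[is_endpoint, ]

p <- ggplot(datam, aes(x = Date, y = value, colour = variable)) +
  geom_line() +
  geom_point() +
  # x, y and colour are inherited from the main aes(); only label is added here
  geom_label(data = labels_df, aes(label = value), show.legend = FALSE)

print(p)
```

With the data from the question, Date is still a character column, so it would presumably need to be parsed once (for example with parse_date(Date, approx = TRUE), as in the plot code) before computing the per-series minimum and maximum. Swapping geom_label for geom_text should work the same way if the boxed labels are too heavy.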