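My data frame looks like the one below. The build changes once or twice a week. Whenever it changes, the change needs to be marked in the ggplot chart, for example by overlaying a scatter plot (let me know if there is a better idea) that uses the same x-axis (date).
To do this, I want to add one more column to this data frame.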
date build Runtime
2 2013-07-16 build-2013-07-09-1332 672.918
4 2013-07-17 build-2013-07-15-0510 696.924
6 2013-07-18 build-2013-07-15-0510 736.720
8 2013-07-19 build-2013-07-18-1644 693.206
10 2013-07-20 build-2013-07-18-1644 699.332
12 2013-07-24 build-2013-07-22-0510 712.388
14 2013-07-25 build-2013-07-22-0510 711.573
16 2013-07-26 build-2013-07-22-0510 715.223
18 2013-07-27 build-2013-07-22-0510 715.180
20 2013-07-31 build-2013-07-29-0510 717.888
22 2013-08-01 build-2013-07-29-0510 716.315
24 2013-08-02 build-2013-07-29-0510 719.216
26 2013-08-03 build-2013-07-29-0510 716.073
28 2013-08-07 build-2013-08-05-0510 717.566
I added another column named BuildChange, as shown below, using an awk command.
cat q.txt | awk 'BEGIN{CDB=""}{if($3 != CDB){print $2","$3","$4","1}else{print $2","$3","$4","0}CDB=$3;}'
date build Runtime BuildChange
2 2013-07-16 build-2013-07-09-1332 672.918 5
4 2013-07-17 build-2013-07-15-0510 696.924 5
6 2013-07-18 build-2013-07-15-0510 736.720
8 2013-07-19 build-2013-07-18-1644 693.206 5
10 2013-07-20 build-2013-07-18-1644 699.332
12 2013-07-24 build-2013-07-22-0510 712.388 5
14 2013-07-25 build-2013-07-22-0510 711.573
16 2013-07-26 build-2013-07-22-0510 715.223
18 2013-07-27 build-2013-07-22-0510 715.180
20 2013-07-31 build-2013-07-29-0510 717.888 5
22 2013-08-01 build-2013-07-29-0510 716.315
24 2013-08-02 build-2013-07-29-0510 719.216
26 2013-08-03 build-2013-07-29-0510 716.073
28 2013-08-07 build-2013-08-05-0510 717.566 5
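I would like to do the same thing in a for loop. Is there a better way to add one more column and show the build changes in the chart?
This is the resulting plot, but I want it without the top and right axes.
The dput() of my data frame: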
structure(list(date = structure(1:28, .Label = c("2013-07-16",
"2013-07-17", "2013-07-18", "2013-07-19", "2013-07-20", "2013-07-24",
"2013-07-25", "2013-07-26", "2013-07-27", "2013-07-31", "2013-08-01",
"2013-08-02", "2013-08-03", "2013-08-07", "2013-08-08", "2013-08-09",
"2013-08-10", "2013-08-14", "2013-08-15", "2013-08-16", "2013-08-17",
"2013-08-21", "2013-08-22", "2013-08-23", "2013-08-24", "2013-08-28",
"2013-08-29", "2013-08-30", "2013-08-31", "2013-09-04", "2013-09-05",
"2013-09-06", "2013-09-07", "2013-09-11", "2013-09-12", "2013-09-13",
"2013-09-18", "2013-09-19", "2013-09-20", "2013-09-21", "2013-09-25",
"2013-09-26", "2013-09-27", "2013-09-28", "2013-10-02", "2013-10-03",
"2013-10-04", "2013-10-05", "2013-10-09", "2013-10-10", "2013-10-11",
"2013-10-12", "2013-10-16", "2013-10-17", "2013-10-18", "2013-10-19",
"2013-10-23", "2013-10-24", "2013-10-25", "2013-10-26", "2013-10-30",
"2013-10-31", "2013-11-01", "2013-11-02", "2013-11-06", "2013-11-07",
"2013-11-08", "2013-11-09"), class = "factor"), build = structure(c(1L,
2L, 2L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L,
7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 9L, 9L, 9L), .Label = c("build-2013-07-09-1332",
"build-2013-07-15-0510", "build-2013-07-18-1644",
"build-2013-07-22-0510", "build-2013-07-29-0510",
"build-2013-08-05-0510", "build-2013-08-13-1329",
"build-2013-08-20-0510", "build-2013-08-27-0510",
"build-2013-09-03-1340", "build-2013-09-10-1326",
"build-2013-09-17-0510", "build-2013-09-26-0510",
"build-2013-10-08-1359", "build-2013-10-14-0510",
"build-2013-10-18-1437", "build-2013-10-18-1437-PLUS-11259-11737",
"build-2013-10-28-0510", "build-2013-11-04-0510"
), class = "factor"), Runtime = c(672.918, 696.924, 736.72, 693.206,
699.332, 712.388, 711.573, 715.223, 715.18, 717.888, 716.315,
719.216, 716.073, 717.566, 723.644, 720.374, 726.145, 710.658,
715.002, 718.742, 727.297, 711.684, 714.743, 715.815, 726.467,
742.33, 746.352, 749.55)), .Names = c("date", "build", "Runtime"
), row.names = c(2L, 4L, 6L, 8L, 10L, 12L, 14L, 16L, 18L, 20L,
22L, 24L, 26L, 28L, 30L, 32L, 34L, 36L, 38L, 40L, 42L, 44L, 46L,
48L, 50L, 52L, 54L, 55L), class = "data.frame")
Answer 0 (score: 1)
How about this: the square shows the average runtime for that build. Note that no new column is needed.
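require(plyr)
require(ggplot2)

# Convert the factor column to Date so the x-axis is a real date scale
df1$date <- as.Date(df1$date)

ggplot(data = df1) +
  geom_line(aes(date, Runtime)) +
  # One square per build: first date the build appears, mean runtime of that build
  geom_point(data = ddply(df1, .(build), summarize,
                          firstdate = min(date), avruntime = mean(Runtime)),
             aes(firstdate, avruntime),
             shape = 22,
             size = 5,
             fill = "red")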
Answer 1 (score: 1)
If df is your data frame, then something along these lines should get you started.
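library(ggplot2)

# identify change in build: 1 where the build differs from the previous row
# (the first row always counts as a change), NA otherwise
df$buildchange <- c(1, as.integer(diff(as.integer(df$build)) != 0))
df[df$buildchange == 0, "buildchange"] <- NA

# plot
p1 <- ggplot(
  data = df,
  aes(x = date)) +
  geom_line(
    aes(
      y = Runtime,
      group = 1,
      colour = "Runtime")
  ) +
  geom_point(
    aes(
      y = Runtime * buildchange,   # NA rows are simply not drawn
      colour = "Build Change"),
    size = 5                       # fixed size, so it stays outside aes()
  ) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
p1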
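As a quick check of the change-detection step on its own, here is a small sketch that compares each build name with the one before it. It assumes the build column has been converted to character; the vector names build and changed are illustrative only.

```r
# First five build names from the question, as plain character strings.
build <- c("build-2013-07-09-1332", "build-2013-07-15-0510",
           "build-2013-07-15-0510", "build-2013-07-18-1644",
           "build-2013-07-18-1644")

# TRUE where the build differs from the previous row; the first row counts as a change.
changed <- c(TRUE, build[-1] != build[-length(build)])

# Same encoding as the answer above: 1 for a change, NA otherwise.
buildchange <- ifelse(changed, 1, NA)
print(data.frame(build, changed, buildchange))
```

Comparing names directly does not depend on the order of the factor levels, so it also works if build names do not sort chronologically.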
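Answer 2 (score: 1)
I would put a mark on the x-axis every time a new build is released. From the original plot you can extract the minimum of the y range and place all the marks at this minimum, so the code is generic and keeps working when the runtime drops below the current low. Put the original data into Dat.
Here is the code: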
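# Assumes date is still a factor, as in the dput(); convert it so both layers share a date axis
Dat$date <- as.Date(Dat$date)
p <- ggplot(Dat, aes(date, Runtime)) + geom_line()

# Extract the date part (fields 2 to 4) from names like "build-2013-07-09-1332"
buildElements <- strsplit(as.character(Dat$build), split = "-")
pasteBE <- function(x) paste(x[2], x[3], x[4], sep = "-")

# Lower end of the y range of the plot; in recent ggplot2 versions this may
# instead be found under ggplot_build(p)$layout$panel_params[[1]]$y.range[1]
Dat2 <- data.frame(
  newBuild = as.Date(unique(sapply(buildElements, pasteBE))),
  yMin = ggplot_build(p)$panel$ranges[[1]]$y.range[1])
p + geom_point(data = Dat2, aes(newBuild, yMin), col = "red", size = 2)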
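The date extraction can be tried in isolation. The sketch below uses three build names taken from the factor levels in the question's dput(); the variable names parts, build_dates and unique_dates are illustrative only.

```r
# Three build names from the question's factor levels, including the "-PLUS-" variant.
builds <- c("build-2013-10-14-0510",
            "build-2013-10-18-1437",
            "build-2013-10-18-1437-PLUS-11259-11737")

# Split on "-" and keep fields 2 to 4 (year, month, day).
parts <- strsplit(builds, split = "-")
build_dates <- as.Date(vapply(parts,
                              function(x) paste(x[2], x[3], x[4], sep = "-"),
                              character(1)))

# Builds that share a date collapse into a single mark.
unique_dates <- unique(build_dates)
print(unique_dates)
```

Note that the marks land on the date encoded in the build name, not on the first date the build shows up in the data, and that two builds from the same day are drawn as one mark.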
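Answer 3 (score: 0)
Never mind the dput(); it does not include the BuildChange column.
I think you want something like this? The missing BuildChange values are imported as NA, so they are not drawn as points. I then simply compute the maximum runtime and nudge the points above the line so that the y-axis scales nicely. Hacky but fine... it's lunchtime.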
# I added an id identifier to the first column of q.txt
DF <- read.table('q.txt', header = TRUE, fill = TRUE,
                 colClasses = c('numeric', 'Date', 'factor', 'numeric', 'numeric'))
library(ggplot2)
theme_set(theme_bw())
m <- max(DF$Runtime)
qplot(date, Runtime, geom = 'line', data = DF) +
  geom_point(aes(x = date, y = BuildChange + m + 5))
This gives me: