Output of dput(x):
structure(list(Host = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L), .Label = "A", class = "factor"), TimeStamp = structure(c(1L,
2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L), .Label = c("1/11/2013",
"1/12/2013", "1/13/2013", "1/14/2013", "1/15/2013"), class = "factor"),
Instance = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L), .Label = c("/application", "/db"), class = "factor"),
Free_Space = c(5048L, 5049L, 6000L, 4800L, 5100L, 317659L,
340000L, 350000L, 356666L, 370000L), Used_Space = c(3017L,
56000L, 60000L, 55000L, 54000L, 271657L, 150000L, 175000L,
165000L, 189999L), Total_Space = c(8064L, 61049L, 66000L,
59800L, 59100L, 589316L, 490000L, 525000L, 521666L, 559999L
)), .Names = c("Host", "TimeStamp", "Instance", "Free_Space",
"Used_Space", "Total_Space"), class = "data.frame", row.names = c(NA,
-10L))
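I have this data frame. I derive the column Total_Space by adding Free_Space and Used_Space for each combination of Host, TimeStamp and Instance, using the data.table package.
library(data.table)
x<-data.table(x)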
x<-x[,Total_Space:=Free_Space+Used_Space, by=c("Host", "Instance", "TimeStamp")]
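I would like to use ggplot with facet_wrap from ggplot2 to plot the used space in GB, and to draw a horizontal line (geom_hline) at Total_Space so that users can see how much space there is in total.
For example, I am doing this: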
ggplot(x, aes(TimeStamp, Used_Space/1024, group=Instance)) + geom_area(fill="blue") + geom_smooth(method="lm", colour="orange",se=T, size=1) + geom_hline(data=x, aes(yintercept = Total_Space/1024), col="red")+ facet_wrap(~Host+Instance, ncol=3, scales="free")
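The problem I see is that, because Total_Space changes over time, I get multiple geom_hline lines for the same instance and host.
My question is: when drawing geom_hline for each instance and host, how do I select the latest timestamp? I need geom_hline to show only the latest Total_Space.
I tried this approach: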
x<-x[,LatestTS:=tail(p[order(p$TimeStamp),],1)$Total_Space, by=c("Host", "Instance", "TimeStamp")]
It did not work. It picks the same number for all instances.
Answer 0 (score: 3)
First, my solution is to convert your TimeStamp column to a date:
x$TimeStamp<-as.Date(x$TimeStamp,format="%m/%d/%Y")
Then, since your data object is a data.table, you can group the data by Host and Instance and keep only the rows where TimeStamp equals its maximum within each group.
x[,.SD[TimeStamp==max(TimeStamp)],by="Host,Instance"]
Host Instance TimeStamp Free_Space Used_Space Total_Space
1: A /application 2013-01-15 5100 54000 59100
2: A /db 2013-01-15 370000 189999 559999
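The same per-group selection can be sketched on a small made-up table to see what it does in isolation. The names d and latest and the values below are hypothetical and are not taken from the question's data; this is only an illustration of the idiom, assuming data.table is installed.

```r
library(data.table)

# Hypothetical toy data (not the asker's table): two instances, three dates each
d <- data.table(
  Host = rep("A", 6),
  Instance = rep(c("/application", "/db"), each = 3),
  TimeStamp = rep(as.Date(c("2013-01-11", "2013-01-12", "2013-01-13")), times = 2),
  Total_Space = c(10L, 20L, 30L, 100L, 200L, 300L)
)

# Within each Host/Instance group, keep the row(s) with the latest TimeStamp
latest <- d[, .SD[TimeStamp == max(TimeStamp)], by = .(Host, Instance)]
print(latest)
```

Note that the grouping uses Host and Instance only. If TimeStamp were also listed in by, every row would form its own group and max(TimeStamp) would simply return that row's own date, so nothing would be filtered out.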
Now you can use these rows as the data for geom_hline(). With scale_x_date() you also get more control over the x-axis scale.
library(ggplot2)
library(scales)
ggplot(x, aes(TimeStamp, Used_Space/1024, group=Instance)) +
geom_area(fill="blue") + geom_smooth(method="lm", colour="orange",se=T, size=1) +
geom_hline(data=x[,.SD[TimeStamp==max(TimeStamp)],by="Host,Instance"], aes(yintercept = Total_Space/1024), col="red")+
facet_wrap(~Host+Instance, ncol=3, scales="free") +
scale_x_date(labels = date_format("%m/%d/%Y"))