Year-on-year comparison of a time series with ggplot2 in R

Asked: 2015-03-18 14:27:23

Tags: r performance ggplot2 line time-series

My df:

> head(merged)
        Date patch     prod workmix_pct jobcounts travel FWIHweeklyAvg              month year
1 2013-03-29  BVG1 2.932208         100      9480   30.7      1.627024              March 2013
2 2013-03-29 BVG11 2.769156          10       968   34.3      4.475714              March 2013
3 2013-03-29 BVG12 2.857344          16      1551   33.8      3.098571              March 2013
4 2013-03-29 BVG13 2.870111          13      1267   29.1      1.361429              March 2013
5 2013-03-29 BVG14 3.011260          17      1625   28.1      1.550000              March 2013
6 2013-03-29 BVG15 3.236246          21      1946   24.9      1.392857              March 2013

I am trying to plot a year-on-year comparison of the prod column. I have data from March 2013 to March 2015.

This is what I tried:

library(ggplot2)
library(scales)  # provides date_format()

ggplot(data = merged, aes(Date, prod)) + # dataframe
  geom_line(data = merged[merged$patch %in% c("BVG1"), ], aes(y = prod, colour = "red"), lwd = 1.3) + # select BVG1
  geom_smooth() +
  scale_x_date(labels = date_format("%b-%Y"), breaks = "1 month") + # how many breaks and Date format
  ylab("Actual Prod") +
  ggtitle("Scotland's Overall Performance Financial Year\n2013/14 Vs 2014/15") +
  theme(axis.title.y = element_text(size = 25, vjust = 0.3, face = "bold", color = "red"),
        axis.text.y = element_text(size = 25, color = "blue"),
        plot.title = element_text(lineheight = .8, face = "bold", color = "red", size = 45, vjust = 1),
        legend.text = element_text(size = 35)) +
  theme(legend.position = "none")

This gave me the following plot:

enter image description here

Now I would like to plot 2013 vs 2014 and 2014 vs 2015, and finally 2013 vs 2015.

This is what I tried:

ggplot(data = merged, aes(Date)) + # dataframe
  geom_line(data = merged[merged$year == 2013, ], aes(y = prod, colour = "red"), lwd = 1.3) + # select 2013
  geom_line(data = merged[merged$year == 2014, ], aes(y = prod, colour = "blue"), lwd = 1.3) + # select 2014
  scale_x_date(labels = date_format("%b-%Y"), breaks = "1 month") + # how many breaks and Date format
  ylab("Actual Prod") +
  ggtitle("Scotland's Overall Performance Financial Year\n2013/14 Vs 2014/15") +
  theme(axis.title.y = element_text(size = 25, vjust = 0.3, face = "bold", color = "red"),
        axis.text.y = element_text(size = 25, color = "blue"),
        plot.title = element_text(lineheight = .8, face = "bold", color = "red", size = 45, vjust = 1),
        legend.text = element_text(size = 35)) +
  theme(legend.position = "none")

This is what I got:

enter image description here

It would be nice to have something like the following:

enter image description here

enter image description here

But with a monthly view rather than a weekly one.

Any help or ideas would be greatly appreciated.

Many thanks.

Update

Following Ruthger Righart's answer, I did the following:

library(dplyr)

mergedYearonYearProdMeans <- merged %>%
  group_by(year, month) %>%
  mutate(MonthlyAve = mean(prod))

# unique() keeps the months in order of first appearance and avoids duplicated factor levels
ordered.months <- factor(mergedYearonYearProdMeans$month,
                         levels = unique(as.character(mergedYearonYearProdMeans$month)))

ggplot(data = mergedYearonYearProdMeans, aes(ordered.months, MonthlyAve, group = year, shape = year, color = year)) + # dataframe
  geom_line() +
  scale_color_manual(values = c("red", "blue", "green"))

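I would like the chart to start from January. Also, the 2015 prod line should cover only January, February and March; the green line should not be drawn for the other months, as it is below.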

enter image description here

1 answer:

Answer 0 (score: 2)

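Preparing the data is usually the most important part of this kind of plot. Looking at your data, I think you need to compute the mean "prod" value as a function of year and month. This step can be done with the ddply function from the plyr package. A simple data example to show how it works: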

library(plyr)

dat <- data.frame(year = c("2012", "2012", "2012", "2012", "2012", "2012"),
                  month = c("Jan", "Jan", "Jan", "Feb", "Feb", "Feb"),
                  prod = c(2.00, 1.00, 3.00, 0.50, 1.50, 2.00))

# one mean prod value per year/month combination
newdat <- ddply(dat, .(year, month), summarize, prod = mean(prod))
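If the year and month have to be derived from a Date column, as in a data frame shaped like the asker's merged, the same aggregation can be sketched in base R. This is only an illustration: the data frame toy and its values are invented here, and aggregate() is swapped in for ddply.

```r
# Toy stand-in for a data frame like `merged` (values invented for illustration)
toy <- data.frame(
  Date = as.Date(c("2013-03-29", "2013-03-29", "2013-04-05",
                   "2014-03-28", "2014-04-04", "2014-04-11")),
  prod = c(2.9, 2.7, 3.1, 3.0, 3.2, 3.4)
)

# Derive year and month from the date; indexing month.abb by the month number
# avoids locale-dependent month names
toy$year  <- format(toy$Date, "%Y")
toy$month <- factor(month.abb[as.integer(format(toy$Date, "%m"))],
                    levels = month.abb)

# One mean prod value per year/month combination that occurs in the data
monthly <- aggregate(prod ~ year + month, data = toy, FUN = mean)
monthly
```

Only year/month combinations that actually occur in the data get a row, so a year with partial coverage simply has fewer rows.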

After this step, newdat should hold one mean "prod" value per year and month, in a format that ggplot can plot directly. I created a new simplified data example that has the same format:

# 12 months for each of two years; prod is numeric so the y-axis is continuous
df <- data.frame(year = rep(c("2012", "2013"), each = 12),
                 month = rep(month.abb, times = 2),
                 prod = c(0.33, 0.24, 0.36, 0.22, 0.31, 0.28, 0.39, 0.25, 0.23, 0.22, 0.46, 0.52,
                          0.61, 0.18, 0.59, 0.55, 0.13, 0.14, 0.56, 0.42, 0.41, 0.48, 0.59, 0.65))

A factor should be created to get the months in the correct order on the x-axis (otherwise ggplot orders the months alphabetically):

# levels must be unique; unique() keeps the months in order of first appearance
ordmonth <- factor(df$month, levels = unique(as.character(df$month)))
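To see why the explicit level order matters, here is a small base-R sketch. The vector name mon is made up for illustration; month.abb is R's built-in vector of abbreviated English month names.

```r
mon <- c("Jan", "Feb", "Mar", "Apr")

# Default: levels are sorted alphabetically
alphabetical <- levels(factor(mon))
alphabetical   # "Apr" "Feb" "Jan" "Mar"

# Explicit levels: calendar order, including months not yet present in the data
calendar <- levels(factor(mon, levels = month.abb))
calendar[1:4]  # "Jan" "Feb" "Mar" "Apr"
```

Using month.abb (or month.name for full names such as "March") as the levels also makes the axis start at January regardless of which month appears first in the data.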

library(ggplot2)

p <- ggplot(data = df, aes(x = ordmonth, y = prod, group = year, shape = year, color = year)) +
  geom_line()
p <- p + scale_color_manual(values = c("red", "blue"))
p  # display the plot

enter image description here
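For the situation described in the question's update (an axis that starts in January, and a final year covering only a few months), one possible sketch follows. It assumes the data has already been reduced to one row per year and month; the data frame monthly and its prod values are invented for illustration and are not the asker's data.

```r
library(ggplot2)

# Invented monthly means: Mar-Dec 2013, all of 2014, Jan-Mar 2015
monthly <- data.frame(
  year  = c(rep("2013", 10), rep("2014", 12), rep("2015", 3)),
  month = c(month.abb[3:12], month.abb, month.abb[1:3]),
  prod  = 2.5 + (seq_len(25) %% 5) / 10
)

# Calendar order on the x-axis, starting at January
monthly$month <- factor(monthly$month, levels = month.abb)
# Year as a discrete variable so it can be mapped to manual colours
monthly$year <- factor(monthly$year)

p <- ggplot(monthly, aes(x = month, y = prod, group = year, colour = year)) +
  geom_line() +
  scale_colour_manual(values = c("2013" = "red", "2014" = "blue", "2015" = "green")) +
  ylab("Mean prod")
p
```

Because 2015 has rows only for January to March, geom_line() has nothing to connect beyond March for that year. With real data, a summarising step (ddply with summarize, or dplyr's summarise instead of mutate) would produce this one-row-per-month shape.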