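Working with quarterly data

Date: 2015-07-03 03:52:40

Tags: r ggplot2

I have a dataset of counts for two groups, aggregated to quarterly counts. The Date_Qtr variable was derived from a larger dataset using lubridate. The data frame is shown below.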

dat = structure(list(Group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("TypeA", 
"TypeB"), class = "factor"), Date_Qtr = c(2011.1, 2011.2, 2011.3, 
2011.4, 2012.1, 2012.2, 2012.3, 2012.4, 2013.1, 2013.2, 2013.3, 
2013.4, 2014.1, 2014.2, 2014.3, 2014.4, 2015.1, 2015.2, 2011.1, 
2011.2, 2011.3, 2011.4, 2012.1, 2012.2, 2012.3, 2012.4, 2013.1, 
2013.2, 2013.3, 2013.4, 2014.1, 2014.2, 2014.3, 2014.4, 2015.1, 
2015.2), Counts = c(105L, 82L, 72L, 79L, 93L, 118L, 81L, 96L, 
84L, 83L, 84L, 81L, 99L, 103L, 111L, 80L, 127L, 107L, 54L, 51L, 
64L, 64L, 53L, 65L, 78L, 63L, 92L, 61L, 80L, 71L, 88L, 66L, 67L, 
57L, 75L, 59L)), .Names = c("Group", "Date_Qtr", "Counts"), class = "data.frame", row.names = c(NA, 
-36L))

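I plotted a time series in ggplot2 as shown below, with the Date_Qtr variable on scale_x_continuous. Previously, when I plotted monthly data, it was easy to assign breaks by quarter.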

ggplot(dat, aes(x = Date_Qtr, y = Counts)) + 
  geom_point( aes( color = Group ), size = 3) + 
  geom_line(aes(color = Group), size = 0.8) +
  scale_y_continuous("Number of things", 
                     limits = c(0, 150)) +
  scale_x_continuous("Year and quarter when things were counted") +
  theme_bw() + 
  theme(axis.text.x = element_text(angle = 45, vjust = 0.5),
        legend.title = element_blank(),
        legend.position = c(0.4, 0.85))

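Is it possible to label the continuous scale with the actual quarter for each data point, preferably in a format such as "Jan-Mar 2012"?

Thanks in advance.

2 answers:

Answer 0: (score: 6)

You can use a Date for the x-axis: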

library(ggplot2)
library(scales)
library(zoo)

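make_date <- function(x) {
  # Date_Qtr encodes quarters as year + 0.1 ... 0.4
  year <- floor(x)
  # Rescale the fraction to 0.125, 0.375, 0.625, 0.875 (mid-quarter),
  # so as.yearqtr() maps each value to the intended quarter
  x <- year + (x - year)/0.4 - 0.125
  # First day of the quarter as a Date
  as.Date(as.yearqtr(x))
}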

format_quarters <- function(x) {
  x <- as.yearqtr(x)
  year <- as.integer(x)
  quart <- as.integer(format(x, "%q"))

  paste(c("Jan-Mar","Apr-Jun","Jul-Sep","Oct-Dec")[quart], 
        year)
}


ggplot(dat, aes(x = make_date(Date_Qtr), y = Counts)) + 
  geom_point( aes( color = Group ), size=3) + 
  geom_line(aes(color = Group), size=0.8) +
  scale_y_continuous("Number of things", 
                     limits=c(0,150)) +
  scale_x_date("Year and quarter when things were counted",
                     breaks = date_breaks("3 months"),
                     labels = format_quarters) +
  theme_bw() + 
  theme(axis.text.x = element_text(angle=45, vjust = 0.5),
        legend.title=element_blank(),
        legend.position = c(.4,0.85))
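
If you would rather not depend on zoo, the same conversion can be sketched in base R. This is only an illustration, assuming Date_Qtr always has the form year + quarter/10; the helper names qtr_to_date and label_quarter are hypothetical and not part of the answer above.

```r
# Hypothetical helpers (not from the answer above); base R only.

# Convert codes such as 2011.3 to the first day of that quarter.
qtr_to_date <- function(x) {
  year <- floor(x)
  qtr  <- round((x - year) * 10)        # 2011.3 -> 3; round() absorbs floating-point noise
  stopifnot(all(qtr %in% 1:4))          # assumption: only quarter digits 1-4 occur
  first_month <- (qtr - 1) * 3 + 1      # 1, 4, 7, 10
  as.Date(sprintf("%d-%02d-01", as.integer(year), as.integer(first_month)))
}

# Turn a Date into a label such as "Jan-Mar 2012".
label_quarter <- function(d) {
  month <- as.integer(format(d, "%m"))
  qtr   <- (month - 1) %/% 3 + 1
  paste(c("Jan-Mar", "Apr-Jun", "Jul-Sep", "Oct-Dec")[qtr], format(d, "%Y"))
}

d <- qtr_to_date(c(2011.1, 2011.4, 2012.2))
print(d)
print(label_quarter(d))
```

In principle, qtr_to_date(Date_Qtr) could replace make_date(Date_Qtr) in aes(), and label_quarter could be passed as the labels argument of scale_x_date, but I have not checked that against every ggplot2 version.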


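Answer 1: (score: 2)

You can get the labels you want by adding a labels argument to scale_x_continuous.

Another issue is that Date_Qtr uses 0.1, 0.2, 0.3 and 0.4 for the quarters, so the points are not evenly spaced on the x-axis: consecutive quarters within a year are 0.1 apart, but the step from the fourth quarter to the first quarter of the next year is 0.7. To fix this, I added a Date_Qtr_New column in which the quarters are spaced correctly, 0.25 apart.

I also moved the axis titles into a separate labs statement, just to reduce clutter.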

# Create new date-quarter values representing actual numerical distance in time
dat$Date_Qtr_New = floor(dat$Date_Qtr) + (as.numeric(gsub(".*\\.([1-4])","\\1", dat$Date_Qtr)) - 1) * 0.25

ggplot(dat, aes(x = Date_Qtr_New, y = Counts)) + 
  geom_point( aes( color = Group ), size=3) + 
  geom_line(aes(color = Group), size=0.8) +
  scale_y_continuous(limits=c(0,150)) +
  # Set quarterly breaks and use labels argument to get the labels we want
  scale_x_continuous(breaks=seq(2011,2016.75,0.25), 
                     labels=paste(c("Jan-Mar","Apr-Jun","Jul-Sep","Oct-Dec"), 
                                  rep(2011:2016,each=4))) +
  labs(x="Year and quarter when things were counted",
       y="Number of things") +
  theme_bw() + 
  theme(axis.text.x = element_text(angle=45, vjust = 0.5),
        legend.title=element_blank(),
        legend.position = c(.4,0.85))
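
To see the spacing problem concretely, here is a small sketch on a few sample values spanning a year boundary. It uses an arithmetic alternative to the gsub() step, which is a swapped-in technique rather than the code above; it assumes Date_Qtr always has the form year + quarter/10, and the variable names are made up for illustration.

```r
# Quarter codes as stored in Date_Qtr (year + quarter/10)
x <- c(2011.3, 2011.4, 2012.1, 2012.2)

# Gaps between consecutive quarters are uneven across the year boundary
print(round(diff(x), 2))          # 0.1 0.7 0.1

# Arithmetic alternative to gsub(); assumes the quarter digit is 1-4
year  <- floor(x)
qtr   <- round((x - year) * 10)   # round() absorbs floating-point noise
x_new <- year + (qtr - 1) * 0.25

print(x_new)
print(diff(x_new))                # every gap is now 0.25
```

With evenly spaced values, breaks at multiples of 0.25 line up with the data points, which is what the seq(2011, 2016.75, 0.25) breaks in the plot rely on.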
