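How to change the x-axis from years to months with ggplot2

Time: 2017-06-23 13:52:16

Tags: r ggplot2

I have a time plot of web traffic that shows daily page views from 2014 to the present. The code looks like this: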

 ggplot(subset(APRA, Post_Day > "2013-12-31"), aes(x = Post_Day, y = Page_Views))+
   geom_line()+
   scale_y_continuous(labels = comma)+
   ylim(0,50000)

As you can see, it is not a great plot; it would make more sense to break it down by month rather than by day. However, when I try this code:

 ggplot(subset(APRA, Post_Day > "2013-12-31"), aes(x = Post_Day, y = Page_Views))+
   geom_line()+
   scale_y_continuous(labels = comma)+
   ylim(0,50000)+
   scale_x_date(date_breaks = "1 month", minor_breaks = "1 week", labels = date_format("%B"))

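I get this error:

Error: Invalid input: date_trans works with objects of class Date only

The date field Post_Day is POSIXct, and Page_Views is numeric. The data looks like this: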

Post_Title  Post_Day    Page_Views
Title 1     2016-05-15  139
Title 2     2016-05-15  61
Title 3     2016-05-15  79
Title 4     2016-05-16  125
Title 5     2016-05-17  374
Title 6     2016-05-17  39
Title 7     2016-05-17  464
Title 8     2016-05-17  319
Title 9     2016-05-18  84
Title 10    2016-05-18  64
Title 11    2016-05-19  433
Title 12    2016-05-19  418
Title 13    2016-05-19  124
Title 14    2016-05-19  422

I would like to change the x-axis from daily granularity to monthly.

2 Answers:

Answer 0: (score: 2)

The sample data set shown in the question has multiple data points per day, so it needs to be aggregated by day in any case. For the aggregation by day or by month, data.table and lubridate are used.

Create sample data

As no reproducible example was supplied, a sample data set is created:

library(data.table)
n_rows <- 5000L
n_days <- 365L*3L
set.seed(123L)
DT <- data.table(Post_Title = paste("Title", 1:n_rows),
                 Post_Day = as.Date("2014-01-01") + sample(0:n_days, n_rows, replace = TRUE),
                 Page_Views = round(abs(rnorm(n_rows, 500, 200))))[order(Post_Day)]
DT

      Post_Title   Post_Day Page_Views
   1:   Title 74 2014-01-01        536
   2:  Title 478 2014-01-01        465
   3: Title 3934 2014-01-01        289
   4: Title 4136 2014-01-01        555
   5:  Title 740 2014-01-02        442
  ---                                 
4996: Title 1478 2016-12-31        586
4997: Title 2251 2016-12-31        467
4998: Title 2647 2016-12-31        468
4999: Title 3243 2016-12-31        498
5000: Title 4302 2016-12-31        309

Plot raw data

Without aggregation, the data can be plotted with

library(ggplot2)
ggplot(DT) + aes(Post_Day, Page_Views) + geom_line()

Aggregated by day

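ggplot(DT[, .(Page_Views = sum(Page_Views)), by = Post_Day]) + aes(Post_Day, Page_Views) + geom_line()

To aggregate by day, the grouping parameter by of data.table is used, with sum() as the aggregation function. The aggregation reduces the number of data points from 5000 to 1087, so the plot looks less cluttered.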
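As a quick check of the by-day aggregation, here is a minimal sketch that applies the same by = Post_Day grouping to the 14 rows shown in the question. The table name sample_dt is made up for this illustration, and the dates are stored as Date here, whereas the asker's real column is POSIXct.

```r
library(data.table)

# The 14 sample rows from the question; `sample_dt` is a hypothetical name.
sample_dt <- data.table(
  Post_Day = as.Date(c(rep("2016-05-15", 3), "2016-05-16", rep("2016-05-17", 4),
                       rep("2016-05-18", 2), rep("2016-05-19", 4))),
  Page_Views = c(139, 61, 79, 125, 374, 39, 464, 319, 84, 64, 433, 418, 124, 422)
)

# One row per day, with the page views of all posts on that day summed up.
daily <- sample_dt[, .(Page_Views = sum(Page_Views)), by = Post_Day]
print(daily)
```

The 14 rows collapse to 5 daily totals, which is the same effect the answer reports for the larger simulated data set.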

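Aggregated by month

ggplot(DT[, .(Page_Views = sum(Page_Views)), by = .(Post_Month = lubridate::floor_date(Post_Day, "month"))]) + aes(Post_Month, Page_Views) + geom_line()

To aggregate by month, the grouping parameter by is used again, but this time Post_Day is mapped to the first day of the respective month. So a Post_Day of 2014-03-26 becomes a Post_Month of 2014-03-01, which keeps its date class (Date for this sample data; a POSIXct column stays POSIXct). This keeps the x-axis continuous with a date scale, and it avoids the trouble that arises when Post_Day is converted to a factor, where the x-axis would become discrete.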
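To tie this back to the error in the question: scale_x_date() only accepts Date values, so a POSIXct column has to be converted with as.Date() first (or plotted with scale_x_datetime() instead). The following is a sketch under assumptions, not the asker's actual data: APRA_demo is an invented stand-in for APRA with a POSIXct Post_Day, and the "%b %Y" label format is just one possible choice.

```r
library(data.table)
library(ggplot2)

# Invented stand-in for the asker's data: POSIXct dates, numeric page views.
set.seed(1L)
APRA_demo <- data.table(
  Post_Day = as.POSIXct("2016-01-01", tz = "UTC") +
    86400 * sample(0:180, 500L, replace = TRUE),
  Page_Views = round(runif(500L, 10, 500))
)

# Convert POSIXct to Date, then map every date to the first day of its month.
monthly <- APRA_demo[, .(Page_Views = sum(Page_Views)),
                     by = .(Post_Month = lubridate::floor_date(as.Date(Post_Day), "month"))]
setorder(monthly, Post_Month)

# Post_Month is of class Date, so scale_x_date() accepts it.
p <- ggplot(monthly, aes(Post_Month, Page_Views)) +
  geom_line() +
  scale_x_date(date_breaks = "1 month", date_labels = "%b %Y")
print(p)
```

Because the aggregation already leaves one point per month, monthly breaks line up with the data points; for daily data the same scale_x_date() call would only change the tick labels, not the granularity of the line.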

Answer 1: (score: 0)

# Extract the two-digit month number from each date
APRA$month <- as.factor(strftime(APRA$Post_Day, "%m"))
APRA       <- APRA[order(as.numeric(as.character(APRA$month))), ]

This creates a month column for your data.

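# Sum the page views within each month
z <- sapply(split(APRA$Page_Views, APRA$month), function(x) sum(as.numeric(x)))
# Collect the totals in a data frame with one row per month
z <- data.frame(Page_Views = unname(z), month = names(z))

This creates a data frame z with the month and the page views per month.

Now plot it: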

# group = 1 joins the months into a single line, since month is discrete
ggplot(z, aes(x = month, y = Page_Views, group = 1)) + geom_line()
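One caveat with "%m": it pools the same calendar month from different years, so May 2015 and May 2016 end up in one group. A base R sketch that keeps the year and a real date axis might look like this; APRA_demo is invented stand-in data, not the asker's, and Post_Month is a column name chosen for the illustration.

```r
library(ggplot2)

# Invented stand-in for APRA, spanning two years.
APRA_demo <- data.frame(
  Post_Day = as.POSIXct(c("2015-05-15", "2015-05-20", "2016-05-15", "2016-06-01"),
                        tz = "UTC"),
  Page_Views = c(100, 50, 200, 25)
)

# First day of each post's month, as a Date (keeps the year, unlike "%m").
APRA_demo$Post_Month <- as.Date(format(APRA_demo$Post_Day, "%Y-%m-01"))

# Sum page views per month with base R's aggregate().
z <- aggregate(Page_Views ~ Post_Month, data = APRA_demo, FUN = sum)

p <- ggplot(z, aes(x = Post_Month, y = Page_Views)) + geom_line()
print(p)
```

Here the two May months stay separate, and because Post_Month is a date rather than a factor, no group aesthetic is needed for geom_line().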

Let me know if this is what you are looking for. I have not run it, so please tell me if it throws any errors.