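How to plot an average over time in ggplot2 (R)

Asked: 2017-06-21 16:42:48

Tags: r ggplot2

I want to plot the average time on site over a period of time. My dataset is called APRA. It has a column named Post_Day that holds dates as POSIXct, and a column named Visit_Time_Per_Page_(Minutes), which is numeric (num).

When I run: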

ggplot(APRA,aes(Post_Day,mean(`Visit_Time_Per_Page_(Minutes)`)))+
  geom_line()+
  labs(title = "Time on Page over Time", x = "Date", y = "Time on Page (Minutes)")

The plot I get back (image not preserved) is not what I want.

What I am after is the daily average over time.

Thanks.

Sample data:

Post_Title  Post_Day    Visit_Time_Per_Page_(Minutes)
Title 1     2016-05-15  4.7
Title 2     2016-05-15  3.8
Title 3     2016-05-15  5.3
Title 4     2016-05-16  2.9
Title 5     2016-05-17  5.0
Title 6     2017-05-17  4.3
Title 7     2017-05-17  4.7
Title 8     2017-05-17  3.0
Title 9     2016-05-18  2.9
Title 10    2016-05-18  4.0
Title 11    2016-05-19  6.1
Title 12    2016-05-19  4.7
Title 13    2016-05-19  8.0
Title 14    2016-05-19  3.3

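1 answer:

Answer 0 (score: 0)

I modified the input data by changing all the 2017 records to 2016, because that makes it easier to produce a plot as an example.

The key is to use the stat_summary function and specify both the summary function and the geom.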

# Load packages
library(dplyr)
library(ggplot2)
library(lubridate)

# Read the data
APRA <- read.table(text = "Post_Title Post_Day 'Visit_Time_Per_Page_(Minutes)'
'Title 1'    '2016-05-15'  4.7
'Title 2'     '2016-05-15'  3.8
'Title 3'     '2016-05-15'  5.3
'Title 4'     '2016-05-16'  2.9
'Title 5'     '2016-05-17'  5.0
'Title 6'    '2016-05-17'  4.3
'Title 7'     '2016-05-17'  4.7
'Title 8'     '2016-05-17'  3.0
'Title 9'     '2016-05-18'  2.9
'Title 10'    '2016-05-18'  4.0
'Title 11'    '2016-05-19'  6.1
'Title 12'    '2016-05-19'  4.7
'Title 13'    '2016-05-19'  8.0
'Title 14'    '2016-05-19'  3.3",
                 header = TRUE, stringsAsFactors = FALSE)

# Process and plot the data
APRA %>%
  mutate(Post_Day = ymd(Post_Day)) %>%
  ggplot(aes(x = Post_Day, y = Visit_Time_Per_Page_.Minutes.)) +
  geom_point() +
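  # Calculate the mean of y at each x value, set geom = line
  # (ggplot2 >= 3.3.0 uses `fun`; older versions used `fun.y`.
  #  ggplot2 >= 3.4.0 uses `linewidth` for lines; older versions used `size`.)
  stat_summary(fun = "mean", colour = "red", linewidth = 2, geom = "line")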
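
As a rough cross-check of what stat_summary does here, the daily means can also be computed up front with dplyr and then drawn with a plain geom_line(). This is only a sketch under some assumptions: the data frame is rebuilt by hand from the sample rows (with the 2017 dates already changed to 2016), and Visit_Time and daily_mean are shortened, hypothetical column names used for illustration rather than the names in the original dataset.

```r
# Sketch: pre-aggregate the daily means, then plot them with geom_line().
library(dplyr)
library(ggplot2)

# Sample data rebuilt by hand; Visit_Time is a hypothetical short name
# standing in for Visit_Time_Per_Page_(Minutes)
APRA <- data.frame(
  Post_Day = as.Date(c(rep("2016-05-15", 3), "2016-05-16",
                       rep("2016-05-17", 4), rep("2016-05-18", 2),
                       rep("2016-05-19", 4))),
  Visit_Time = c(4.7, 3.8, 5.3, 2.9, 5.0, 4.3, 4.7, 3.0,
                 2.9, 4.0, 6.1, 4.7, 8.0, 3.3)
)

# One row per day; daily_mean is a hypothetical column name
daily <- APRA %>%
  group_by(Post_Day) %>%
  summarise(daily_mean = mean(Visit_Time))

# Plot the pre-computed means as a line
p <- ggplot(daily, aes(x = Post_Day, y = daily_mean)) +
  geom_line(colour = "red") +
  labs(title = "Time on Page over Time", x = "Date",
       y = "Time on Page (Minutes)")
print(p)
```

Pre-aggregating makes the per-day values easy to inspect in the daily table, while stat_summary keeps everything in one plotting pipeline and lets the raw points stay on the same plot. By contrast, calling mean() directly inside aes() computes a single mean over the whole column, so every x value gets the same y.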