Calculate mean based on time horizons

Time: 2017-09-19 21:23:26

Tags: r dataframe range subset mean

Considering the following data.frame, I would like to calculate the mean between 2011-01-03 and 2011-01-06:

             GOOG.Open GOOG.High GOOG.Low GOOG.Close GOOG.Volume
2011-01-03    297.94    302.49   297.94     301.87          NA
2011-01-04    302.51    302.79   299.76     300.76          NA
2011-01-05    299.73    304.86   299.72     304.23          NA
2011-01-06    305.03    308.91   304.72     306.44          NA

The code mean(data$GOOG.Open, seq(from=01/03/11, to=01/06/11)) returns 529.8661, which evidently comes from different values in the data frame rather than the four rows shown. Do you know how to calculate the mean for this date range?

1 answer:

Answer 0 (score: 0)

First, you need to define how your data is stored; see: How to make a great R reproducible example?

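I am using the dplyr package from the tidyverse to analyse the data, and lubridate to parse the dates. This assumes you want to be able to change the date range that the mean is taken over.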

library(tidyverse)
library(lubridate)

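# Example data: dates stored as character strings
dat <- data.frame(date = c('2011-01-03','2011-01-04','2011-01-05','2011-01-06'), 
                  GOOG.Open = c(297.94,302.51,299.73,305.03))
dat %>% 
    # Parse to Date so the filter compares dates rather than strings
    mutate(date = ymd(date)) %>% 
    # Keep only rows inside the range (both ends inclusive)
    filter(date >= ymd('2011-01-03') & date <= ymd('2011-01-06')) %>% 
    summarise(goog_mean = mean(GOOG.Open))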
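
If you would rather not load any packages, the same date-range mean can be sketched in base R. This is only an illustration under the same assumptions as above (a `date` column stored as text in year-month-day form); `mean_between` is a hypothetical helper name, not a function from any package.

```r
# Same four rows as in the example above
dat <- data.frame(
  date = c("2011-01-03", "2011-01-04", "2011-01-05", "2011-01-06"),
  GOOG.Open = c(297.94, 302.51, 299.73, 305.03)
)

# Convert the text column to the Date class so comparisons are by date
dat$date <- as.Date(dat$date)

# Hypothetical helper: mean of one column for rows with from <= date <= to
mean_between <- function(df, column, from, to) {
  in_range <- df$date >= as.Date(from) & df$date <= as.Date(to)
  mean(df[[column]][in_range], na.rm = TRUE)
}

full_range <- mean_between(dat, "GOOG.Open", "2011-01-03", "2011-01-06")
sub_range  <- mean_between(dat, "GOOG.Open", "2011-01-04", "2011-01-05")
print(full_range)
print(sub_range)
```

Changing the `from` and `to` arguments is all that is needed to move the window; `na.rm = TRUE` is a choice made here so that missing prices inside the window do not turn the result into `NA`.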

If you just want the mean of the data provided, you can use either of:

mean(dat$GOOG.Open) 

dat %>% 
    summarise(mean = mean(GOOG.Open))
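
As a side note on why the call in the question returns an unexpected number: as far as I can tell, R reads `01/03/11` as the arithmetic expression 1 / 3 / 11, not as a date, and the second positional argument of `mean()` is `trim`, not a row selector. The sketch below uses a made-up vector `x` purely for illustration; it is not the asker's data.

```r
# R evaluates these as divisions, not dates
from <- 01/03/11   # 1 / 3 / 11
to   <- 01/06/11   # 1 / 6 / 11

# With only `from` and `to`, seq() behaves like from:to,
# which here yields a single small number
trim_arg <- seq(from = from, to = to)
print(trim_arg)

# Made-up values for illustration only
x <- c(1, 2, 3, 100)

# The second positional argument of mean() is `trim`,
# so both calls below do the same thing
by_position <- mean(x, trim_arg)
by_name     <- mean(x, trim = trim_arg)
print(identical(by_position, by_name))
```

So the original call computes a (very slightly) trimmed mean over the whole GOOG.Open column, which would explain why the result does not match the four rows shown.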