Given the following data.frame, I would like to calculate the mean between 2011-01-03 and 2011-01-06:
           GOOG.Open GOOG.High GOOG.Low GOOG.Close GOOG.Volume
2011-01-03    297.94    302.49   297.94     301.87          NA
2011-01-04    302.51    302.79   299.76     300.76          NA
2011-01-05    299.73    304.86   299.72     304.23          NA
2011-01-06    305.03    308.91   304.72     306.44          NA
The code mean(data$GOOG.Open, seq(from=01/03/11, to=01/06/11))
gives me 529.8661, which evidently refers to different values in the data frame. How can I calculate the mean over this date range?
Answer 0 (score: 0)
First, you need to define how your data is stored; see: How to make a great R reproducible example?
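I use dplyr from the tidyverse package to analyze the data, and lubridate to define the date format. This assumes you want to be able to change the dates over which the mean is taken.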
library(tidyverse)
library(lubridate)

# Example data: one row per trading day
dat <- data.frame(date = c('2011-01-03','2011-01-04','2011-01-05','2011-01-06'),
                  GOOG.Open = c(297.94,302.51,299.73,305.03))

dat %>%
  mutate(date = ymd(date)) %>%  # parse to Date so rows are compared as dates, not strings
  filter(date >= ymd('2011-01-03') & date <= ymd('2011-01-06')) %>%
  summarise(goog_mean = mean(GOOG.Open))
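If you would rather not depend on the tidyverse, the same idea can be sketched in base R. The helper name mean_between and its arguments are hypothetical, made up for illustration, and it assumes the dates sit in a column of ISO "YYYY-MM-DD" strings (or Date values) as in the example data above:

```r
# Hypothetical helper (not part of any package): mean of one column
# over an inclusive date range, using base R only.
mean_between <- function(df, column, start, end, date_col = "date") {
  dates <- as.Date(df[[date_col]])  # parse ISO "YYYY-MM-DD" strings to Date
  in_range <- dates >= as.Date(start) & dates <= as.Date(end)
  mean(df[[column]][in_range], na.rm = TRUE)
}

# Same example data as above
dat <- data.frame(
  date = c("2011-01-03", "2011-01-04", "2011-01-05", "2011-01-06"),
  GOOG.Open = c(297.94, 302.51, 299.73, 305.03)
)

# Mean over the full range, and over a narrower range
full_mean <- mean_between(dat, "GOOG.Open", "2011-01-03", "2011-01-06")
part_mean <- mean_between(dat, "GOOG.Open", "2011-01-04", "2011-01-05")
print(full_mean)
print(part_mean)
```

Changing the start and end arguments is all that is needed to average over a different window.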
If you only want the mean of the data provided, you can use:
mean(dat$GOOG.Open)
or
dat %>%
  summarise(mean = mean(GOOG.Open))
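As for why the call in the question returns an unrelated number, here is a likely explanation, sketched under the assumption that data$GOOG.Open is a plain numeric column. R reads 01/03/11 as the arithmetic expression 1 / 3 / 11 rather than as a date, and the second positional argument of mean() is trim, so the call computes a slightly trimmed mean over the entire column instead of selecting rows:

```r
# 01/03/11 is evaluated as arithmetic (1 / 3 / 11), not as a date
trim_value <- seq(from = 01/03/11, to = 01/06/11)
print(trim_value)  # a single small number, roughly 0.03

# Illustrative values only (the four opening prices from the question)
x <- c(297.94, 302.51, 299.73, 305.03)

# The second positional argument of mean() is `trim`,
# so these two calls are equivalent
trimmed_positional <- mean(x, trim_value)
trimmed_named <- mean(x, trim = trim_value)
print(trimmed_positional)
```

Nothing in that call restricts the rows by date, which would be consistent with the result reflecting the whole column rather than the four days shown.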