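I am fairly new to R. I have a large dataset that contains multiple values for each day. To simplify it, I need the mean for each day, in a table showing the date and the mean value.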
Date_Recorded Value
2016-08-19 74.2
2016-08-19 74.6
2016-08-20 85.63
2016-08-20 88.55
I would like the resulting table to look like this:
Date_Recorded Value
2016-08-19 74.4
2016-08-20 87.09
After that, how do I extract the data for the date range 2016-08-20 to 2018-02-04 from this dataset, or from any other dataset?
Answer 0: (score: 0)
We can use aggregate, which is part of the stats package that ships with base R:
Date_Recorded<-c(
"2016-08-19",
"2016-08-19",
"2016-08-20",
"2016-08-20")
Value<-c(
74.2,
74.6,
85.63,
88.55
)
df<-data.frame(Date_Recorded,Value)
df$Date_Recorded<-as.Date(df$Date_Recorded)
test_df<-aggregate(df["Value"], by=df["Date_Recorded"], FUN=mean)
> test_df
Date_Recorded Value
1 2016-08-19 74.40
2 2016-08-20 87.09
# As pointed out by @Sotos
start_date<-as.Date("2016-08-18")
end_date<-as.Date("2016-08-19")
test_df[test_df$Date_Recorded >= start_date & test_df$Date_Recorded <=
end_date, ]
Date_Recorded Value
1 2016-08-19 74.4
Credit to @Sotos for the second half of this question.
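To connect this to the range asked about in the question (2016-08-20 to 2018-02-04), here is a minimal sketch that wraps both steps in one function. The name daily_mean_in_range and its arguments are made up for illustration and are not part of the original answer; it assumes the columns are called Date_Recorded and Value, as in the example data.

```r
# Hypothetical helper (name and arguments are assumptions):
# returns the daily mean of Value for dates within [start, end], inclusive.
daily_mean_in_range <- function(data, start, end) {
  # Make sure the date column is a Date, not character or factor
  data$Date_Recorded <- as.Date(data$Date_Recorded)

  # One row per day, with the mean of that day's values
  daily <- aggregate(data["Value"], by = data["Date_Recorded"], FUN = mean)

  # Keep only the days inside the requested range
  in_range <- daily$Date_Recorded >= as.Date(start) &
    daily$Date_Recorded <= as.Date(end)
  daily[in_range, , drop = FALSE]
}

example <- data.frame(
  Date_Recorded = c("2016-08-19", "2016-08-19", "2016-08-20", "2016-08-20"),
  Value = c(74.2, 74.6, 85.63, 88.55)
)

result <- daily_mean_in_range(example, "2016-08-20", "2018-02-04")
print(result)
```

Because the function takes the data frame as an argument, the same call should work on any other dataset that uses these two column names.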
Answer 1: (score: 0)
Good answer by Chabo. Alternatively, you can use a tidyverse approach:
library(tidyverse)
Date_Recorded<-c("2016-08-19", "2016-08-19", "2016-08-20",
"2016-08-20", "2016-08-21", "2016-08-21")
Value <- c(74.2, 74.6, 85.63,
88.55, 70.1, 70.2)
df<-data.frame(Date_Recorded,Value)
df$Date_Recorded<-as.Date(df$Date_Recorded)
# To create the resulting table you wanted
df %>%
  group_by(Date_Recorded) %>%
  # Name the result column so it is not called `mean(Value, na.rm = FALSE)`
  summarise(Value = mean(Value, na.rm = FALSE))
# Or to search for a date range. You could use filter(Date_Recorded == "2018-10-02") to
# search for a single date
df %>%
  filter(Date_Recorded >= "2016-08-20" & Date_Recorded <= "2016-08-21") %>% # select a date range
  group_by(Date_Recorded) %>%
  summarise(Value = mean(Value, na.rm = FALSE))
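As a possible variation on the range filter above, dplyr also provides between(), which tests an inclusive range in a single call. The sketch below is an assumption about how one might write it, not part of the original answer; it passes explicit Date bounds rather than strings so the comparison does not rely on implicit conversion.

```r
library(dplyr)

df <- data.frame(
  Date_Recorded = as.Date(c("2016-08-19", "2016-08-19", "2016-08-20",
                            "2016-08-20", "2016-08-21", "2016-08-21")),
  Value = c(74.2, 74.6, 85.63, 88.55, 70.1, 70.2)
)

# between() is inclusive on both ends: left <= x & x <= right
daily_means <- df %>%
  filter(between(Date_Recorded, as.Date("2016-08-20"), as.Date("2016-08-21"))) %>%
  group_by(Date_Recorded) %>%
  summarise(Value = mean(Value))

print(daily_means)
```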