Calculating the average over specific date ranges in R

Time: 2018-07-02 11:49:45

Tags: r date range average

I have two data frames:

Date <- seq(as.Date("2013/1/1"), by = "day", length.out = 17)
x <- data.frame(Date)
x$discharge <- c("1000","1100","1200","1300","1400","1200","1300","1300","1200","1100","1200","1200","1100","1400","1200","1100","1400")
x$discharge <- as.numeric(x$discharge)

and:

Date2 <- c("2013-01-01","2013-01-08","2013-01-12","2013-01-17")
y <- data.frame(Date2)
y$concentration <- c("1.5","2.5","1.5","3.5")
y$Date2 <- as.Date(y$Date2)
y$concentration <- as.numeric(y$concentration)

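What I am desperately trying to do is the following:

  1. In data frame y, the first measurement covers the period from 2013-01-01 to 2013-01-07
  2. Calculate the average discharge for that period from data frame x
  3. Return that average discharge to data frame y in a new column next to the first measurement, then continue with the next measurement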
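For the first measurement alone, steps 1 and 2 might look like the sketch below. It rebuilds `x` so that it runs on its own; the names `in_first_period` and `first_mean` are only illustrative and are not part of the original question.

```r
# Rebuild the question's x so the snippet is self-contained
x <- data.frame(
  Date = seq(as.Date("2013-01-01"), by = "day", length.out = 17),
  discharge = c(1000, 1100, 1200, 1300, 1400, 1200, 1300, 1300, 1200,
                1100, 1200, 1200, 1100, 1400, 1200, 1100, 1400)
)

# Days belonging to the first measurement: 2013-01-01 through 2013-01-07
in_first_period <- x$Date >= as.Date("2013-01-01") &
  x$Date <= as.Date("2013-01-07")

# Mean discharge over those seven days
first_mean <- mean(x$discharge[in_first_period])
first_mean
# [1] 1214.286
```

The goal is to repeat this for every measurement in y without writing out each date range by hand.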

I have been looking at tools such as dplyr and apply, but can't figure it out.

1 answer:

Answer 0: (score: 2)

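library(dplyr)

# Label each day in x with the measurement period it falls in. The breaks are
# the measurement dates in y, padded with far-off sentinel dates at both ends.
x %>%
  mutate(period = cut(Date,
                      breaks = c(as.Date("1900-01-01"), y$Date2[-1], as.Date("2100-01-01")),
                      labels = seq_along(y$Date2))) %>%
  group_by(period) %>%
  mutate(meandischarge = mean(discharge, na.rm = TRUE)) %>%  # mean per period
  right_join(y, by = c("Date" = "Date2"))                    # keep measurement days only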

        Date discharge period meandischarge concentration
      <date>     <dbl> <fctr>         <dbl>         <dbl>
1 2013-01-01      1000      1      1214.286           1.5
2 2013-01-08      1300      2      1200.000           2.5
3 2013-01-12      1200      3      1200.000           1.5
4 2013-01-17      1400      4      1400.000           3.5
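As a cross-check, the same period assignment can be sketched in base R with `findInterval()`. This is an alternative to the `cut()` pipeline, not the answerer's method. It assumes `y$Date2` is sorted in increasing order and that each period runs from one measurement date up to the day before the next.

```r
# Rebuild the question's data so the snippet is self-contained
x <- data.frame(
  Date = seq(as.Date("2013-01-01"), by = "day", length.out = 17),
  discharge = c(1000, 1100, 1200, 1300, 1400, 1200, 1300, 1300, 1200,
                1100, 1200, 1200, 1100, 1400, 1200, 1100, 1400)
)
y <- data.frame(
  Date2 = as.Date(c("2013-01-01", "2013-01-08", "2013-01-12", "2013-01-17")),
  concentration = c(1.5, 2.5, 1.5, 3.5)
)

# For each day in x, the index of the latest measurement date on or before it.
# Days before the first measurement get index 0 and become NA in the factor.
period <- factor(findInterval(as.numeric(x$Date), as.numeric(y$Date2)),
                 levels = seq_len(nrow(y)))

# Mean discharge per period, one value per row of y (NA for an empty period)
y$meandischarge <- as.numeric(tapply(x$discharge, period, mean))
y
```

On this data the new column agrees with the meandischarge values in the table above (1214.286, 1200, 1200, 1400).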

If you only want the original y variables, you can do the following:

x %>%
  mutate(period = cut(Date,
                      breaks = c(as.Date("1900-01-01"), y$Date2[-1], as.Date("2100-01-01")),
                      labels = seq_along(y$Date2))) %>%
  group_by(period) %>%
  mutate(meandischarge = mean(discharge, na.rm = TRUE)) %>%
  ungroup() %>%                                          # drop grouping before selecting
  right_join(y, by = c("Date" = "Date2")) %>%
  select(Date2 = Date, concentration, meandischarge)     # restore y's column names

       Date2 concentration meandischarge
      <date>         <dbl>         <dbl>
1 2013-01-01           1.5      1214.286
2 2013-01-08           2.5      1200.000
3 2013-01-12           1.5      1200.000
4 2013-01-17           3.5      1400.000