Getting the previous month's average based on a condition

Time: 2019-01-31 10:11:10

Tags: r dataframe merge dplyr

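I have two data frames: "eastern" contains auction data for certain dates, and "monthly_agg" contains the monthly averages of those auctions. I want to merge the data frames so that the average auction price attached to each auction row is the average of the previous month, not of the auction's own month.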

For the eastern data frame, I extracted the month and year from the auction date and then concatenated month and year to form a new column.
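The extraction step described above might look roughly like the following. This is only a sketch: the sample dates are taken from the tables in this post, but the use of lubridate's month() and year() and the exact paste() call are assumptions, since the original code for this step is not shown.

```r
library(dplyr)
library(lubridate)

# Small sample of auction dates (a subset of the dates shown in this post).
eastern <- data.frame(
  date = as.Date(c("2014-10-17", "2014-11-07", "2015-01-22"))
)

eastern <- eastern %>%
  mutate(month  = month(date),               # numeric month, 1 to 12
         year   = year(date),                # four-digit year
         concat = paste(month, "-", year))   # month-year key, e.g. "10 - 2014"
```

Because paste() separates its arguments with a space by default, the key comes out as "10 - 2014" rather than "10-2014"; whichever form is used, it has to match the key in monthly_agg exactly for the join to find anything.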

eastern data-set:


I then merged the two data frames with left_join() and created a new data frame named eastern1.

After merging, the data frame "eastern1" has the following structure:

    date       month  year   concat      
    2014-10-17 10     2014  10 - 2014   
    2014-10-24 10     2014  10 - 2014
    2014-10-31 10     2014  10 - 2014   
    2014-11-07 11     2014  11 - 2014   
    2014-11-17 11     2014  11 - 2014   
    2014-11-26 11     2014  11 - 2014   
    2014-12-26 12     2014  12 - 2014
    2015-01-22 1      2015  1-2015


For the monthly_agg data frame, I calculated the monthly average for each month-year combination.

monthly_agg data-set:

date       month year   concat      prev_avgL1
2014-10-17 10     2014  10 - 2014     avg10
2014-10-24 10     2014  10 - 2014     avg10
2014-10-31 10     2014  10 - 2014     avg10
2014-11-07 11     2014  11 - 2014     avg11
2014-11-17 11     2014  11 - 2014     avg11
2014-11-26 11     2014  11 - 2014     avg11
2014-12-26 12     2014  12 - 2014     avg12
2015-01-22 1      2015  1-2015        avg1(for the new year and new month)

Thanks!

1 answer:

Answer 0 (score: 1)

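I got the desired result by first subtracting one month from the date with the %m-% operator from the lubridate package, and only then building the concatenated key.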

See the lubridate documentation for %m-%.

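library(lubridate)
library(dplyr)

# Build the key from the month BEFORE each auction date, so that the join
# below picks up the previous month's average.
eastern <- data.frame(date = c("2014-10-17" , "2014-10-24", "2014-10-31", 
                               "2014-11-07", "2014-11-17", "2014-11-26", 
                               "2014-12-26", "2015-01-22", "2015-02-12")) %>%
           mutate(date = as.Date(date),
                  # %m-% subtracts a month without overshooting month ends
                  year = year(date %m-% months(1)),
                  month = month(date %m-% months(1)),
                  concat = paste(year, "-", month))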

        date year month    concat
1 2014-10-17 2014     9  2014 - 9
2 2014-10-24 2014     9  2014 - 9
3 2014-10-31 2014     9  2014 - 9
4 2014-11-07 2014    10 2014 - 10
5 2014-11-17 2014    10 2014 - 10
6 2014-11-26 2014    10 2014 - 10
7 2014-12-26 2014    11 2014 - 11
8 2015-01-22 2014    12 2014 - 12
9 2015-02-12 2015     1  2015 - 1
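As a side note on why %m-% is used here rather than ordinary subtraction: with a plain period, a date such as 2014-10-31 would map to the non-existent 2014-09-31. A small sketch of the difference (the variable names are only illustrative):

```r
library(lubridate)

d <- as.Date("2014-10-31")

# Plain period arithmetic targets 2014-09-31, which does not exist,
# so lubridate returns NA.
plain <- d - months(1)

# %m-% rolls back to the last valid day of the target month instead.
rolled <- d %m-% months(1)

is.na(plain)   # TRUE
rolled         # "2014-09-30"
```

That is why row 3 in the output above (2014-10-31) still gets month 9 instead of a missing value.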

If you join this with the following (monthly_agg, or aggs, or whatever you call it)

    avg    concat
1 avg10 2014 - 10
2 avg11 2014 - 11
3 avg12 2014 - 12
4  avg1  2015 - 1
5  avg2  2015 - 2

you get

left_join(eastern[, c("date", "concat")], aggs, by = "concat")

        date    concat   avg
1 2014-10-17  2014 - 9  <NA>
2 2014-10-24  2014 - 9  <NA>
3 2014-10-31  2014 - 9  <NA>
4 2014-11-07 2014 - 10 avg10
5 2014-11-17 2014 - 10 avg10
6 2014-11-26 2014 - 10 avg10
7 2014-12-26 2014 - 11 avg11
8 2015-01-22 2014 - 12 avg12
9 2015-02-12  2015 - 1  avg1
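For completeness, here is a self-contained variant with numeric averages. It is a sketch under assumptions: the price column and its values are made up (the question does not show the price data), and it keys on the first day of the month via floor_date() instead of the pasted "year - month" string, which is a swap-in that avoids any mismatch in string formatting.

```r
library(dplyr)
library(lubridate)

# Hypothetical auction prices; only the dates come from the question.
eastern <- data.frame(
  date  = as.Date(c("2014-10-17", "2014-10-24", "2014-11-07",
                    "2014-11-17", "2014-12-26")),
  price = c(100, 110, 120, 140, 90)
)

# Monthly averages, keyed by the first day of each month.
monthly_agg <- eastern %>%
  group_by(month_start = floor_date(date, "month")) %>%
  summarise(avg = mean(price))

# Key each auction by the first day of the PREVIOUS month, then join.
eastern1 <- eastern %>%
  mutate(month_start = floor_date(date, "month") %m-% months(1)) %>%
  left_join(monthly_agg, by = "month_start") %>%
  select(date, price, prev_avg = avg)
```

With this sample, the October rows get NA (there is no September data), the November rows get the October average of 105, and the December row gets the November average of 130.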

Does that work for you?