Resampling daily data to weekly data

Time: 2021-01-20 02:55:11

Tags: python pandas pandas-resample

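There are many posts on this topic. I went through them but could not find an answer to my question:

I am working with a Pandas time series DataFrame. The data is at a daily frequency, and I aggregate it to a weekly frequency with the Pandas resample() function, as shown below.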

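daily_df = ...  # daily time series DataFrame

def aggregate(daily_df, frequency):
    # Group the rows into periods of the given frequency, keyed on the 'date' column
    weekly_df = daily_df.resample(frequency, on='date').agg({'open':'first','high':'max', 'low':'min','close':'last','volume':'sum'})
    # Turn the 'date' index back into a regular column
    weekly_df.reset_index(inplace=True)
    return weekly_df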

weekly_df = aggregate(daily_df, 'W-Fri')

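The problem I am running into is that for some weeks the time series only contains data for Monday through Thursday, and I don't know how to tell resample() to check for that and, if it is the case, end the week on Thursday rather than on Friday ('W-Fri').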

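1 Answer:

Answer 0: (score: 0)

resample() has no such option, but we can add a per-row flag column and count it during aggregation to find out how many days went into each resampled week.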

import yfinance as yf
daily_df = yf.download("AAPL", start="2020-11-01", end="2020-12-31")

def aggregate(daily_df, frequency):
    # Move the 'Date' index into a column without modifying the caller's DataFrame
    daily_df = daily_df.reset_index()
    # Flag column: one entry per trading day, counted per week below
    daily_df['days'] = 1
    weekly_df = daily_df.resample(frequency, on='Date').agg({'Open':'first','High':'max', 'Low':'min','Close':'last','Volume':'sum','days':'count'})
    return weekly_df

weekly_df = aggregate(daily_df, 'W-Fri')
weekly_df

                  Open        High         Low       Close     Volume  days
Date
2020-11-06  109.110001  119.620003  107.320000  118.690002  609571800     5
2020-11-13  120.500000  121.989998  114.129997  119.260002  589577900     5
2020-11-20  118.919998  120.989998  116.809998  117.339996  389493400     5
2020-11-27  117.180000  117.620003  112.589996  116.589996  365024000     4
2020-12-04  116.970001  123.779999  116.809998  122.250000  543809200     5
2020-12-11  122.309998  125.949997  120.150002  122.410004  452278700     5
2020-12-18  122.599998  129.580002  121.540001  126.660004  621866700     5
2020-12-25  125.019997  134.410004  123.449997  131.970001  433310200     4
2021-01-01  133.990005  138.789993  133.399994  133.720001  341985600     3
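
The days column shows which weeks are short, but the rows are still labeled with the Friday. One possible way to label each week with its last day that actually has data is to carry the date along as an ordinary column and take its maximum per week. The sketch below is not part of the original answer. It uses small synthetic data instead of yfinance, with the lowercase column names from the question, and the helper name aggregate_with_last_day and the column last_day are made up for illustration.

```python
import pandas as pd

# Synthetic daily data (assumption, for illustration only): two business
# weeks, where the second week has no Friday row (Mon 2020-11-23 .. Thu 2020-11-26).
dates = pd.bdate_range("2020-11-16", "2020-11-27")[:-1]
n = len(dates)
daily_df = pd.DataFrame({
    "date": dates,
    "open": [10.0 + i for i in range(n)],
    "high": [11.0 + i for i in range(n)],
    "low": [9.0 + i for i in range(n)],
    "close": [10.5 + i for i in range(n)],
    "volume": [100] * n,
})

def aggregate_with_last_day(daily_df, frequency):
    """Resample to weekly bars, labeling each week by its last available day."""
    df = daily_df.copy()
    # Copy the date into a regular column so it can be aggregated like any other
    df["last_day"] = df["date"]
    weekly = df.resample(frequency, on="date").agg({
        "open": "first", "high": "max", "low": "min",
        "close": "last", "volume": "sum", "last_day": "max",
    })
    # Weeks with no rows at all have last_day = NaT; drop them
    weekly = weekly.dropna(subset=["last_day"])
    # Replace the Friday labels with the last day that actually has data
    return weekly.set_index("last_day").rename_axis("date").reset_index()

weekly_df = aggregate_with_last_day(daily_df, "W-FRI")
print(weekly_df)
```

With this data the first week keeps its Friday label (2020-11-20), while the second week is labeled 2020-11-26, the Thursday on which its data ends. The weekly bins themselves are unchanged; only the label differs.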