My dataset df is shown below. It is a minute-level dataset:
time, Open, High
2017-01-01 00:00:00, 1.2432, 1.1234
2017-01-01 00:01:00, 1.2432, 1.1234
2017-01-01 00:02:00, 1.2332, 1.1234
2017-01-01 00:03:00, 1.2132, 1.1234
...., ...., ....
2017-12-31 23:59:00, 1.2132, 1.1234
I did the following to compute hourly means from the minute-level dataset above:
import pandas as pd

df['time'] = pd.to_datetime(df['time'])  # parse the time column as datetimes
df.index = df['time']                    # use it as the index so resample works
df_mean = df.resample('H').mean()        # hourly means of Open and High
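As a side note, the same hourly means can be computed without assigning the index manually by passing the column name to resample through its on= parameter; a minimal equivalent sketch, assuming pandas 0.19 or later:

import pandas as pd

df['time'] = pd.to_datetime(df['time'])
# 'on' tells resample which datetime column to group on, so no manual index is needed
df_mean = df.resample('H', on='time').mean()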
Then I load df_mean and get the hourly values:
time                 Open      High
2017-01-01 00:00:00  1.051488  1.051500
2017-01-01 01:00:00  1.051247  1.051275
2017-01-01 02:00:00  1.051890  1.051957
2017-01-01 03:00:00  1.051225  1.051290
....                 ....      ....
2017-12-31 23:00:00  1.051225  1.051290
But I also want the original Open and High values.
I need help with two things:

1. Open_Mean and High_Mean are given per hour based on time (e.g. 2017-01-01 01:00:00), and at each of those hourly timestamps I also want to load the original Open and High values. Here the Open and High values are the same as in the original dataset for that particular timestamp, while Open_Mean and High_Mean are the means computed for that hourly timestamp. The new df should look like this:
time                 Open      High      Open_Mean  High_Mean
2017-01-01 00:00:00  1.051488  1.051500  1.051500   1.051500
2017-01-01 01:00:00  1.051247  1.051275  1.051500   1.051500
2017-01-01 02:00:00  1.051890  1.051957  1.051500   1.051500
2017-01-01 03:00:00  1.051225  1.051290  1.051500   1.051500
....                 ....      ....      ....       ....
2017-12-31 23:00:00  1.051225  1.051290  1.051500   1.051500
2. Once we have the correct df, I want to filter the new df to load only data for specific times of day. For example: load data in the time range 10 PM - 4 PM every day. Currently it loads all times.
Answer 0 (score: 1)
Use add_suffix to rename the mean columns:
df['time'] = pd.to_datetime(df['time'])
df_mean = df.set_index('time').resample('H').mean()  # hourly means
df_mean = df_mean.add_suffix('_Mean')                # Open -> Open_Mean, High -> High_Mean
and merge with how='inner' to pull out the hourly rows:
df.merge(df_mean, left_on='time', right_index=True, how='inner')
Output (first rows, with random data):
time Open High Open_Mean High_Mean
0 2017-01-01 00:00:00 1.219690 1.693049 1.519751 2.042550
60 2017-01-01 01:00:00 1.688490 1.404521 1.526833 2.115046
120 2017-01-01 02:00:00 1.015285 2.653544 1.533529 1.797564
180 2017-01-01 03:00:00 1.357672 2.299571 1.506012 2.043484
240 2017-01-01 04:00:00 1.293786 2.312414 1.489759 2.131644
300 2017-01-01 05:00:00 1.040048 2.791968 1.438353 1.816585
360 2017-01-01 06:00:00 1.225080 1.505802 1.473208 2.193237
420 2017-01-01 07:00:00 1.145402 3.217261 1.481710 1.914683
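The second part of the question (keeping only a daily time window such as 10 PM - 4 PM) is not covered above. A minimal sketch using DataFrame.between_time, with df_merged as an illustrative name for the merged result:

# keep the merged result and index it by time so between_time can be used
df_merged = df.merge(df_mean, left_on='time', right_index=True, how='inner')
df_merged = df_merged.set_index('time')

# between_time keeps rows whose time of day falls inside the window; a start
# later than the end wraps around midnight (here 22:00 through to 16:00)
df_window = df_merged.between_time('22:00', '16:00')

If 4 AM rather than 4 PM was meant, only the end time changes, e.g. between_time('22:00', '04:00').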