pandas: keep the original columns when computing the mean

Date: 2019-07-16 18:32:57

Tags: python python-3.x pandas dataframe

My dataset df looks like the following. It is a minute-based dataset.

time, Open, High
2017-01-01 00:00:00, 1.2432, 1.1234
2017-01-01 00:01:00, 1.2432, 1.1234
2017-01-01 00:02:00, 1.2332, 1.1234
2017-01-01 00:03:00, 1.2132, 1.1234
...., ...., ....
2017-12-31 23:59:00, 1.2132, 1.1234

I did the following to compute the hourly mean from the minute dataset above:

df['time'] = pd.to_datetime(df['time'])
df.index = df['time']
df_mean = df.resample('H').mean()

Then I load df_mean and get the hourly values:

time,                 Open       High   
2017-01-01 00:00:00 1.051488    1.051500     
2017-01-01 01:00:00 1.051247    1.051275     
2017-01-01 02:00:00 1.051890    1.051957     
2017-01-01 03:00:00 1.051225    1.051290     
...., ...., ....
2017-12-31 23:00:00 1.051225    1.051290

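But I also want the original Open and High values.

I need help with two things:

  • Once the mean is computed, I want the columns labelled Open_Mean and High_Mean.
  • Since the hourly mean is indexed by hour (for example 2017-01-01 01:00:00), I also want to load the original Open and High values at that time.

Here:

  • The Open and High values for a given timestamp are the same as in the original dataset, whereas Open_Mean and High_Mean are the means computed for that hourly timestamp.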

The new df should look like this:

time,                 Open          High     Open_Mean   High_Mean    
2017-01-01 00:00:00   1.051488    1.051500   1.051500    1.051500 
2017-01-01 01:00:00   1.051247    1.051275   1.051500    1.051500
2017-01-01 02:00:00   1.051890    1.051957   1.051500    1.051500  
2017-01-01 03:00:00   1.051225    1.051290   1.051500    1.051500  
...., ...., ...., ...., ....
2017-12-31 23:00:00   1.051225    1.051290   1.051500    1.051500

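Once we have the correct df, I want to filter the new df so that it only loads data for specific times.

For example: load only the data in the time range 10 PM - 4 PM each day. Currently, it loads all times.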

1 answer:

Answer 0: (score: 1)

Use add_suffix to rename the columns:

df['time'] = pd.to_datetime(df['time'])

df_mean = df.set_index('time').resample('H').mean()

df_mean = df_mean.add_suffix('_Mean')
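The renaming step can be tried in isolation with a minimal sketch. The data below is synthetic (three hours of random one-minute rows, not the asker's dataset), and the `"60min"` frequency string is used here in place of `'H'` only because it is accepted across pandas versions; both are assumptions of this example.

```python
import numpy as np
import pandas as pd

# Hypothetical minute-level data: 3 hours of one-minute rows (not the asker's data).
rng = np.random.default_rng(0)
times = pd.date_range("2017-01-01", periods=180, freq="min")
df = pd.DataFrame({
    "time": times,
    "Open": rng.uniform(1.0, 2.0, size=len(times)),
    "High": rng.uniform(1.0, 3.0, size=len(times)),
})

# Hourly mean: each bin is labelled with the start of its hour.
df_mean = df.set_index("time").resample("60min").mean()

# add_suffix renames every column, so Open -> Open_Mean and High -> High_Mean.
df_mean = df_mean.add_suffix("_Mean")

print(df_mean.columns.tolist())
print(len(df_mean))  # one row per hour
```

Because `add_suffix` touches every column, it should be applied to the frame of means only, before it is combined with the original columns.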

Then merge with how='inner' to extract the hourly rows:

df.merge(df_mean, left_on='time', right_index=True, how='inner')

Output (head of the random data):

                    time      Open      High  Open_Mean  High_Mean
0    2017-01-01 00:00:00  1.219690  1.693049   1.519751   2.042550
60   2017-01-01 01:00:00  1.688490  1.404521   1.526833   2.115046
120  2017-01-01 02:00:00  1.015285  2.653544   1.533529   1.797564
180  2017-01-01 03:00:00  1.357672  2.299571   1.506012   2.043484
240  2017-01-01 04:00:00  1.293786  2.312414   1.489759   2.131644
300  2017-01-01 05:00:00  1.040048  2.791968   1.438353   1.816585
360  2017-01-01 06:00:00  1.225080  1.505802   1.473208   2.193237
420  2017-01-01 07:00:00  1.145402  3.217261   1.481710   1.914683
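The answer above does not cover the time-of-day filter from the question. One possible approach, sketched here on synthetic data, is `DataFrame.between_time`, which requires a DatetimeIndex. The names `hourly`, `start`, `end` and `filtered` are hypothetical, and the window takes the question's "10 PM - 4 PM" literally; if 10 AM was meant, `start` would be `"10:00"` instead.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two days of one-minute rows (not the asker's data).
rng = np.random.default_rng(0)
times = pd.date_range("2017-01-01", periods=2 * 24 * 60, freq="min")
df = pd.DataFrame({
    "time": times,
    "Open": rng.uniform(1.0, 2.0, size=len(times)),
    "High": rng.uniform(1.0, 3.0, size=len(times)),
})

# Same steps as the answer: hourly means, renamed, merged onto the on-the-hour rows.
df_mean = df.set_index("time").resample("60min").mean().add_suffix("_Mean")
hourly = df.merge(df_mean, left_on="time", right_index=True, how="inner")

# Assumed window, read literally from the question (10 PM to 4 PM).
# When start is later than end, between_time wraps around midnight.
start, end = "22:00", "16:00"

# between_time works on a DatetimeIndex, so index by 'time' first.
filtered = hourly.set_index("time").between_time(start, end)

print(filtered.head())
```

Both endpoints are included by default, so rows stamped exactly 22:00 and 16:00 are kept.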