Operating on DataFrames with different index frequencies

Time: 2014-07-14 18:50:58

Tags: python pandas dataframe

I have an hourly DataFrame, for example df1:

                  v1  v2
date
2015-01-01 0      20  25
2015-01-01 1      30  35
.
.
2015-02-01 0      45  55
2015-02-01 1      22  32 

I also have a monthly DataFrame, for example df2:

                   v1     
date
2015-01-01         10    
2015-02-01         20

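I want to subtract df2 from df1, matching each hourly row to the value for its month. The result should be a DataFrame with the same granularity as df1, for example: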

                  v1  v2
date
2015-01-01 0      10  15
2015-01-01 1      20  25
.
.
2015-02-01 0      25  35
2015-02-01 1       2  12 

Thanks in advance for your help.

1 answer:

Answer 0: (score: 1)

I merge the two DataFrames and then do the subtraction:

import pandas as pd
from io import StringIO

# create example data

d1 = '''date      v1  v2
2015-01-01 0      20  25
2015-01-01 1      30  35
2015-02-01 0      45  55
2015-02-01 1      22  32'''

d2 = '''date      v1
2015-01-01         10
2015-02-01         20'''

# columns in d1 are separated by two or more spaces, so "2015-01-01 0" stays one field
df1 = pd.read_csv(StringIO(d1), sep=r'\s{2,}', engine='python', index_col='date')
df1.index = pd.to_datetime(df1.index, format='%Y-%m-%d %H')

df2 = pd.read_csv(StringIO(d2), sep=r'\s+', index_col='date', parse_dates=True)

# start

df1['datetime'] = df1.index  # keep the hourly index as a column during merging

print('\n--- merge ---\n')

# join on the calendar date of each index; the key shows up as column key_0
df = pd.merge(df1, df2, left_on=df1.index.date, right_on=df2.index.date)

print(df)

print('\n--- subtract ---\n')

# after the merge, v1_x comes from df1 and v1_y from df2
df['v1_x'] = df['v1_x'] - df['v1_y']
df['v2'] = df['v2'] - df['v1_y']

print(df)

Result

--- merge ---

        key_0  v1_x  v2            datetime  v1_y
0  2015-01-01    20  25 2015-01-01 00:00:00    10
1  2015-01-01    30  35 2015-01-01 01:00:00    10
2  2015-02-01    45  55 2015-02-01 00:00:00    20
3  2015-02-01    22  32 2015-02-01 01:00:00    20

--- subtract ---

        key_0  v1_x  v2            datetime  v1_y
0  2015-01-01    10  15 2015-01-01 00:00:00    10
1  2015-01-01    20  25 2015-01-01 01:00:00    10
2  2015-02-01    25  35 2015-02-01 00:00:00    20
3  2015-02-01     2  12 2015-02-01 01:00:00    20
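
The merge above joins on the calendar date, so with an inner merge, hourly rows that fall on other days of the month would not find a match. A possible alternative, shown here only as a sketch, is to match on the month instead and keep df1's index intact. It assumes df2 has exactly one row per calendar month; the names `monthly`, `offset` and `result` are made up for this illustration and are not part of the original answer.

```python
import pandas as pd

# Hourly frame with the same values as df1 in the question
df1 = pd.DataFrame(
    {"v1": [20, 30, 45, 22], "v2": [25, 35, 55, 32]},
    index=pd.to_datetime(
        ["2015-01-01 00:00", "2015-01-01 01:00",
         "2015-02-01 00:00", "2015-02-01 01:00"]
    ).rename("date"),
)

# Monthly frame with the same values as df2 in the question
df2 = pd.DataFrame(
    {"v1": [10, 20]},
    index=pd.to_datetime(["2015-01-01", "2015-02-01"]).rename("date"),
)

# Key the monthly values by month (assumes one row per month in df2)
monthly = df2["v1"].copy()
monthly.index = monthly.index.to_period("M")

# Look up the monthly value for the month of every hourly row
offset = monthly.reindex(df1.index.to_period("M")).to_numpy()

# Subtract the per-row offset from every column; df1's index is unchanged
result = df1.sub(offset, axis=0)
print(result)
```

With this approach there are no `key_0`, `v1_x` or `v1_y` helper columns to clean up afterwards, and the result keeps the hourly `date` index of df1. If a month is missing from df2, the corresponding rows would come out as NaN rather than being dropped.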