使用索引将一个数据框的值映射到另一数据框列

时间:2019-09-23 06:50:36

标签: python pandas datetime

我有一个数据框df1,它具有三列(目标,季节性和季节性指数)。季节性模式每隔7点重复一次。季节性的最后一个索引是2。我有另一个数据框df2,该数据框预测有10行的列。现在,我想在df2中创建新列,该列将是df1的预测列和季节性列的总和。这里的要点是映射。我想使用季节性索引列和相应的季节性列将值添加到预测列。例如:第4个季节性指数的季节性值应添加到预测列的第一个元素中。用尽第6个索引值后,随着季节性在7点之后重复出现,相加应从零开始。

df1

                       Target   Seasonality Seasonality_index
Datetime            
2019-01-01 00:00:00     0.44    0.12    0
2019-01-01 01:00:00     0.44    0.06    1
2019-01-01 02:00:00     0.43    0.01    2
2019-01-01 03:00:00     0.43    -0.04   3
2019-01-01 04:00:00     0.43    -0.09   4
2019-01-01 05:00:00     0.43    -0.10   5
2019-01-01 06:00:00     0.42    -0.13   6
2019-01-01 07:00:00     0.42    0.12    0
2019-01-01 08:00:00     0.42    0.06    1
2019-01-01 09:00:00     0.43    0.01    2


df2

       Datetime         forecasted   Expected_output
    2019-01-01 10:00:00 7.21         7.21 -(-0.04) #4th element
    2019-01-01 11:00:00 7.20         7.20 -(-0.09) #5th element
    2019-01-01 12:00:00 7.19         7.19 -(-0.10) #6th element
    2019-01-01 13:00:00 7.18         7.18 -(-0.13) #7th element
    2019-01-01 14:00:00 7.19         7.19 -(0.12) #1st element
    2019-01-01 15:00:00 7.19         7.19 -(0.06) #2nd element
    2019-01-01 16:00:00 7.20         7.20 -(-0.10) #3rd element
    2019-01-01 17:00:00 7.20         7.20 -(-0.04) #4th element
    2019-01-01 18:00:00 7.21         7.21 -(-0.09) #5th element
    2019-01-01 19:00:00 7.20         7.20 -(-0.10) #6th element

1 个答案:

答案 0 :(得分:1)

我相信您可以使用:

repeat = df['Seasonality_index'].max() + 1

#first convert first group values to list
a = df1['Seasonality'].tolist()[:repeat]
print (a)
[0.12, 0.06, 0.01, -0.04, -0.09, -0.1, -0.13]

#reorder values by constant
first = df['Seasonality_index'].iat[-1] + 1
b= a[first:] + a[:first]
print (b)
[-0.04, -0.09, -0.1, -0.13, 0.12, 0.06, 0.01]

#repeat values by length of df2
arr = np.tile(b, int(len(df2) // repeat) + repeat)
#assign by length of df2
df2['test'] = arr[:len(df2)]
df2['Expected_output'] = df2['forecasted']  - arr[:len(df2)]

print (df2)
                     forecasted  Expected_output  test
Datetime                                              
2019-01-01 10:00:00        7.21             7.25 -0.04
2019-01-01 11:00:00        7.20             7.29 -0.09
2019-01-01 12:00:00        7.19             7.29 -0.10
2019-01-01 13:00:00        7.18             7.31 -0.13
2019-01-01 14:00:00        7.19             7.07  0.12
2019-01-01 15:00:00        7.19             7.13  0.06
2019-01-01 16:00:00        7.20             7.19  0.01
2019-01-01 17:00:00        7.20             7.24 -0.04
2019-01-01 18:00:00        7.21             7.30 -0.09
2019-01-01 19:00:00        7.20             7.30 -0.10