pct_change between 2 columns in Pandas, with a row offset

Time: 2019-05-08 13:44:54

Tags: python pandas

My dataframe looks like this:

               Date_Time   Open  Close
0    2004-05-10 16:00:00  12.88  12.54
1    2004-05-11 16:00:00  12.87  12.68
2    2004-05-12 16:00:00  12.79  12.88
3    2004-05-13 16:00:00  12.84  12.88
4    2004-05-14 16:00:00  12.64  12.88
5    2004-05-17 16:00:00  12.72  12.68

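What I need to do is calculate the change between one row's Close and the next row's Open (not the same row!). This should start at row 0, so that row 5 contains NaN. Like this (with placeholder values):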

               Date_Time   Open  Close  Overnight_change
0    2004-05-10 16:00:00  12.88  12.54  123
1    2004-05-11 16:00:00  12.87  12.68  123
2    2004-05-12 16:00:00  12.79  12.88  123
3    2004-05-13 16:00:00  12.84  12.88  123
4    2004-05-14 16:00:00  12.64  12.88  123
5    2004-05-17 16:00:00  12.72  12.68  NaN

I am trying:

overnight_change = (csv_data['Open'].loc[1:] - csv_data['Close']) / csv_data['Close']
df.assign(overnight_change=overnight_change)

However, this gives:

               Date_Time   Open  Close  Overnight_change
0    2004-05-10 16:00:00  12.88  12.54  NaN
1    2004-05-11 16:00:00  12.87  12.68  123
2    2004-05-12 16:00:00  12.79  12.88  123
3    2004-05-13 16:00:00  12.84  12.88  123
4    2004-05-14 16:00:00  12.64  12.88  123
5    2004-05-17 16:00:00  12.72  12.68  123

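How can I offset the assign operation? Or is there a better way to do this?

I have also tried calling csv_data['Open'].loc[1:].reset_index, but that gave:

ValueError: Wrong number of items passed 3776, placement implies 1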

2 Answers:

Answer 0: (score: 2)

Use Series.shift:

overnight_change = (df['Open'].shift(-1) - df['Close']) / df['Close']
df = df.assign(overnight_change=overnight_change)
print (df)
             Date_Time   Open  Close  overnight_change
0  2004-05-10 16:00:00  12.88  12.54          0.026316
1  2004-05-11 16:00:00  12.87  12.68          0.008675
2  2004-05-12 16:00:00  12.79  12.88         -0.003106
3  2004-05-13 16:00:00  12.84  12.88         -0.018634
4  2004-05-14 16:00:00  12.64  12.88         -0.012422
5  2004-05-17 16:00:00  12.72  12.68               NaN
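
To see why the slice in the question leaves NaN in row 0 while shift(-1) leaves it in row 5, here is a minimal, self-contained sketch. It assumes the six sample rows from the question; the names `aligned` and `shifted` are only illustrative. Pandas pairs Series values by index label during arithmetic, so slicing with .loc[1:] removes label 0 but does not move any value up.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Open": [12.88, 12.87, 12.79, 12.84, 12.64, 12.72],
    "Close": [12.54, 12.68, 12.88, 12.88, 12.88, 12.68],
})

# The slice keeps labels 1..5, so each Open is still subtracted from the
# Close with the same label; label 0 has no partner and becomes NaN.
aligned = (df["Open"].loc[1:] - df["Close"]) / df["Close"]

# shift(-1) keeps labels 0..5 but moves the values up one position,
# so each Close is paired with the next row's Open.
shifted = (df["Open"].shift(-1) - df["Close"]) / df["Close"]

print(pd.DataFrame({"aligned": aligned, "shifted": shifted}))
```

In this sketch, `aligned` ends up as a same-row change, which is probably not what the question intends, while `shifted` matches the computation in this answer.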

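Or, to measure this row's Open against the next row's Close instead (note that this is a different quantity from the one above):

# store the shifted Series so that shift runs only once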
c = df['Close'].shift(-1)
overnight_change = (df['Open'] - c) / c
df = df.assign(overnight_change=overnight_change)
print (df)
             Date_Time   Open  Close  overnight_change
0  2004-05-10 16:00:00  12.88  12.54          0.015773
1  2004-05-11 16:00:00  12.87  12.68         -0.000776
2  2004-05-12 16:00:00  12.79  12.88         -0.006988
3  2004-05-13 16:00:00  12.84  12.88         -0.003106
4  2004-05-14 16:00:00  12.64  12.88         -0.003155
5  2004-05-17 16:00:00  12.72  12.68               NaN

Answer 1: (score: 1)

You can do this, or shift the resulting Series:

df['overnight_change']=df['overnight_change'].shift(-1)
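
Whether shifting afterwards gives the intended number depends on what the column holds before the shift. The sketch below assumes the column was produced by the attempt in the question, where label alignment makes it a same-row change. It compares shifting that result with shifting Open before subtracting. The names `same_row`, `shifted_after` and `shifted_before` are illustrative only.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Open": [12.88, 12.87, 12.79, 12.84, 12.64, 12.72],
    "Close": [12.54, 12.68, 12.88, 12.88, 12.88, 12.68],
})

# Assumption: the column comes from the question's attempt, which label
# alignment turns into (Open - Close) / Close per row, with NaN in row 0.
same_row = (df["Open"].loc[1:] - df["Close"]) / df["Close"]

# Shifting the finished result moves the next row's same-row change up.
shifted_after = same_row.shift(-1)

# Shifting Open first pairs each Close with the next row's Open.
shifted_before = (df["Open"].shift(-1) - df["Close"]) / df["Close"]

print(pd.DataFrame({"shifted_after": shifted_after,
                    "shifted_before": shifted_before}))
```

Under that assumption both columns end with NaN in row 5, but the values differ wherever consecutive Close prices differ, so it may be worth comparing a row or two by hand before relying on either form.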