Calculating daily returns with a Pandas DataFrame

Time: 2013-11-15 12:06:31

Tags: python python-3.x pandas

Here is my Pandas DataFrame:

prices = pandas.DataFrame([1035.23, 1032.47, 1011.78, 1010.59, 1016.03, 1007.95, 
              1022.75, 1021.52, 1026.11, 1027.04, 1030.58, 1030.42,
              1036.24, 1015.00, 1015.20])

Here is my daily_return function:

def daily_return(prices):
    return prices[:-1] / prices[1:] - 1

Here is the output of this function:

0    NaN
1      0
2      0
3      0
4      0
5      0
6      0
7      0
8      0
9      0
10     0
11     0
12     0
13     0
14   NaN

Why am I getting this output?

3 answers:

Answer 0 (score: 44)

Why not just use the very convenient pct_change method that pandas provides by default:

import pandas as pd

prices = pd.DataFrame([1035.23, 1032.47, 1011.78, 1010.59, 1016.03, 1007.95, 
          1022.75, 1021.52, 1026.11, 1027.04, 1030.58, 1030.42,
          1036.24, 1015.00, 1015.20])

daily_return = prices.pct_change(1) # 1 for ONE DAY lookback
monthly_return = prices.pct_change(21) # 21 for ONE MONTH lookback
annual_return = prices.pct_change(252) # 252 for ONE YEAR lookback

The original prices:

print(prices)
          0
0   1035.23
1   1032.47
2   1011.78
3   1010.59
4   1016.03
5   1007.95
6   1022.75
7   1021.52
8   1026.11
9   1027.04
10  1030.58
11  1030.42
12  1036.24
13  1015.00
14  1015.20

Daily returns, prices.pct_change(1):

print(prices.pct_change(1))
           0
0        NaN
1  -0.002666
2  -0.020039
3  -0.001176
4   0.005383
5  -0.007953
6   0.014683
7  -0.001203
8   0.004493
9   0.000906
10  0.003447
11 -0.000155
12  0.005648
13 -0.020497
14  0.000197
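
For reference, pct_change(1) divides each row's price by the previous row's price and subtracts one. A minimal sketch that checks this against a manual NumPy calculation (the shortened price list and the variable names are chosen only for illustration):

```python
import numpy as np
import pandas as pd

# First five prices from the question, as a single-column DataFrame.
prices = pd.DataFrame([1035.23, 1032.47, 1011.78, 1010.59, 1016.03])

# Built-in: percentage change relative to the previous row.
daily_return = prices.pct_change(1)

# Manual equivalent: (p[t] - p[t-1]) / p[t-1] on the raw values.
values = prices[0].to_numpy()
manual = np.diff(values) / values[:-1]

# The first row has no previous price, so it is NaN and is skipped here.
print(np.allclose(daily_return[0].to_numpy()[1:], manual))
```

The same call with a larger argument, such as pct_change(21), compares each row with the row 21 positions earlier, so it only produces values once the frame has more rows than the lookback.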

Answer 1 (score: 19)

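Because arithmetic between two DataFrames aligns on the index, each row is divided by the row with the same label, not by its neighbour. To avoid that, you can convert one of the DataFrames to an array: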

prices[:-1].values / prices[1:] - 1

prices[:-1] / prices[1:].values - 1

Which of the two you use depends on the index you want the result to have.
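
To see why the original function returned NaN and zeros, it helps to look at the row labels of the two slices. A small sketch, using a shortened price list chosen only for illustration:

```python
import pandas as pd

prices = pd.DataFrame([1035.23, 1032.47, 1011.78, 1010.59])

head = prices[:-1]   # row labels 0, 1, 2
tail = prices[1:]    # row labels 1, 2, 3

# DataFrame / DataFrame pairs rows by label, not by position:
# labels 1 and 2 are divided by themselves (ratio 1, so 0 after "- 1"),
# labels 0 and 3 exist on one side only and become NaN.
aligned = head / tail - 1

# Dropping the labels on one side forces a positional division.
# The result keeps the labels of the operand that is still a DataFrame.
by_tail_index = head.values / tail - 1   # labels 1, 2, 3
by_head_index = head / tail.values - 1   # labels 0, 1, 2

print(aligned)
print(by_tail_index)
print(by_head_index)
```

Both positional variants compute the ratio from the question, earlier price over later price. Swap the operands if you want the change relative to the earlier price.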

Or use the shift() method:

prices.shift(1) / prices - 1

prices / prices.shift(1) - 1
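
The two shift() expressions differ in which price ends up in the denominator. A short sketch, on a shortened price list used only for illustration, that compares both:

```python
import pandas as pd

prices = pd.DataFrame([1035.23, 1032.47, 1011.78, 1010.59])

# shift(1) moves every value down one row, so row t holds the price of row t-1.
previous = prices.shift(1)

# Previous price over current price: the ratio used in the question.
question_ratio = previous / prices - 1

# Current price over previous price: change relative to the earlier price,
# which is the same quantity pct_change(1) computes.
relative_to_previous = prices / previous - 1

print(question_ratio)
print(relative_to_previous)
```

In both cases the first row is NaN, because shift(1) leaves no previous price for it.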

Answer 2 (score: 0)

Just a small addition to @YaOzl's answer, in case anyone reads this. If your return data is a panel spreadsheet containing several stocks:

>>> prices = pandas.DataFrame({
...     "StkCode": ["StockA"] * 5 + ["StockB"] * 5 + ["StockC"] * 5,
...     "Price": [1035.23, 1032.47, 1011.78, 1010.59, 1016.03,
...               1007.95, 1022.75, 1021.52, 1026.11, 1027.04,
...               1030.58, 1030.42, 1036.24, 1015.00, 1015.20],
... })

which gives you:

      Price StkCode
0   1035.23  StockA
1   1032.47  StockA
2   1011.78  StockA
3   1010.59  StockA
4   1016.03  StockA
5   1007.95  StockB
6   1022.75  StockB
7   1021.52  StockB
8   1026.11  StockB
9   1027.04  StockB
10  1030.58  StockC
11  1030.42  StockC
12  1036.24  StockC
13  1015.00  StockC
14  1015.20  StockC

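Then you can simply use .pct_change(k) together with .groupby("StkCode"). It is also far faster than using an iterator... (I tried it on my dataset and cut the processing time from 10 hours to 20 seconds!)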

>>> prices["Return"] = prices.groupby("StkCode")["Price"].pct_change(1)

which gives you:

      Price StkCode    Return
0   1035.23  StockA       NaN
1   1032.47  StockA -0.002666
2   1011.78  StockA -0.020039
3   1010.59  StockA -0.001176
4   1016.03  StockA  0.005383
5   1007.95  StockB       NaN
6   1022.75  StockB  0.014683
7   1021.52  StockB -0.001203
8   1026.11  StockB  0.004493
9   1027.04  StockB  0.000906
10  1030.58  StockC       NaN
11  1030.42  StockC -0.000155
12  1036.24  StockC  0.005648
13  1015.00  StockC -0.020497
14  1015.20  StockC  0.000197
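
As a rough illustration of why the grouping matters, the sketch below compares the grouped call with an explicit loop over the stocks and with an ungrouped call. The six-row frame and the variable names are made up for the example. It only aims to show that the first row of each stock is NaN rather than a return computed across two different stocks.

```python
import pandas as pd

prices = pd.DataFrame({
    "StkCode": ["StockA"] * 3 + ["StockB"] * 3,
    "Price": [1035.23, 1032.47, 1011.78, 1007.95, 1022.75, 1021.52],
})

# Vectorised: pct_change restarts at the first row of every stock.
prices["Return"] = prices.groupby("StkCode")["Price"].pct_change(1)

# Explicit loop over the groups, for comparison only.
pieces = [group["Price"].pct_change(1) for _, group in prices.groupby("StkCode")]
looped = pd.concat(pieces).sort_index()

# Without grouping, row 3 would mix StockA's last price with StockB's first.
ungrouped = prices["Price"].pct_change(1)

print(prices)
print(looped.equals(prices["Return"]))
print(ungrouped)
```

This assumes the rows of each stock are already in chronological order. If they are not, sort by stock and date before calling pct_change.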