I have a DataFrame structured as follows:
Date ticker adj_close
0 2016-11-21 AAPL 111.730
1 2016-11-22 AAPL 111.800
2 2016-11-23 AAPL 111.230
3 2016-11-25 AAPL 111.790
4 2016-11-28 AAPL 111.570
...
8 2016-11-21 ACN 119.680
9 2016-11-22 ACN 119.480
10 2016-11-23 ACN 119.820
11 2016-11-25 ACN 120.740
...
If I compute the following expression:
TimeSeriesLogReturns = np.log(GetTimeSeriesLevels['adj_close']/GetTimeSeriesLevels['adj_close'].shift(1))
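At the moment the calculation runs over the whole column, so data from the two different tickers gets mixed, which should not happen. I would like the calculation to be done per ticker.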
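To make the problem concrete, here is a minimal sketch of what goes wrong at the point where one ticker ends and the next begins. The four-row frame is an illustrative subset of the data above, and the names `levels`, `naive` and `boundary` are made up for this example.

```python
import numpy as np
import pandas as pd

# Illustrative subset: the last two AAPL rows followed by the first two ACN rows.
levels = pd.DataFrame({
    "ticker": ["AAPL", "AAPL", "ACN", "ACN"],
    "adj_close": [111.79, 111.57, 119.68, 119.48],
})

# Same formula as in the question, applied to the whole column at once.
naive = np.log(levels["adj_close"] / levels["adj_close"].shift(1))

# Row 2 is the first ACN row. It should have no return at all (NaN),
# but shift(1) pairs it with the last AAPL price instead.
boundary = naive.iloc[2]
print(naive)
```

The value at the boundary row is the log of an ACN price divided by an AAPL price, which is meaningless as a return.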
Answer 0 (score: 1)
Try this:
In [91]: df['new'] = df.groupby('ticker')['adj_close'].apply(lambda x: x.div(x.shift(1)))
In [92]: df
Out[92]:
Date ticker adj_close new
0 2016-11-21 AAPL 111.73 NaN
1 2016-11-22 AAPL 111.80 1.000627
2 2016-11-23 AAPL 111.23 0.994902
3 2016-11-25 AAPL 111.79 1.005035
4 2016-11-21 ACN 119.68 NaN
5 2016-11-22 ACN 119.48 0.998329
6 2016-11-23 ACN 119.82 1.002846
7 2016-11-25 ACN 120.74 1.007678
In [93]: df['log'] = np.log(df.groupby('ticker')['adj_close'].apply(lambda x: x.div(x.shift(1))))
In [94]: df
Out[94]:
Date ticker adj_close new log
0 2016-11-21 AAPL 111.73 NaN NaN
1 2016-11-22 AAPL 111.80 1.000627 0.000626
2 2016-11-23 AAPL 111.23 0.994902 -0.005111
3 2016-11-25 AAPL 111.79 1.005035 0.005022
4 2016-11-21 ACN 119.68 NaN NaN
5 2016-11-22 ACN 119.48 0.998329 -0.001673
6 2016-11-23 ACN 119.82 1.002846 0.002842
7 2016-11-25 ACN 120.74 1.007678 0.007649
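As a possible variation on the above, the same per-ticker log return can be sketched with `groupby(...).shift`, which returns a Series on the original index and so assigns straight back to the frame. Depending on the pandas version, the result of `groupby(...).apply` may carry the group key in its index, in which case the direct assignment shown above may not align. The column name `log_return` and the sorting step are assumptions added for this example, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data matching the frame used in the answer above.
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2016-11-21", "2016-11-22", "2016-11-23", "2016-11-25"] * 2
    ),
    "ticker": ["AAPL"] * 4 + ["ACN"] * 4,
    "adj_close": [111.73, 111.80, 111.23, 111.79,
                  119.68, 119.48, 119.82, 120.74],
})

# Assumption: rows should be in date order within each ticker,
# so that shift(1) really means "previous trading day".
df = df.sort_values(["ticker", "Date"]).reset_index(drop=True)

# Shift within each ticker; the first row of every ticker becomes NaN.
prev_close = df.groupby("ticker")["adj_close"].shift(1)

# Log return = log(price today / previous price of the same ticker).
df["log_return"] = np.log(df["adj_close"] / prev_close)
print(df)
```

The first row of each ticker is NaN, so no return is computed across the boundary between AAPL and ACN.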