I can't figure out how to do this, but I want to go from this DataFrame:
Date Value
Jan-15 300
Feb-15 302
Mar-15 303
Apr-15 305
May-15 307
Jun-15 307
Jul-15 305
Aug-15 306
Sep-15 308
Oct-15 310
Nov-15 309
Dec-15 312
Jan-16 315
Feb-16 317
Mar-16 315
Apr-16 315
May-16 312
Jun-16 314
Jul-16 312
Aug-16 313
Sep-16 316
Oct-16 316
Nov-16 316
Dec-16 312
to this one, with the month-over-month (otm) and year-over-year (oty) changes computed:
Date Value otm oty
Jan-15 300 na na
Feb-15 302 2 na
Mar-15 303 1 na
Apr-15 305 2 na
May-15 307 2 na
Jun-15 307 0 na
Jul-15 305 -2 na
Aug-15 306 1 na
Sep-15 308 2 na
Oct-15 310 2 na
Nov-15 309 -1 na
Dec-15 312 3 na
Jan-16 315 3 15
Feb-16 317 2 15
Mar-16 315 -2 12
Apr-16 315 0 10
May-16 312 -3 5
Jun-16 314 2 7
Jul-16 312 -2 7
Aug-16 313 1 7
Sep-16 316 3 8
Oct-16 316 0 6
Nov-16 316 0 7
Dec-16 312 -4 0
So otm is computed from the value one row above, and oty from the value 12 rows above.
Answer 0 (score: 4)
I think you need diff, but it only works if no months are missing from the index:
df['otm'] = df.Value.diff()
df['oty'] = df.Value.diff(12)
print (df)
Date Value otm oty
0 Jan-15 300 NaN NaN
1 Feb-15 302 2.0 NaN
2 Mar-15 303 1.0 NaN
3 Apr-15 305 2.0 NaN
4 May-15 307 2.0 NaN
5 Jun-15 307 0.0 NaN
6 Jul-15 305 -2.0 NaN
7 Aug-15 306 1.0 NaN
8 Sep-15 308 2.0 NaN
9 Oct-15 310 2.0 NaN
10 Nov-15 309 -1.0 NaN
11 Dec-15 312 3.0 NaN
12 Jan-16 315 3.0 15.0
13 Feb-16 317 2.0 15.0
14 Mar-16 315 -2.0 12.0
15 Apr-16 315 0.0 10.0
16 May-16 312 -3.0 5.0
17 Jun-16 314 2.0 7.0
18 Jul-16 312 -2.0 7.0
19 Aug-16 313 1.0 7.0
20 Sep-16 316 3.0 8.0
21 Oct-16 316 0.0 6.0
22 Nov-16 316 0.0 7.0
23 Dec-16 312 -4.0 0.0
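For anyone who wants to reproduce the output above, here is a minimal self-contained sketch. It rebuilds the question's data by hand (the way the frame is constructed is an assumption, since the question only shows the printed table) and then applies the same two diff calls.

```python
import pandas as pd

# Values copied from the question; the construction itself is an assumption.
values = [300, 302, 303, 305, 307, 307, 305, 306, 308, 310, 309, 312,
          315, 317, 315, 315, 312, 314, 312, 313, 316, 316, 316, 312]

# 24 consecutive monthly labels starting at Jan-15 (month names follow the locale).
dates = pd.period_range("2015-01", periods=len(values), freq="M").strftime("%b-%y")
df = pd.DataFrame({"Date": dates, "Value": values})

# diff() compares each row with the previous row, diff(12) with the row 12 above.
df["otm"] = df["Value"].diff()
df["oty"] = df["Value"].diff(12)
print(df)
```

Note that diff is purely positional: it knows nothing about the Date column, which is why every month has to be present.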
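It is a bit more complicated if some data is missing:
- convert the dates with to_datetime + to_period
- set_index + reindex - if the first Jan or the last Dec value is missing, set the range bounds manually instead of using min and max
- change the index format with strftime
- reset_index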
df['Date'] = pd.to_datetime(df['Date'], format='%b-%y').dt.to_period('M')
df = df.set_index('Date')
df = df.reindex(pd.period_range(df.index.min(), df.index.max(), freq='M'))
df.index = df.index.strftime('%b-%y')
df = df.rename_axis('date').reset_index()
df['otm'] = df.Value.diff()
df['oty'] = df.Value.diff(12)
print (df)
date Value otm oty
0 Jan-15 300.0 NaN NaN
1 Feb-15 302.0 2.0 NaN
2 Mar-15 NaN NaN NaN
3 Apr-15 NaN NaN NaN
4 May-15 307.0 NaN NaN
5 Jun-15 307.0 0.0 NaN
6 Jul-15 305.0 -2.0 NaN
7 Aug-15 306.0 1.0 NaN
8 Sep-15 308.0 2.0 NaN
9 Oct-15 310.0 2.0 NaN
10 Nov-15 309.0 -1.0 NaN
11 Dec-15 312.0 3.0 NaN
12 Jan-16 315.0 3.0 15.0
13 Feb-16 317.0 2.0 15.0
14 Mar-16 315.0 -2.0 NaN
15 Apr-16 315.0 0.0 NaN
16 May-16 312.0 -3.0 5.0
17 Jun-16 314.0 2.0 7.0
18 Jul-16 312.0 -2.0 7.0
19 Aug-16 313.0 1.0 7.0
20 Sep-16 316.0 3.0 8.0
21 Oct-16 316.0 0.0 6.0
22 Nov-16 316.0 0.0 7.0
23 Dec-16 312.0 -4.0 0.0
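The remark about setting the bounds manually can be sketched on a smaller, hypothetical input (the first half of 2015 with Mar-15 and Apr-15 removed). The explicit "2015-01" and "2015-06" bounds are assumptions chosen for this illustration; they stand in for df.index.min() and df.index.max().

```python
import pandas as pd

# Hypothetical input: some of the question's rows, with Mar-15 and Apr-15 missing.
df = pd.DataFrame({
    "Date": ["Jan-15", "Feb-15", "May-15", "Jun-15"],
    "Value": [300, 302, 307, 307],
})

# Parse the labels into monthly periods and use them as the index.
periods = pd.to_datetime(df["Date"], format="%b-%y").dt.to_period("M")
s = df.set_index(periods)["Value"]

# Explicit bounds instead of min/max, so edge months are not silently dropped.
full = pd.period_range("2015-01", "2015-06", freq="M")
s = s.reindex(full)  # missing months become NaN rows

otm = s.diff()  # now each row really is compared with the previous month
print(otm)
```

After the reindex, a change is only reported where both the month and the month before it have a value.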
Answer 1 (score: 0)
df['otm'] = df['Value'] - df['Value'].shift(1)
df['oty'] = df['Value'] - df['Value'].shift(12)
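As far as I can tell this is the same computation as diff, written out by hand. A quick check on a few of the question's values (the Series below is just an illustration):

```python
import pandas as pd

s = pd.Series([300, 302, 303, 305, 307])

by_shift = s - s.shift(1)  # subtract the previous row explicitly
by_diff = s.diff()         # built-in equivalent

# equals() treats NaN values in the same position as equal.
same = by_shift.equals(by_diff)
print(same)
```

So it carries the same caveat: it is positional and assumes no month is missing.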
Answer 2 (score: 0)
A more correct solution is to shift by month:
#Create datetime column
df['DateTime'] = pd.to_datetime(df['Date'], format='%b-%y')
#Set it as index
df.set_index('DateTime', inplace=True)
#Then shift by month frequency:
df['otm'] = df['Value'] - df['Value'].shift(1, freq='MS')
df['oty'] = df['Value'] - df['Value'].shift(12, freq='MS')
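To illustrate why shifting by frequency behaves differently from a positional shift, here is a small sketch on a hypothetical series where Mar-15 is missing. The three-row input is made up for the example.

```python
import pandas as pd

# Hypothetical input: Mar-15 is missing.
idx = pd.to_datetime(["Jan-15", "Feb-15", "Apr-15"], format="%b-%y")
s = pd.Series([300, 302, 305], index=idx)

# Positional shift: Apr-15 is compared with Feb-15, the previous row.
positional = s - s.shift(1)

# Frequency shift: the index moves forward one month, so Apr-15 is compared
# with Mar-15, which does not exist, and the result is NaN.
by_month = s - s.shift(1, freq="MS")
by_month = by_month.reindex(s.index)  # keep only the original dates

print(positional)
print(by_month)
```

The subtraction aligns on the index, so it produces extra dates; assigning the result to a DataFrame column, as in the answer, or reindexing as above, keeps only the original rows.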