New DataFrame column based on the most recent value

Time: 2016-01-04 21:38:03

Tags: python pandas

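Each row in my DataFrame has a date, a value, and a categorical variable. I want to create a new column containing the previous value from the same category. For example: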

import numpy as np
import pandas as pd

df = pd.DataFrame({'Category':[3,1,2,1,2,3], 'Value':[10,20,30,40,50,60]})

df['Date'] = pd.date_range('2015-1-1', '2015-1-6')


>>> df
   Category  Value       Date
0         3     10 2015-01-01
1         1     20 2015-01-02
2         2     30 2015-01-03
3         1     40 2015-01-04
4         2     50 2015-01-05
5         3     60 2015-01-06

I would like the new column to look like this:

>>> df
   Category  Value       Date  PreviousValue
0         3     10 2015-01-01            NaN
1         1     20 2015-01-02            NaN
2         2     30 2015-01-03            NaN
3         1     40 2015-01-04             20
4         2     50 2015-01-05             30
5         3     60 2015-01-06             10

1 answer:

Answer 0 (score: 1)

df['Previous Value'] = df.groupby('Category')['Value'].shift()

   Category  Value       Date  Previous Value
0         3     10 2015-01-01             NaN
1         1     20 2015-01-02             NaN
2         2     30 2015-01-03             NaN
3         1     40 2015-01-04              20
4         2     50 2015-01-05              30
5         3     60 2015-01-06              10
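
For reference, the one-liner above can be run end to end as a small sketch. It assumes the column is spelled 'Category' and that "previous" means the preceding row of the same category in date order. The explicit sort by 'Date' is an added precaution that is not part of the original answer: shift() works on row order, so it only matters when the frame is not already chronological.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    'Category': [3, 1, 2, 1, 2, 3],
    'Value': [10, 20, 30, 40, 50, 60],
})
df['Date'] = pd.date_range('2015-01-01', '2015-01-06')

# Assumption: "previous" means earlier in time, so put the rows in date order first.
df = df.sort_values('Date')

# Within each category, shift the values down by one row.
# The first row of every category has no predecessor and becomes NaN.
df['Previous Value'] = df.groupby('Category')['Value'].shift()

print(df)
```

Because NaN cannot be stored in an integer column, the new column comes back as float64, so recent pandas versions typically print 20.0, 30.0 and 10.0 rather than 20, 30 and 10.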