Question

I have a DataFrame:

import pandas as pd
import numpy as np
ycap = [2015, 2016, 2017]

df = pd.DataFrame({'a': np.repeat(ycap, 5),
                    'b': np.random.randn(15)})

       a         b
0   2015  0.436967
1   2015 -0.539453
2   2015 -0.450282
3   2015  0.907723
4   2015 -2.279188
5   2016  1.468736
6   2016 -0.169522
7   2016  0.003501
8   2016  0.182321
9   2016  0.647310
10  2017  0.679443
11  2017 -0.154405
12  2017 -0.197271
13  2017 -0.153552
14  2017  0.518803

I want to add a column c, as shown below:

     a         b     c
0   2015 -0.826946  2014
1   2015  0.275072  2013
2   2015  0.735353  2012
3   2015  1.391345  2011
4   2015  0.389524  2010
5   2016 -0.944750  2015
6   2016 -1.192546  2014
7   2016 -0.247521  2013
8   2016  0.521094  2012
9   2016  0.273950  2011
10  2017 -1.199278  2016
11  2017  0.839705  2015
12  2017  0.075951  2014
13  2017  0.663696  2013
14  2017  0.398995  2012

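I tried the following, but it subtracts 1 from every row, whereas the amount subtracted needs to increase by 1 for each row within a group. How can I do this? Thanks.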

gp = df.groupby('a')
df['c'] = gp['a'].apply(lambda x: x-1)

Answer 1

Subtract the per-group counter returned by cumcount from column a, then subtract 1 more (cumcount starts at 0):

df['c'] = df['a'] - df.groupby('a').cumcount() - 1
print (df)
       a         b     c
0   2015  0.285832  2014
1   2015 -0.223318  2013
2   2015  0.620920  2012
3   2015 -0.891164  2011
4   2015 -0.719840  2010
5   2016 -0.106774  2015
6   2016 -1.230357  2014
7   2016  0.747803  2013
8   2016 -0.002320  2012
9   2016  0.062715  2011
10  2017  0.805035  2016
11  2017 -0.385647  2015
12  2017 -0.457458  2014
13  2017 -1.589365  2013
14  2017  0.013825  2012

Details:

print (df.groupby('a').cumcount())
0     0
1     1
2     2
3     3
4     4
5     0
6     1
7     2
8     3
9     4
10    0
11    1
12    2
13    3
14    4
dtype: int64
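
As a small check of the idea above, here is a minimal sketch on a made-up frame whose groups are interleaved rather than contiguous (the values in `a` are an assumption chosen for illustration, not the question's data). It suggests that cumcount numbers rows per group in order of appearance, so the approach should not depend on the rows being sorted by a:

```python
import pandas as pd

# Hypothetical data: the two groups are interleaved on purpose.
df = pd.DataFrame({'a': [2015, 2016, 2015, 2016, 2015]})

# cumcount numbers the rows of each group 0, 1, 2, ... in order of appearance.
offset = df.groupby('a').cumcount()

# Subtracting the counter and then 1 yields a-1, a-2, ... within each group.
df['c'] = df['a'] - offset - 1

print(offset.tolist())   # [0, 0, 1, 1, 2]
print(df['c'].tolist())  # [2014, 2015, 2013, 2014, 2012]
```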

Answer 2

You can do it like this:

In [8]: df['c'] = df.groupby('a')['a'].transform(lambda x: x-np.arange(1, len(x)+1))

In [9]: df
Out[9]:
       a         b     c
0   2015  0.436967  2014
1   2015 -0.539453  2013
2   2015 -0.450282  2012
3   2015  0.907723  2011
4   2015 -2.279188  2010
5   2016  1.468736  2015
6   2016 -0.169522  2014
7   2016  0.003501  2013
8   2016  0.182321  2012
9   2016  0.647310  2011
10  2017  0.679443  2016
11  2017 -0.154405  2015
12  2017 -0.197271  2014
13  2017 -0.153552  2013
14  2017  0.518803  2012
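
To compare the two answers, here is a minimal sketch that rebuilds only column a from the question (column b is left out, since it does not affect c) and checks that both approaches produce the same values. The variable names `c_cumcount`, `c_transform` and `same` are introduced here for illustration only:

```python
import numpy as np
import pandas as pd

# Same grouping column as in the question: each year repeated five times.
df = pd.DataFrame({'a': np.repeat([2015, 2016, 2017], 5)})

# Answer 1: fully vectorized, no Python function called per group.
c_cumcount = df['a'] - df.groupby('a').cumcount() - 1

# Answer 2: transform passes each group's 'a' values to the lambda as a Series
# and expects a result of the same length back.
c_transform = df.groupby('a')['a'].transform(
    lambda x: x - np.arange(1, len(x) + 1)
)

# Compare plain values so that Series names do not matter.
same = c_cumcount.tolist() == c_transform.tolist()
print(same)
```

The cumcount version avoids calling a Python lambda once per group, so it is likely the better fit when there are many groups; the transform version may be easier to adapt if the per-group offsets follow some other pattern.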

Subtract an incrementing value within a groupby
