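I want to select the last position for each id and check whether the variable fecha
is greater than 252, so that I can use the result in np.where. How can I do this?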
id clae6 year quarter fecha fecha_dif2 position
1 475230.0 2007 1 220 -1 1
1 475230.0 2007 2 221 -1 2
1 475230.0 2007 3 222 -1 3
1 475230.0 2007 4 223 -1 4
1 475230.0 2008 1 224 -1 5
2 475230.0 2007 1 220 -1 1
2 475230.0 2007 2 221 -1 2
2 475230.0 2007 3 222 -1 3
2 475230.0 2007 4 223 -1 4
2 475230.0 2008 1 224 -1 5
3 475230.0 2010 1 232 -1 1
3 475230.0 2010 2 233 -1 2
3 475230.0 2010 3 234 -1 3
3 475230.0 2010 4 235 -1 4
3 475230.0 2011 1 236 -1 5
3 475230.0 2011 2 237 -1 6
Answer 0 (score: 2)
Without groupby: keep only the last row of each id with drop_duplicates, then compare fecha to 252.
df.drop_duplicates(['id'],keep='last').fecha.gt(252)
Out[213]:
4 False
9 False
15 False
Name: fecha, dtype: bool
df['fechatest']=df.drop_duplicates(['id'],keep='last').fecha.gt(252)
df.fillna(False)
Out[216]:
id clae6 year quarter fecha fecha_dif2 position fechatest
0 1 475230.0 2007 1 220 -1 1 False
1 1 475230.0 2007 2 221 -1 2 False
2 1 475230.0 2007 3 222 -1 3 False
3 1 475230.0 2007 4 223 -1 4 False
4 1 475230.0 2008 1 224 -1 5 False
5 2 475230.0 2007 1 220 -1 1 False
6 2 475230.0 2007 2 221 -1 2 False
7 2 475230.0 2007 3 222 -1 3 False
8 2 475230.0 2007 4 223 -1 4 False
9 2 475230.0 2008 1 224 -1 5 False
10 3 475230.0 2010 1 232 -1 1 False
11 3 475230.0 2010 2 233 -1 2 False
12 3 475230.0 2010 3 234 -1 3 False
13 3 475230.0 2010 4 235 -1 4 False
14 3 475230.0 2011 1 236 -1 5 False
15 3 475230.0 2011 2 237 -1 6 False
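Since every fecha in the question's sample is below 252, the result above is all False. Here is a minimal sketch of the same drop_duplicates idea on a small invented frame where one id does cross 252. The fecha values are made up for illustration, and reindex is swapped in for fillna so that only the new column is filled, not NaN values elsewhere in the frame.

```python
import pandas as pd

# Hypothetical data: fecha values are invented so that id 2 ends above 252.
df = pd.DataFrame({
    "id":    [1, 1, 2, 2, 2],
    "fecha": [250, 251, 252, 253, 254],
})

# Keep the last row of each id (assumes rows are already ordered within each id).
last_rows = df.drop_duplicates("id", keep="last")

# Compare to 252, then expand to the full index; non-last rows become False.
df["fechatest"] = last_rows["fecha"].gt(252).reindex(df.index, fill_value=False)
print(df)
```

Only the final row of id 2 is flagged here, because the comparison is made on the last row per id and every other row is filled with False.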
Answer 1 (score: 0)
mask = df.groupby('id')['fecha'].tail(1) > 252
#same as
#mask = df.groupby('id')['fecha'].tail(1).gt(252)
print (mask)
4 False
9 False
15 False
Name: fecha, dtype: bool
If you need a mask with the same length as df, add reindex:
m = df.groupby('id')['fecha'].tail(1).gt(252).reindex(df.index, fill_value=False)
print (m)
0 False
1 False
2 False
3 False
4 False
5 False
6 False
7 False
8 False
9 False
10 False
11 False
12 False
13 False
14 False
15 False
Name: fecha, dtype: bool
df['new'] = np.where(m, 'yes', 'no')
print (df)
id clae6 year quarter fecha fecha_dif2 position new
0 1 475230.0 2007 1 220 -1 1 no
1 1 475230.0 2007 2 221 -1 2 no
2 1 475230.0 2007 3 222 -1 3 no
3 1 475230.0 2007 4 223 -1 4 no
4 1 475230.0 2008 1 224 -1 5 no
5 2 475230.0 2007 1 220 -1 1 no
6 2 475230.0 2007 2 221 -1 2 no
7 2 475230.0 2007 3 222 -1 3 no
8 2 475230.0 2007 4 223 -1 4 no
9 2 475230.0 2008 1 224 -1 5 no
10 3 475230.0 2010 1 232 -1 1 no
11 3 475230.0 2010 2 233 -1 2 no
12 3 475230.0 2010 3 234 -1 3 no
13 3 475230.0 2010 4 235 -1 4 no
14 3 475230.0 2011 1 236 -1 5 no
15 3 475230.0 2011 2 237 -1 6 no
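To see a 'yes' appear, the steps above can be run end to end on a small invented frame. This is a sketch under assumptions: the fecha values are made up, and the names all_rows and new_all are hypothetical. The last two lines show a possible variation, in case the intent was to flag every row of an id whose last fecha exceeds 252 rather than only the last row.

```python
import numpy as np
import pandas as pd

# Hypothetical data: id 1 ends at 253 (> 252), id 2 ends at 221.
df = pd.DataFrame({
    "id":    [1, 1, 1, 2, 2],
    "fecha": [251, 252, 253, 220, 221],
})

# Mask that is True only on the last row of an id whose fecha exceeds 252.
m = df.groupby("id")["fecha"].tail(1).gt(252).reindex(df.index, fill_value=False)
df["new"] = np.where(m, "yes", "no")

# Variation (assumption about intent): broadcast the last fecha of each id
# to all of its rows, so the whole group is flagged.
all_rows = df.groupby("id")["fecha"].transform("last").gt(252)
df["new_all"] = np.where(all_rows, "yes", "no")
print(df)
```

With tail(1) plus reindex only one row per id can be 'yes', while transform('last') marks the whole group; which one fits depends on how the flag is used afterwards.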