Question

Let's take a pandas DataFrame in Python as an example.

ID  Age  Bp
1   22   1
1   22   0
1   22   1
2   21   0
2   21   1
2   21   0

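In the DataFrame above, grouped by ID, the last n rows of each group (say n = 2) should be left out, and the Bp value in all remaining rows should be changed to 0. What I tried does not work.

The result should look like this.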

ID  Age  Bp
1   22   0
1   22   1
2   21   0
2   21   1
2   21   0

Answer 1

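Use GroupBy.cumcount with ascending=False to number the rows of each group counting from the end, then use numpy.where to set Bp to 0 everywhere except the last n rows: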

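import numpy as np
n = 2
# True for the last n rows of each ID group (cumcount counts from the group's end)
mask = df.groupby('ID').cumcount(ascending=False) < n
# keep Bp where mask is True, set it to 0 elsewhere
df['Bp'] = np.where(mask, df['Bp'], 0)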

Alternatives (either line alone is enough):

df.loc[~mask, 'Bp'] = 0
df['Bp'] = df['Bp'].where(mask, 0)

print(df)
   ID  Age  Bp
0   1   22   0
1   1   22   0
2   1   22   0
3   1   22   1
4   2   21   0
5   2   21   1
6   2   21   0
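
To reproduce the output above end to end, here is a minimal self-contained sketch. The input frame is an assumption: the Bp values in rows 0, 1 and 4 are not recoverable from the printed result (they are overwritten with 0), so they are filled in with hypothetical values here. The sketch also checks that the three variants agree.

```python
import numpy as np
import pandas as pd

# Hypothetical input: Bp in rows 0, 1 and 4 is an assumption,
# the other values are taken from the printed output.
df = pd.DataFrame({
    'ID':  [1, 1, 1, 1, 2, 2, 2],
    'Age': [22, 22, 22, 22, 21, 21, 21],
    'Bp':  [1, 1, 0, 1, 1, 1, 0],
})

n = 2
# Position of each row counted from the end of its ID group (0 = last row)
from_end = df.groupby('ID').cumcount(ascending=False)
mask = from_end < n  # True for the last n rows of each group

# Three ways to zero Bp outside the mask
via_np_where = np.where(mask, df['Bp'], 0)   # NumPy array
via_where = df['Bp'].where(mask, 0)          # Series
via_loc = df['Bp'].copy()
via_loc.loc[~mask] = 0                       # in-place on a copy

df['Bp'] = via_where
print(df)
```

Groups with fewer than n rows are kept unchanged, because every row of such a group has a from-the-end position below n.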

Details:

print(df.groupby('ID').cumcount(ascending=False))
0    3
1    2
2    1
3    0
4    2
5    1
6    0
dtype: int64

print(mask)
0    False
1    False
2     True
3     True
4    False
5     True
6     True
dtype: bool
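
Since the question is phrased in terms of each group's tail, the same rows can also be selected with GroupBy.tail. This is a different technique from the cumcount approach in the answer, sketched here under two assumptions: the index is unique (the selection is matched back by index label), and the input frame is hypothetical in rows 0, 1 and 4.

```python
import pandas as pd

# Hypothetical input: Bp in rows 0, 1 and 4 is an assumption.
df = pd.DataFrame({
    'ID':  [1, 1, 1, 1, 2, 2, 2],
    'Age': [22, 22, 22, 22, 21, 21, 21],
    'Bp':  [1, 1, 0, 1, 1, 1, 0],
})

n = 2
# Index labels of the last n rows of each ID group
keep_idx = df.groupby('ID').tail(n).index

# Zero Bp in every row that is not part of a group's tail
# (assumes the index has no duplicate labels)
df.loc[~df.index.isin(keep_idx), 'Bp'] = 0
print(df)
```

The cumcount mask does not depend on index labels at all, so it is the safer choice when the index may contain duplicates.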

Change column values in a pandas DataFrame, excluding the tail of each group
