I have a DataFrame df with two columns, gender and score.
| gender | score |
|--------|-------|
| male   | 34    |
| female | 34    |
| male   | 34    |
| female | 34    |
| male   | 34    |
I want to set the score to 0 for males (gender == 'male') in rows 3 through 5, giving this expected output:
| gender | score |
|--------|-------|
| male   | 34    |
| female | 34    |
| male   | 0     |
| female | 34    |
| male   | 0     |
How can I combine iloc with this condition?
Answer 0: (score: 1)
You can do this with two masks (conditions), which keeps the code readable and the intent clear.
m1 = (df.gender == 'male')
m2 = (df.gender.duplicated())
df.loc[m1&m2, 'score'] = 0
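To see why the two masks select the right rows, here is a minimal sketch that rebuilds the question's data and prints the combined mask. It assumes the sample frame from the question; the name `combined` is introduced only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'gender': ['male', 'female', 'male', 'female', 'male'],
    'score': 34,
})

m1 = df.gender == 'male'       # True for every male row
m2 = df.gender.duplicated()    # True for each repeat of a value already seen
combined = m1 & m2             # male rows, excluding the first male row

print(combined.tolist())       # [False, False, True, False, True]
```

Note that this picks "every male row after the first" rather than a fixed row range, which happens to match rows 3 through 5 for this data.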
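Alternatively, drop the first True value from the nonzero mask (requires import numpy as np). This should be faster. Note that np.nonzero returns integer positions, so passing them to df.loc relies on the default RangeIndex, where positions and labels coincide.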
m = np.nonzero(df.gender=='male')[0][1:]
df.loc[m, 'score'] = 0
Full example:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'gender': ['male','female','male','female','male'],
'score': 34
})
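# Approach 1: two masks (computed here for reference, not applied below)
m1 = (df.gender == 'male')
m2 = (df.gender.duplicated())
# Approach 2: positions of the male rows, skipping the first one
m = np.nonzero(df.gender=='male')[0][1:]
df.loc[m, 'score'] = 0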
print(df)
Output:
   gender  score
0    male     34
1  female     34
2    male      0
3  female     34
4    male      0
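If the frame does not use the default 0..n-1 index, the positions returned by NumPy no longer match the labels that df.loc expects. A possible variant, sketched here under the assumption of a hypothetical string index ('a' to 'e'), writes through iloc instead so that positions are used consistently:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'gender': ['male', 'female', 'male', 'female', 'male'], 'score': 34},
    index=list('abcde'),  # hypothetical non-default labels, for illustration
)

# Integer positions of the male rows, with the first one dropped
pos = np.flatnonzero((df['gender'] == 'male').to_numpy())[1:]

# iloc is purely positional, so the column is given by position as well
df.iloc[pos, df.columns.get_loc('score')] = 0

print(df['score'].tolist())  # [34, 34, 0, 34, 0]
```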
Answer 1: (score: 0)
I think you need:
m=df.loc[2:5,:].loc[df['gender']=='male']
df.loc[m.index,'score']=0
print(df)
   gender  score
0    male     34
1  female     34
2    male      0
3  female     34
4    male      0
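Since the question asks about iloc specifically, the same idea can be sketched with a positional slice: take rows 3 through 5 with iloc[2:5], filter that window by the condition, and write back through the resulting labels. This is only a sketch; it assumes the index labels are unique, and the names `window` and `idx` are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'gender': ['male', 'female', 'male', 'female', 'male'],
    'score': 34,
})

window = df.iloc[2:5]                             # rows 3 to 5 by position
idx = window[window['gender'] == 'male'].index    # labels of the males in that window
df.loc[idx, 'score'] = 0                          # assign through labels on the original frame

print(df['score'].tolist())  # [34, 34, 0, 34, 0]
```

Assigning through df.loc on the original frame, rather than on the slice, avoids writing to a temporary copy.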