import numpy as np
import pandas as pd
exam_data = pd.DataFrame({
    'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily',
             'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
    'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
    'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
    'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no',
                'yes']})
exam_data.set_index([['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'],
'name'])
r1 = exam_data.replace('yes', 'true')
r2 = exam_data.replace('no', 'false')
r1
I want the result to be:
attempts name qualify score
a 1 Anastasia true 12.5
b 3 Dima false 9.0
c 2 Katherine true 16.5
d 3 James false NaN
e 2 Emily false 9.0
f 3 Michael true 20.0
g 1 Matthew true 14.5
h 1 Laura false NaN
i 2 Kevin false 8.0
j 1 Jonas true 19.0
Answer 0 (score: 1)
The simplest approach is to compare the column's values with 'yes', which gives a boolean column:
exam_data['qualify'] = exam_data['qualify'] == 'yes'
print(exam_data)
attempts name qualify score
0 1 Anastasia True 12.5
1 3 Dima False 9.0
2 2 Katherine True 16.5
3 3 James False NaN
4 2 Emily False 9.0
5 3 Michael True 20.0
6 1 Matthew True 14.5
7 1 Laura False NaN
8 2 Kevin False 8.0
9 1 Jonas True 19.0
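The comparison above produces booleans (True/False) and keeps the default integer index. If the lowercase strings 'true'/'false' and the letter index from the question are wanted instead, the following is a minimal sketch under stated assumptions: the frame `df` is a shortened, hypothetical three-row stand-in for exam_data, and the index labels are assigned directly rather than through set_index.

```python
import pandas as pd

# Hypothetical three-row stand-in for exam_data.
df = pd.DataFrame({
    'name': ['Anastasia', 'Dima', 'Katherine'],
    'qualify': ['yes', 'no', 'yes'],
})

# Assign the labels in place; set_index returns a new frame,
# so calling it without keeping the result changes nothing.
df.index = ['a', 'b', 'c']

# A single replace call with a dict handles both values at once.
df['qualify'] = df['qualify'].replace({'yes': 'true', 'no': 'false'})
print(df)
```

Doing both substitutions in one call avoids the situation in the question, where r1 and r2 each hold only one of the two replacements.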
If you want to use replace, values not listed in the dict are left unchanged:
exam_data['qualify'] = exam_data['qualify'].replace({'yes':True, 'no':False})
Or use map, where values not listed in the dict are replaced with NaN:
exam_data['qualify'] = exam_data['qualify'].map({'yes':True, 'no':False})
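The difference between the two only shows up when the column holds a value that the dict does not mention. A small sketch, assuming a hypothetical Series `s` with an extra 'maybe' entry that is not in the original data:

```python
import pandas as pd

# Hypothetical data: 'maybe' is not a key in the mapping.
s = pd.Series(['yes', 'no', 'maybe'], dtype=object)
mapping = {'yes': True, 'no': False}

replaced = s.replace(mapping)  # 'maybe' is kept as-is
mapped = s.map(mapping)        # 'maybe' becomes NaN

print(replaced.tolist())
print(mapped.tolist())
```

For the qualify column in the question, which contains only 'yes' and 'no', both calls should give the same values, so the choice matters mainly when unexpected entries are possible.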