High school graduate 72370
Some college but no degree 41683
Bachelors degree(BA AB BS) 29541
Children 21116
7th and 8th grade 12073
10th grade 11314
11th grade 10367
Masters degree(MA MS MEng MEd MSW MBA) 9751
9th grade 9262
Associates degree-occup /vocational 8026
Associates degree-academic program 6434
5th or 6th grade 4986
12th grade no diploma 3258
1st 2nd 3rd or 4th grade 2697
Prof school degree (MD DDS DVM LLB JD) 2534
Doctorate degree(PhD EdD) 1826
Less than 1st grade 1228
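These are all the values in the column, along with their counts...
I tried the following function to remove the parentheses and their contents from these values: 1) Bachelors degree(BA AB BS) 2) Masters degree(MA MS MEng MEd MSW MBA) 3) Prof school degree (MD DDS DVM LLB JD) 4) Doctorate degree(PhD EdD)
Here is my function: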
import re

def Clean_names(education_names):
    # Cut the value off at the first opening parenthesis, if there is one.
    match = re.search(r'\(.*', education_names)
    if match:
        return education_names[:match.start()]
    else:
        return education_names
After running this function and applying it to my column, I was able to get rid of the parentheses. This is the output:
High school graduate 72370
Some college but no degree 41683
Bachelors degree 29541
Children 21116
7th and 8th grade 12073
10th grade 11314
11th grade 10367
Masters degree 9751
9th grade 9262
Associates degree-occup /vocational 8026
Associates degree-academic program 6434
5th or 6th grade 4986
12th grade no diploma 3258
1st 2nd 3rd or 4th grade 2697
Prof school degree 2534
Doctorate degree 1826
Less than 1st grade 1228
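The printed counts above do not show whitespace, so it may be worth checking what the function returns for the four parenthesized values. The following is only a minimal sketch: `clean_names` is a lowercase copy of the function above, and `raw_values` is a hand-typed list taken from the first listing, not the real column.

```python
import re

def clean_names(education_name):
    # Same logic as Clean_names above: cut the value at the first "(".
    match = re.search(r'\(.*', education_name)
    return education_name[:match.start()] if match else education_name

# Hand-typed copies of the four parenthesized values from the first listing.
raw_values = [
    'Bachelors degree(BA AB BS)',
    'Masters degree(MA MS MEng MEd MSW MBA)',
    'Prof school degree (MD DDS DVM LLB JD)',
    'Doctorate degree(PhD EdD)',
]

cleaned = [clean_names(v) for v in raw_values]
for value in cleaned:
    # repr() makes leading and trailing whitespace visible.
    print(repr(value))

# Values that still carry surrounding whitespace after cleaning.
with_whitespace = [v for v in cleaned if v != v.strip()]
print(with_whitespace)
```

In the original data only "Prof school degree (MD DDS DVM LLB JD)" has a space before the opening parenthesis, so it is the only value that keeps a trailing space after the cut.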
But when I try to create bins from these values, I run into a problem...
Here is the code:
dataout2.replace({
    'High school graduate': 'high-school-graduate',
    'Some college but no degree': 'high-school-graduate',
    'Bachelors degree': 'undergraduate',
    'Children': 'children',
    '7th and 8th grade': 'children',
    '10th grade': 'high-school',
    '11th grade': 'high-school',
    'Masters degree': 'postgraduate',
    '9th grade': 'high-school',
    'Associates degree-occup /vocational': 'undergraduate',
    'Associates degree-academic program': 'undergraduate',
    '5th or 6th grade': 'children',
    '12th grade no diploma': 'high-school',
    '1st 2nd 3rd or 4th grade': 'children',
    'Prof school degree': 'postgraduate',
    'Doctorate degree': 'postgraduate',
    'Less than 1st grade': 'children',
}, inplace=True, regex=True)
The output it gives me is this:
high-school-graduate 114053
undergraduate 44001
children 42100
high-school 34201
postgraduate 11577
postgraduate 2534
I don't know why I'm getting two postgraduate categories... Can someone tell me where I messed up?
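One possible explanation, sketched below in plain Python rather than pandas: if a cleaned value still ends with a space, a regex-based replacement swaps only the matched substring and leaves that space behind, which produces a second label that looks identical when printed. The sketch assumes that `replace(..., regex=True)` substitutes the matched part of each string in the same way `re.sub` does. The `cleaned` list and both helper functions are hypothetical and exist only for illustration.

```python
import re
from collections import Counter

# Hypothetical cleaned values; 'Prof school degree ' keeps the space
# that preceded "(" in the raw data.
cleaned = ['Masters degree', 'Doctorate degree', 'Prof school degree ']

mapping = {
    'Masters degree': 'postgraduate',
    'Prof school degree': 'postgraduate',
    'Doctorate degree': 'postgraduate',
}

def regex_replace(value, mapping):
    # Substitute every matching substring; text outside the match survives.
    for pattern, label in mapping.items():
        value = re.sub(pattern, label, value)
    return value

def exact_replace(value, mapping):
    # Strip surrounding whitespace first, then look up the whole value.
    key = value.strip()
    return mapping.get(key, key)

regex_counts = Counter(regex_replace(v, mapping) for v in cleaned)
exact_counts = Counter(exact_replace(v, mapping) for v in cleaned)

print(regex_counts)  # two distinct keys: 'postgraduate' and 'postgraduate '
print(exact_counts)  # a single 'postgraduate' key
```

The counts in the output are consistent with this reading: 11577 equals 9751 + 1826 (Masters plus Doctorate), and the separate 2534 is the Prof school count. If this is indeed the cause, stripping whitespace from the column (for example with `.str.strip()`) before replacing, or dropping `regex=True` so that whole values are matched, would presumably merge the two rows.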