dataset['Age'].value_counts()
Output I get:
36-55-MATURED VOTER 492
24-35-YOUNG VOTER 341
56 Above-EXPERIENCED VOTER 182
18-23-NEW VOTER 77
70 5
60 5
68 4
65 3
63 3
62 3
75 3
72 1
69 1
67 1
80 1
73 1
66 1
Name: Age, dtype: int64
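Here dataset['Age'] is a column of the DataFrame dataset. I am trying to create a new column in the same DataFrame, named dataset['Age in Num'], in which all the original values are grouped into just 4 categories: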
18-23-NEW VOTER as 18
24-35-YOUNG VOTER as 24
36-55-MATURED VOTER as 36
the remaining values as 56
I used the following code, but it did not work:
for Age in dataset['Age']:
    if Age == '24-35-YOUNG VOTER':
        dataset['Age in Num'] = 24
    elif Age == '36-55-MATURED VOTER':
        dataset['Age in Num'] = 36
    elif Age == '18-23-NEW VOTER':
        dataset['Age in Num'] = 18
    else:
        dataset['Age in Num'] = 56
Then, when I typed dataset['Age in Num'], I got this:
0 36
1 36
2 36
3 36
4 36
5 36
6 36
7 36
8 36
9 36
10 36
11 36
12 36
13 36
14 36
All the values are 36. Thank you for your help.
Answer 0 (score: 0)
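This is what you are looking for. In your loop, each assignment such as dataset['Age in Num'] = 24 writes that single value to the entire column, so after the loop every row holds the value chosen for the last element of dataset['Age']. Map each value individually instead: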
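def test(Age):
    # Return the numeric code for one Age value; anything unlisted becomes 56.
    if Age == '24-35-YOUNG VOTER':
        return 24
    elif Age == '36-55-MATURED VOTER':
        return 36
    elif Age == '18-23-NEW VOTER':
        return 18
    else:
        return 56

# apply() calls test once per element and builds the new column row by row.
df['Age in Num'] = df['Age'].apply(test)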
Output (example):
                   Age  Age in Num
0                    1          56
1                   32          56
2    24-35-YOUNG VOTER          24
3    24-35-YOUNG VOTER          24
4  36-55-MATURED VOTER          36
5      18-23-NEW VOTER          18
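As an alternative sketch, separate from the function-based answer above, the same grouping can probably be written without an explicit function: look each label up in a dict with Series.map, then fill everything that did not match with 56. The sample rows and the name age_codes below are made up for illustration and are not taken from the original dataset.

```python
import pandas as pd

# Hypothetical sample rows standing in for the question's `dataset`.
dataset = pd.DataFrame({
    "Age": [
        "36-55-MATURED VOTER",
        "24-35-YOUNG VOTER",
        "18-23-NEW VOTER",
        "56 Above-EXPERIENCED VOTER",
        70,
        60,
    ],
})

# Labels with an explicit numeric code (assumed name: age_codes).
age_codes = {
    "18-23-NEW VOTER": 18,
    "24-35-YOUNG VOTER": 24,
    "36-55-MATURED VOTER": 36,
}

# map() yields NaN for values missing from the dict;
# fillna(56) supplies the default, and astype(int) restores integer dtype.
dataset["Age in Num"] = dataset["Age"].map(age_codes).fillna(56).astype(int)

print(dataset)
```

With a dict lookup, adding or changing a category means editing one entry rather than another elif branch, and the default of 56 is stated in exactly one place.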