I have a CSV file:
Row,employee_region,employee_branch,NAME,employee_email,employee_mobile_number,employee_nationality,employee_salary,employee_DOB
1,x,a,emp1,null,9986754352,cc,25000,10/8/1982
2,x,b,emp2,emp2@bb.com,9986754352,dc,55000,05/06/1987
3,y,c,emp3,emp3@bb.com,9886756352,hc,85000,01/01/1980
The actual file has more than 1000 records.
Based on the salary, I added a position column:
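import numpy as np

# data_df is the DataFrame loaded from the CSV above; the salary column is named employee_salary
normal = data_df['employee_salary'] <= 25000
experienced = (data_df['employee_salary'] > 25000) & (data_df['employee_salary'] <= 50000)
# anything that is neither normal nor experienced is labeled manager
data_df['position'] = np.where(normal, 'normal', np.where(experienced, 'experienced', 'manager'))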
The three positions, by salary, are:
normal: salary <= 25000
experienced: salary > 25000 and salary <= 50000
manager: salary > 50000
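The same three salary bands could also be expressed with pd.cut instead of nested np.where calls. This is only a sketch: the small data_df below is a hypothetical stand-in built from the three sample salaries, and the column name employee_salary is taken from the CSV header.

```python
import pandas as pd

# Hypothetical stand-in for the real data: the three sample salaries from the CSV
data_df = pd.DataFrame({'employee_salary': [25000, 55000, 85000]})

# right=True (the default) makes every bin include its upper edge,
# so 25000 lands in 'normal' and 50000 would land in 'experienced'
data_df['position'] = pd.cut(
    data_df['employee_salary'],
    bins=[float('-inf'), 25000, 50000, float('inf')],
    labels=['normal', 'experienced', 'manager'],
)

print(data_df)
```

pd.cut returns a categorical column, which keeps the band order (normal, experienced, manager) when the column is later grouped or sorted.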
I want to use pandas to get the result in the following form:
employee_region employee_branch count_employee_email count_employee_mobile count_employee_DOB %count_employee_email %count_employee_mobile %count_employee_dob count_normal_employee count_experienced_employee count_manager
x b 1 1 1 count(tuple)/totalcount(column) 0 1 0
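One possible way to build such a table is a groupby for the non-null counts plus a crosstab for the position counts. This is a sketch under several assumptions: the file is comma-separated, the literal text null marks a missing value, and count(tuple)/totalcount(column) means the group's non-null count divided by the column's total non-null count, shown here as a percentage. The inline csv_text only stands in for the real file.

```python
import io

import numpy as np
import pandas as pd

# Stand-in for the real file: the three sample rows, assumed comma-separated
csv_text = """Row,employee_region,employee_branch,NAME,employee_email,employee_mobile_number,employee_nationality,employee_salary,employee_DOB
1,x,a,emp1,null,9986754352,cc,25000,10/8/1982
2,x,b,emp2,emp2@bb.com,9986754352,dc,55000,05/06/1987
3,y,c,emp3,emp3@bb.com,9886756352,hc,85000,01/01/1980
"""
# Treat the literal text 'null' as a missing value
data_df = pd.read_csv(io.StringIO(csv_text), na_values=['null'])

# Salary bands as described in the question
salary = data_df['employee_salary']
data_df['position'] = np.where(
    salary <= 25000, 'normal',
    np.where(salary <= 50000, 'experienced', 'manager'),
)

keys = ['employee_region', 'employee_branch']

# 'count' skips missing values, so these are non-null counts per group
counts = data_df.groupby(keys).agg(
    count_employee_email=('employee_email', 'count'),
    count_employee_mobile=('employee_mobile_number', 'count'),
    count_employee_DOB=('employee_DOB', 'count'),
)

# Assumed meaning of the % columns: group count / total non-null count, times 100
pct = counts.div(counts.sum()).mul(100)
pct.columns = ['%' + name for name in counts.columns]

# One column per position; reindex guarantees all three exist even if a band is empty
positions = (
    pd.crosstab([data_df[k] for k in keys], data_df['position'])
    .reindex(columns=['normal', 'experienced', 'manager'], fill_value=0)
    .rename(columns={
        'normal': 'count_normal_employee',
        'experienced': 'count_experienced_employee',
        'manager': 'count_manager',
    })
)

result = pd.concat([counts, pct, positions], axis=1).reset_index()
print(result)
```

With the thresholds stated above, emp2's salary of 55000 puts the x/b group under count_manager. The percentage columns reuse the count column names with a % prefix, so the last one comes out as %count_employee_DOB.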