Given:
import pandas as pd

applications = pd.DataFrame({
    'application_id': [1, 2, 3, 4, 5],
    'date': ['2015-01-05', '2015-01-06', '2015-01-07', '2015-01-08', '2015-01-09'],
    'client_employer': ['company A', 'company B', 'company C', 'company A', 'company B'],
    'client_name': ['Bill', 'John', 'Steve', 'Bill', 'Alex']})
Table:
         date  client_employer  client_name
0  2015-01-05        company A         John
1  2015-01-06        company B         Bill
2  2015-01-07        company B         Bill
3  2015-01-08        company A        Sarah
4  2015-01-09        company B         Alex
5  2015-01-10        company B        Brian
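For each row, how many different people from the same employer have we had in the past? The solution should not use loops.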
Desired output:
         date  client_employer  client_name  employers_count
0  2015-01-05        company A         John                0
1  2015-01-06        company B         Bill                0
2  2015-01-07        company B         Bill                0
3  2015-01-08        company A        Sarah                1
4  2015-01-09        company B         Alex                1
5  2015-01-10        company B        Brian                2
import numpy as np

applications = pd.DataFrame({
    'application_id': [1, 2, 3, 4, 5, 6],
    'date': ['2015-01-05', '2015-01-06', '2015-01-07', '2015-01-08', '2015-01-09', '2015-01-10'],
    'client_employer': ['company B', 'company B', 'company B', 'company B', 'company B', 'company B'],
    'client_name': ['Bill', 'John', 'Steve', 'Bill', 'Alex', 'Bill'],
    'cnt_desired': [0, 1, 2, 2, 3, 3]})

# Map each name to the index of its first appearance within its employer group.
emp_count = applications.groupby(['client_employer'])['client_name'].transform(
    lambda x: x.map(dict(zip(x.unique(), np.arange(len(x.unique()))))))
applications['cnt'] = emp_count
   application_id        date  client_employer  client_name  cnt_desired  cnt
0               1  2015-01-05        company B         Bill            0    0
1               2  2015-01-06        company B         John            1    1
2               3  2015-01-07        company B        Steve            2    2
3               4  2015-01-08        company B         Bill            2    0
4               5  2015-01-09        company B         Alex            3    3
5               6  2015-01-10        company B         Bill            3    0
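One way to see why the repeated Bill rows come out as 0 rather than the value in cnt_desired is to print the dictionary that the lambda builds for the company B group. The sketch below rebuilds only the client_name column; the variable names `names`, `uniques` and `mapping` are illustrative and not part of the original code.

```python
import numpy as np
import pandas as pd

# Only the column needed to reproduce the mapping for the single employer group.
names = pd.Series(['Bill', 'John', 'Steve', 'Bill', 'Alex', 'Bill'])

# Same construction as in the lambda: each unique name -> index of its first appearance.
uniques = names.unique()
mapping = dict(zip(uniques, np.arange(len(uniques))))
print(mapping)

# Every occurrence of a name receives the same value, so a returning client
# gets the index of the first appearance instead of a running count.
cnt = names.map(mapping)
print(cnt.tolist())  # [0, 1, 2, 0, 3, 0]
```

Because the dictionary is fixed per group, it cannot tell the first Bill row apart from the later ones, which is where cnt and cnt_desired diverge.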
Answer 0 (score: 2)
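First use groupby on client_employer, then select the client_name column and use transform with a map based on a dict that has the unique values of client_name as keys and a range over the number of unique values as values: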
df['employers_count'] = df.groupby(['client_employer'])['client_name'].transform(
    lambda x: x.map(dict(zip(x.unique(), range(x.nunique())))))
         date  client_employer  client_name  employers_count
0  2015-01-05        company A         John                0
1  2015-01-06        company B         Bill                0
2  2015-01-07        company B         Bill                0
3  2015-01-08        company A        Sarah                1
4  2015-01-09        company B         Alex                1
5  2015-01-10        company B        Brian                2
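The mapping above gives each name the index of its first appearance, so it matches the desired output only as long as no earlier name reappears after a new one. A possible loop-free alternative is sketched below. It assumes the rows are already sorted by date and that the target is the number of distinct other names seen so far for the same employer, which is the reading that reproduces cnt_desired from the question. The names `is_first` and `cnt_alt` are illustrative.

```python
import pandas as pd

applications = pd.DataFrame({
    'client_employer': ['company B'] * 6,
    'client_name': ['Bill', 'John', 'Steve', 'Bill', 'Alex', 'Bill'],
})

# True on the first row where a given (employer, name) pair appears.
is_first = ~applications.duplicated(['client_employer', 'client_name'])

# Running number of distinct names per employer, current row included;
# subtracting 1 leaves out the current client.
cnt_alt = is_first.astype(int).groupby(applications['client_employer']).cumsum() - 1
print(cnt_alt.tolist())  # [0, 1, 2, 2, 3, 3]
```

On the six-row table with company A and company B this gives 0, 0, 0, 1, 1, 2, the same as employers_count.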