Creating a chain in a pandas dataframe using a mapping between two columns

Time: 2018-09-18 15:16:39

Tags: python pandas dataframe

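Here is a test dataframe. I want to use the relationship between EmpID and MgrID to look up, in a new column, the manager of each MgrID, and then continue further up the chain.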

import pandas as pd

Test_df = pd.DataFrame({'EmpID':['1','2','3','4','5','6','7','8','9','10'], 
                    'MgrID':['4','4','4','6','8','8','10','10','10','12']})
Test_df

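If I build a dictionary for the initial relationship, I can create the first link in the chain, but I am fairly sure I would then need to loop over each new column to create the next link.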

ID_Dict = {'1':'4',
           '2':'4',
           '3':'4',
           '4':'6',
           '5':'8',
           '6':'8',
           '7':'10',
           '8':'10',
           '9':'10',
          '10':'12'}
Test_df['MgrID_L2'] = Test_df['MgrID'].map(ID_Dict)
Test_df

What is the most efficient way to do this? Thanks!

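1 Answer:

Answer 0: (score: 1)

Here is an approach with a simple while loop. Note that I renamed MgrID to MgrID_1 so that every level follows the same naming pattern.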

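import pandas as pd

Test_df = pd.DataFrame({'EmpID':['1','2','3','4','5','6','7','8','9','10'], 
                        'MgrID_1':['4','4','4','6','8','8','10','10','10','12']})

# Map each employee to their direct manager
d = Test_df.set_index('EmpID').MgrID_1.to_dict()

s = 2
while s:
    # Look up the manager of the previous level's manager
    Test_df['MgrID_'+str(s)] = Test_df['MgrID_'+str(s-1)].map(d)
    if Test_df['MgrID_'+str(s)].isnull().all():
        # No managers left at this level: drop the empty column and stop
        Test_df = Test_df.drop(columns='MgrID_'+str(s))
        s = 0
    else:
        s += 1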

Resulting Test_df:

  EmpID MgrID_1 MgrID_2 MgrID_3 MgrID_4 MgrID_5
0     1       4       6       8      10      12
1     2       4       6       8      10      12
2     3       4       6       8      10      12
3     4       6       8      10      12     NaN
4     5       8      10      12     NaN     NaN
5     6       8      10      12     NaN     NaN
6     7      10      12     NaN     NaN     NaN
7     8      10      12     NaN     NaN     NaN
8     9      10      12     NaN     NaN     NaN
9    10      12     NaN     NaN     NaN     NaN
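
One caveat with the loop above: it stops only when a whole level is NaN, so it would never terminate if the hierarchy contained a cycle (for example, an employee listed as their own manager). A minimal sketch of the same idea wrapped in a function with a safety cap follows. The function name `add_manager_chain` and the `max_levels` parameter are hypothetical additions for illustration, not part of the answer's code.

```python
import pandas as pd


def add_manager_chain(df, emp_col="EmpID", prefix="MgrID_", max_levels=50):
    """Append MgrID_2, MgrID_3, ... until a level contains no managers.

    Assumes the direct-manager column is named prefix + "1".
    max_levels is a hypothetical safety cap that stops a cyclic hierarchy
    from looping forever.
    """
    out = df.copy()
    # Employee -> direct manager lookup, as in the answer above
    lookup = out.set_index(emp_col)[f"{prefix}1"].to_dict()

    level = 2
    while level <= max_levels:
        next_level = out[f"{prefix}{level - 1}"].map(lookup)
        if next_level.isna().all():
            break  # top of the hierarchy reached for every row
        out[f"{prefix}{level}"] = next_level
        level += 1
    else:
        # Runs only if the cap was hit without reaching an all-NaN level
        raise ValueError("max_levels exceeded; the hierarchy may contain a cycle")
    return out


Test_df = pd.DataFrame({'EmpID': ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10'],
                        'MgrID_1': ['4', '4', '4', '6', '8', '8', '10', '10', '10', '12']})
result = add_manager_chain(Test_df)
```

With the sample data, `result` should have the same columns as the table above (MgrID_1 through MgrID_5), and the original `Test_df` is left unmodified because the function works on a copy.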