In pandas, is there a way to assign values based on a date range?

Asked: 2020-03-12 16:36:17

Tags: python pandas

I'll do my best to describe my problem. I have a table like this:

ID  |  PHASE  |  code    |      START       |  END 
A      SSH        SLP           01/02/2020  | 02/02/2020
A      SSH        DRL           02/02/2020  | 02/20/2020
A      SSH        TYU           01/03/2020  | 01/30/2020
A      SSH        TYT           01/04/2020  | 01/30/2020
A      MNH        TOH           01/15/2020  | 02/01/2020
A      MNH        DRL           02/05/2020  | 03/01/2020
A      MNH        TYU           01/16/2020  | 02/02/2020
A      MNH        TYT           01/19/2020  | 02/01/2020
A      SRF        RIC           02/10/2020  | 03/01/2020
A      SRF        DRL           02/19/2020  | 03/10/2020
A      SRF        TYU           02/13/2020  | 03/05/2020
A      SRF        TYT           02/11/2020  | 03/01/2020
(to E)

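The IDs run up to E, each with exactly the same PHASE values. I need to get the minimum date for each phase where "code" is not equal to "DRL", and I did it like this: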

phase_value = {'SSH': (1, 2, 3),
               'SRF': (4, 5, 6),
               'INT': (7, 8, 9),
               'MNH': (13, 14, 15)}

# Assumes merge_table is an existing DataFrame whose 'Start' and 'End' columns hold datetimes.
for key, codes in phase_value.items():
    in_phase = merge_table['Phase'] == key
    is_drl = merge_table['Actual Operation'] == 'DRL'

    # Row with the earliest non-DRL start, per well
    merge_table.loc[merge_table[in_phase & ~is_drl].groupby('WELL_ID')['Start'].idxmin(), "test"] = codes[0]
    # Row with the earliest DRL start, per well
    merge_table.loc[merge_table[in_phase & is_drl].groupby('WELL_ID')['Start'].idxmin(), "test"] = codes[1]
    # Row with the latest DRL end, per well
    merge_table.loc[merge_table[in_phase & is_drl].groupby('WELL_ID')['End'].idxmax(), "test"] = codes[2]

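So, for each key in the dict, this finds the minimum and assigns a value based on the phase code, for each ID. I'm sure there is a better way, but this works for me.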
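A minimal sketch for trying the loop on the ID A rows of phases SSH and MNH. It assumes that the table columns ID, PHASE, code and START correspond to WELL_ID, Phase, Actual Operation and Start in merge_table, and that the dates are MM/DD/YYYY strings that have to be parsed first. Only the first two assignments are shown.

```python
import pandas as pd

# Sample rows for ID A (phases SSH and MNH) copied from the table.
# The column-name mapping below is an assumption, not stated in the post.
merge_table = pd.DataFrame({
    "WELL_ID": ["A"] * 8,
    "Phase": ["SSH"] * 4 + ["MNH"] * 4,
    "Actual Operation": ["SLP", "DRL", "TYU", "TYT",
                         "TOH", "DRL", "TYU", "TYT"],
    "Start": ["01/02/2020", "02/02/2020", "01/03/2020", "01/04/2020",
              "01/15/2020", "02/05/2020", "01/16/2020", "01/19/2020"],
})

# Parse the strings so that min/max compare dates chronologically.
merge_table["Start"] = pd.to_datetime(merge_table["Start"], format="%m/%d/%Y")
merge_table["test"] = float("nan")

phase_value = {"SSH": (1, 2, 3), "MNH": (13, 14, 15)}

for key, codes in phase_value.items():
    in_phase = merge_table["Phase"] == key
    is_drl = merge_table["Actual Operation"] == "DRL"

    # idxmin returns the row label of the earliest Start in each well.
    first_rows = merge_table[in_phase & ~is_drl].groupby("WELL_ID")["Start"].idxmin()
    merge_table.loc[first_rows, "test"] = codes[0]

    drl_rows = merge_table[in_phase & is_drl].groupby("WELL_ID")["Start"].idxmin()
    merge_table.loc[drl_rows, "test"] = codes[1]

print(merge_table)
```

Without the pd.to_datetime step, the Start values would be compared as text rather than as dates.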

ID  |  PHASE  |  code    |      START       |  END         |   test
A      SSH        SLP           01/02/2020  | 02/02/2020        1
A      SSH        DRL           02/02/2020  | 02/20/2020        2
A      SSH        TYU           01/03/2020  | 01/30/2020
A      SSH        TYT           01/04/2020  | 01/30/2020
A      MNH        TOH           01/15/2020  | 02/01/2020        13
A      MNH        DRL           02/05/2020  | 03/01/2020        14
A      MNH        TYU           01/16/2020  | 02/02/2020
A      MNH        TYT           01/19/2020  | 02/01/2020
A      SRF        RIC           02/10/2020  | 03/01/2020        4
A      SRF        DRL           02/19/2020  | 03/10/2020        5
A      SRF        TYU           02/13/2020  | 03/05/2020
A      SRF        TYT           02/11/2020  | 03/01/2020
(to E)

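Now I also want to fill in a number for the other rows. For example, for phase SSH: if a row's start date is greater than the minimum but less than the start date of the row with value 2 (the row where PHASE == 'SSH' and "code" == "DRL"), that row should get a number as well (1 in the expected output below). This should apply for each ID and each PHASE.

I really hope I've made this clear. Thanks in advance.

Expected: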

ID  |  PHASE  |  code    |      START       |  END         |   test
A      SSH        SLP           01/02/2020  | 02/02/2020        1
A      SSH        DRL           02/02/2020  | 02/20/2020        2
A      SSH        TYU           01/03/2020  | 01/30/2020        1
A      SSH        TYT           01/04/2020  | 01/30/2020        1
A      MNH        TOH           01/15/2020  | 02/01/2020        13
A      MNH        DRL           02/05/2020  | 03/01/2020        14
A      MNH        TYU           01/16/2020  | 02/02/2020
A      MNH        TYT           01/19/2020  | 02/01/2020
A      SRF        RIC           02/10/2020  | 03/01/2020        4
A      SRF        DRL           02/19/2020  | 03/10/2020        5
A      SRF        TYU           02/13/2020  | 03/05/2020
A      SRF        TYT           02/11/2020  | 03/01/2020
(to E)
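One possible way to express this rule, as a sketch only. It uses the SSH rows of ID A, because the expected table fills in only that phase and the post does not state whether the other phases should be filled the same way. The names df, first_start, drl_start and between are made up for illustration, and the column names follow the loop in the question.

```python
import pandas as pd

# SSH rows for ID A, with the "test" values already set by the loop.
df = pd.DataFrame({
    "WELL_ID": ["A"] * 4,
    "Phase": ["SSH"] * 4,
    "Actual Operation": ["SLP", "DRL", "TYU", "TYT"],
    "Start": pd.to_datetime(
        ["01/02/2020", "02/02/2020", "01/03/2020", "01/04/2020"],
        format="%m/%d/%Y",
    ),
    "test": [1.0, 2.0, float("nan"), float("nan")],
})

phase_value = {"SSH": (1, 2, 3)}

is_drl = df["Actual Operation"] == "DRL"
keys = [df["WELL_ID"], df["Phase"]]

# Earliest non-DRL Start and earliest DRL Start per (well, phase),
# broadcast back onto every row of the group.
first_start = df["Start"].where(~is_drl).groupby(keys).transform("min")
drl_start = df["Start"].where(is_drl).groupby(keys).transform("min")

# Rows strictly between the two dates that have no value yet.
between = (df["Start"] > first_start) & (df["Start"] < drl_start) & df["test"].isna()

# Give those rows the first number of their phase (assumed from the expected table).
first_code = df["Phase"].map({k: v[0] for k, v in phase_value.items()})
df.loc[between, "test"] = first_code[between]

print(df)
```

On these four rows the TYU and TYT rows receive 1, which matches the SSH part of the expected table.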

0 answers:

No answers