Assign priority based on column values and select rows

Time: 2019-05-13 10:54:07

Tags: python pandas

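I want to assign priorities across multiple columns and select rows based on those priorities.

For each ID, I want to prioritize RC in the 'Category' column, then Pending in the 'Status' column, and select the row accordingly.

Example: input DataFrame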

ID  Category  Status    Date
1   GC        Pending   01-03-2015
1   RC        Resolved  05-10-2016
1   GC        Resolved  06-03-2017
2   RC        Pending   09-08-2016
2   RC        Resolved  10-05-2014
3   GC        Resolved  10-08-2018
3   RC        Pending   13-05-2019
4   GC        Pending   10-06-2018
4   GC        Resolved  15-09-2014

Output DataFrame

ID  Category  Status    Date
1   RC        Resolved  05-10-2016
2   RC        Pending   09-08-2016
3   RC        Pending   13-05-2019
4   GC        Pending   10-06-2018

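1 answer:

Answer 0: (score: 2)

Convert the columns to ordered categoricals, defining the priority by the list passed to the categories parameter (lowest priority first). Then sort by the three columns with DataFrame.sort_values, and finally remove duplicates with DataFrame.drop_duplicates. With keep='last', the row that survives for each ID is the one with the highest priority.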

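import pandas as pd

# Categories are listed from lowest to highest priority
df['Category'] = pd.Categorical(df['Category'], ordered=True, categories=['GC','RC'])
df['Status'] = pd.Categorical(df['Status'], ordered=True, categories=['Resolved','Pending'])

# After sorting, the highest-priority row comes last within each ID
df = df.sort_values(['ID','Category','Status']).drop_duplicates('ID', keep='last')
print(df)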
   ID Category    Status        Date
1   1       RC  Resolved  05-10-2016
3   2       RC   Pending  09-08-2016
6   3       RC   Pending  13-05-2019
7   4       GC   Pending  10-06-2018
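
For anyone who wants to try this end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question and applies the same steps. The helper name pick_by_priority is hypothetical (it is not part of pandas), and the Date column is kept as plain strings, which is an assumption about the original data.

```python
import pandas as pd

# Sample data copied from the question; Date is kept as plain strings (assumption).
df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 3, 3, 4, 4],
    "Category": ["GC", "RC", "GC", "RC", "RC", "GC", "RC", "GC", "GC"],
    "Status": ["Pending", "Resolved", "Resolved", "Pending", "Resolved",
               "Resolved", "Pending", "Pending", "Resolved"],
    "Date": ["01-03-2015", "05-10-2016", "06-03-2017", "09-08-2016", "10-05-2014",
             "10-08-2018", "13-05-2019", "10-06-2018", "15-09-2014"],
})


def pick_by_priority(frame):
    """Hypothetical helper: keep one row per ID, preferring RC, then Pending."""
    out = frame.copy()  # leave the caller's DataFrame untouched
    # Categories are listed from lowest to highest priority.
    out["Category"] = pd.Categorical(
        out["Category"], categories=["GC", "RC"], ordered=True)
    out["Status"] = pd.Categorical(
        out["Status"], categories=["Resolved", "Pending"], ordered=True)
    # Ascending sort puts the highest-priority row last within each ID,
    # so keep="last" retains exactly that row.
    return (out.sort_values(["ID", "Category", "Status"])
               .drop_duplicates("ID", keep="last"))


result = pick_by_priority(df)
print(result)
```

One thing to watch for: any value that is not in the categories list becomes NaN in the categorical column, and sort_values places NaN last by default, so such a row could end up being the one kept. If the real data can contain other categories or statuses, it is probably worth adding them to the lists explicitly.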