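I generate two (in practice, hundreds of) DataFrames and then concatenate them. I want to sort the result while keeping rows that share the same value in column D together, in their original order: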
In [120]: df_list[0]
Out[120]:
A B C D
0 0.564678 0.598355 0.606693 MA0835
1 0.066291 0.063587 0.662292 MA0835
2 0.000000 0.000000 0.010758 MA0835
3 0.000000 0.000000 0.097895 MA0835
4 0.000000 0.000000 0.136468 MA0835
In [121]: df_list[1]
Out[121]:
A B C D
0 0.628844 0.614492 0.570333 MA1002
1 0.317790 0.293189 0.239368 MA1002
2 0.000000 0.000000 0.000000 MA1002
3 0.000000 0.000000 0.000000 MA1002
4 0.000000 0.000000 0.000000 MA1002
In [122]: df = pd.concat(df_list[0:2])
In [122]: df
Out[122]:
A B C D
0 0.564678 0.598355 0.606693 MA0835
1 0.066291 0.063587 0.662292 MA0835
2 0.000000 0.000000 0.010758 MA0835
3 0.000000 0.000000 0.097895 MA0835
4 0.000000 0.000000 0.136468 MA0835
0 0.628844 0.614492 0.570333 MA1002
1 0.317790 0.293189 0.239368 MA1002
2 0.000000 0.000000 0.000000 MA1002
3 0.000000 0.000000 0.000000 MA1002
4 0.000000 0.000000 0.000000 MA1002
A standard sort produces:
In [125]: df.sort_values('A',ascending=False)
Out[125]:
A B C D
0 0.628844 0.614492 0.570333 MA1002
0 0.564678 0.598355 0.606693 MA0835
1 0.317790 0.293189 0.239368 MA1002
1 0.066291 0.063587 0.662292 MA0835
2 0.000000 0.000000 0.010758 MA0835
3 0.000000 0.000000 0.097895 MA0835
4 0.000000 0.000000 0.136468 MA0835
2 0.000000 0.000000 0.000000 MA1002
3 0.000000 0.000000 0.000000 MA1002
4 0.000000 0.000000 0.000000 MA1002
However, I want to sort on A while preserving the row grouping given by D. This is the desired output:
A B C D
0 0.628844 0.614492 0.570333 MA1002
1 0.317790 0.293189 0.239368 MA1002
2 0.000000 0.000000 0.000000 MA1002
3 0.000000 0.000000 0.000000 MA1002
4 0.000000 0.000000 0.000000 MA1002
0 0.564678 0.598355 0.606693 MA0835
1 0.066291 0.063587 0.662292 MA0835
2 0.000000 0.000000 0.010758 MA0835
3 0.000000 0.000000 0.097895 MA0835
4 0.000000 0.000000 0.136468 MA0835
Do I need to use groupby, or is there some other sorting/grouping technique that I am not familiar with?
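
For reference, here is a minimal sketch of one direction that might work. It rests on an assumption that is not spelled out above: each D group is ranked by its largest A value, and the groups are then emitted in that order with their rows untouched. The names group_max, group_key, positions and result are only illustrative. Because the concatenated index contains duplicates (0..4 appears twice), the sketch reorders by position rather than by label.

```python
import numpy as np
import pandas as pd

# Rebuild the two example frames from the question.
df_list = [
    pd.DataFrame({
        "A": [0.564678, 0.066291, 0.0, 0.0, 0.0],
        "B": [0.598355, 0.063587, 0.0, 0.0, 0.0],
        "C": [0.606693, 0.662292, 0.010758, 0.097895, 0.136468],
        "D": "MA0835",
    }),
    pd.DataFrame({
        "A": [0.628844, 0.317790, 0.0, 0.0, 0.0],
        "B": [0.614492, 0.293189, 0.0, 0.0, 0.0],
        "C": [0.570333, 0.239368, 0.0, 0.0, 0.0],
        "D": "MA1002",
    }),
]
df = pd.concat(df_list[0:2])

# Assumption: a group's rank is determined by its maximum A value.
group_max = df.groupby("D")["A"].max()

# Give every row the key of the group it belongs to.
group_key = df["D"].map(group_max)

# A stable argsort on the negated key orders groups from largest to
# smallest while leaving rows inside each group in their original order.
positions = np.argsort(-group_key.to_numpy(), kind="stable")

# Reorder by position, since the index labels are not unique.
result = df.iloc[positions]
print(result)
```

If two groups happened to share the same maximum, the stable sort would keep them in their concatenation order, so each group would still stay contiguous as long as it was contiguous in df to begin with.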