When I sort a DataFrame by multiple columns (['Symbol','Year','Month','Day']), the resulting DataFrame is sorted by Symbol > Year > Month, but Day is not sorted:
In [1]: df = pd.DataFrame({'Symbol': {79: 'F', 81: 'F', 82: 'F', 83: 'F', 84: 'F', 85: 'F', 86: 'F', 87: 'F', 89: 'F'}, 'Shares': {79: 100, 81: 100, 82: 100, 83: 100, 84: 100, 85: 100, 86: 100, 87: 100, 89: 100}, 'Month': {79: '08', 81: '08', 82: '08', 83: '08', 84: '08', 85: '08', 86: '08', 87: '08', 89: '09'}, 'Year': {79: '2008', 81: '2008', 82: '2008', 83: '2008', 84: '2008', 85: '2008', 86: '2008', 87: '2008', 89: '2008'}, 'Action': {79: 'Sell', 81: 'Sell', 82: 'Buy', 83: 'Sell', 84: 'Buy', 85: 'Sell', 86: 'Buy', 87: 'Sell', 89: 'Sell'}, 'Day': {79: 2L, 81: 4L, 82: '06', 83: 11L, 84: '13', 85: 18L, 86: '18', 87: 23L, 89: 22L}})
In [2]: df
Out[2]:
Action Day Month Shares Symbol Year
79 Sell 2 08 100 F 2008
81 Sell 4 08 100 F 2008
82 Buy 06 08 100 F 2008
83 Sell 11 08 100 F 2008
84 Buy 13 08 100 F 2008
85 Sell 18 08 100 F 2008
86 Buy 18 08 100 F 2008
87 Sell 23 08 100 F 2008
89 Sell 22 09 100 F 2008
In [3]: df.sort(['Symbol','Year','Month','Day'])
Out[3]:
Action Day Month Shares Symbol Year
79 Sell 2 08 100 F 2008
81 Sell 4 08 100 F 2008
83 Sell 11 08 100 F 2008
85 Sell 18 08 100 F 2008
87 Sell 23 08 100 F 2008
82 Buy 06 08 100 F 2008
84 Buy 13 08 100 F 2008
86 Buy 18 08 100 F 2008
89 Sell 22 09 100 F 2008
Why doesn't sort work as expected?
Answer 0 (score: 1)
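It doesn't work as expected because Day is stored with mixed types (strings and long integers). In Python 2, any string compares as "greater than" any number, so every integer day sorts ahead of every string day, which makes the result look unsorted.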
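To make the mixed types visible, here is a minimal sketch in plain Python 3. The list `days` is copied from the question's Day column with the Python 2 `L` suffix dropped, and the names `type_counts` and `mixed_sort_ok` are only illustrative. Note that Python 3 behaves differently from the Python 2 session above: it refuses to compare `str` with `int` instead of ordering them silently.

```python
# Day values from the question, without the Python 2 `L` suffix.
days = [2, 4, '06', 11, '13', 18, '18', 23, 22]

# Count how many values of each type the column holds.
type_counts = {}
for d in days:
    name = type(d).__name__
    type_counts[name] = type_counts.get(name, 0) + 1
print(type_counts)

# Python 2 placed every int before every str; Python 3 raises TypeError instead.
try:
    sorted(days)
    mixed_sort_ok = True
except TypeError:
    mixed_sort_ok = False
print(mixed_sort_ok)

# Once every value is an int, sorting gives the numeric order.
print(sorted(int(d) for d in days))
```

On a real DataFrame, something like `df['Day'].map(type).value_counts()` should give the same kind of per-type count.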
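Convert this column to integers: df['Day'] = df['Day'].apply(int)
I would also consider doing the same for month and year, since in your DataFrame these are strings (and probably make more sense as ints):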
df['Mo.'] = df['Mo.'].apply(int)
df['Year'] = df['Year'].apply(int)
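For readers on a recent pandas, the same fix can be sketched end to end as follows. This is an assumption-laden rewrite rather than the original session: it uses the question's column names (`Month`, not `Mo.`), drops the Python 2 `L` suffix, omits the Shares column, and calls `sort_values`, since `DataFrame.sort` was deprecated and later removed from pandas.

```python
import pandas as pd

# Rebuild the question's frame; Day mixes ints and strings on purpose.
df = pd.DataFrame({
    'Symbol': ['F'] * 9,
    'Year': ['2008'] * 9,
    'Month': ['08'] * 8 + ['09'],
    'Day': [2, 4, '06', 11, '13', 18, '18', 23, 22],
    'Action': ['Sell', 'Sell', 'Buy', 'Sell', 'Buy', 'Sell', 'Buy', 'Sell', 'Sell'],
}, index=[79, 81, 82, 83, 84, 85, 86, 87, 89])

# Convert each date column to int so values compare numerically.
for col in ['Year', 'Month', 'Day']:
    df[col] = df[col].apply(int)

# sort_values replaces the older DataFrame.sort used in the transcripts.
result = df.sort_values(['Symbol', 'Year', 'Month', 'Day'])
print(result)
```

After the conversion, Day is ordered within each month, and the single September row comes last.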
Then you can sort by day:
In [11]: df.sort(['Day'])
Out[11]:
Indx Year Mo. Day Sym Action Shares
0 79 2008 8 2 F Sell 100
1 81 2008 8 4 F Sell 100
5 82 2008 8 6 F Buy 100
2 83 2008 8 11 F Sell 100
6 84 2008 8 13 F Buy 100
3 85 2008 8 18 F Sell 100
7 86 2008 8 18 F Buy 100
8 89 2008 9 22 F Sell 100
4 87 2008 8 23 F Sell 100
Or sort by multiple columns:
In [12]: df.sort(['Mo.', 'Day'])
Out[12]:
Indx Year Mo. Day Sym Action Shares
0 79 2008 8 2 F Sell 100
1 81 2008 8 4 F Sell 100
5 82 2008 8 6 F Buy 100
2 83 2008 8 11 F Sell 100
6 84 2008 8 13 F Buy 100
3 85 2008 8 18 F Sell 100
7 86 2008 8 18 F Buy 100
4 87 2008 8 23 F Sell 100
8 89 2008 9 22 F Sell 100
In [13]: df.sort(['Day', 'Mo.'])
Out[13]:
Indx Year Mo. Day Sym Action Shares
0 79 2008 8 2 F Sell 100
1 81 2008 8 4 F Sell 100
5 82 2008 8 6 F Buy 100
2 83 2008 8 11 F Sell 100
6 84 2008 8 13 F Buy 100
3 85 2008 8 18 F Sell 100
7 86 2008 8 18 F Buy 100
8 89 2008 9 22 F Sell 100
4 87 2008 8 23 F Sell 100
Using the ascending parameter:
In [14]: df.sort(['Mo.', 'Day'], ascending=[True, False])
Out[14]:
Indx Year Mo. Day Sym Action Shares
4 87 2008 8 23 F Sell 100
3 85 2008 8 18 F Sell 100
7 86 2008 8 18 F Buy 100
6 84 2008 8 13 F Buy 100
2 83 2008 8 11 F Sell 100
5 82 2008 8 6 F Buy 100
1 81 2008 8 4 F Sell 100
0 79 2008 8 2 F Sell 100
8 89 2008 9 22 F Sell 100
...will work as expected.
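The per-column `ascending` list can also be tried with `sort_values`. The sketch below is a cut-down stand-in for the answer's frame (only the `Mo.` and `Day` columns, already converted to int), so the index labels differ from the transcript above.

```python
import pandas as pd

# Only the two sort keys from the answer's frame, already as ints.
df = pd.DataFrame({
    'Mo.': [8, 8, 8, 8, 8, 8, 8, 8, 9],
    'Day': [2, 4, 6, 11, 13, 18, 18, 23, 22],
})

# Month ascending, then Day descending within each month.
result = df.sort_values(['Mo.', 'Day'], ascending=[True, False])
print(result)
```

Each entry in `ascending` pairs with the column at the same position in the sort list, so here August rows come first with their days counting down, followed by the September row.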