I want to sort this by the "market" column, but the normal "sort_values" from pandas doesn't work. I want to order the rows by set and game. Can you help?
+---+------------------------------+----------+----------+
| | market | stake | profit |
+---+------------------------------+----------+----------+
| 0 | Game Winner (Set 1, Game 1) | 1605.50 | -1020.30 |
| 1 | Game Winner (Set 1, Game 10) | 2825.00 | 85.42 |
| 2 | Game Winner (Set 1, Game 11) | 700.00 | 100.00 |
| 3 | Game Winner (Set 1, Game 12) | 2280.40 | 9.60 |
| 4 | Game Winner (Set 1, Game 2) | 5688.30 | -1516.84 |
| 5 | Game Winner (Set 1, Game 3) | 2604.00 | -1205.70 |
| 6 | Game Winner (Set 1, Game 4) | 4638.56 | -1817.72 |
| 7 | Game Winner (Set 1, Game 5) | 3600.00 | 1488.00 |
| 8 | Game Winner (Set 1, Game 6) | 8851.72 | -2776.65 |
| 9 | Game Winner (Set 1, Game 7) | 10477.00 | -2097.00 |
+---+------------------------------+----------+----------+
This is my df. The code I am using is:
test = df.groupby("market")[["stake","profit"]].sum().reset_index().sort_values("market")
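Answer 0 (score: 1)
The sort is working correctly. Your set and game numbers are embedded in a single string, and strings sort in collation order (character by character), not by their apparent numeric value, which is why "Game 10" lands before "Game 2". If you want to sort by those numbers, you have to split out the numeric parts, convert them to integers, and sort on those values rather than on the concatenated string.
Is that enough to get you moving?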
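As a minimal sketch of the idea described above, using only the standard library: the helper name set_game_key and its regular expression are illustrative assumptions, and the "Set 2" entry is made up to show the ordering across sets (it is not in the question's data).

```python
import re

# Sample market labels; the first three are from the question,
# the "Set 2" entry is a made-up value added for illustration.
markets = [
    "Game Winner (Set 1, Game 1)",
    "Game Winner (Set 1, Game 10)",
    "Game Winner (Set 1, Game 2)",
    "Game Winner (Set 2, Game 1)",
]


def set_game_key(market):
    """Hypothetical helper: return (set, game) as a tuple of integers."""
    match = re.search(r"Set\s*(\d+),\s*Game\s*(\d+)", market)
    if match is None:
        raise ValueError(f"no set/game numbers found in {market!r}")
    return int(match.group(1)), int(match.group(2))


# Plain string sort: compares character by character, so "Game 10" < "Game 2".
by_string = sorted(markets)

# Numeric sort: compares the (set, game) integer tuples instead.
by_number = sorted(markets, key=set_game_key)

print(by_string)
print(by_number)
```

The same key idea carries over to a DataFrame: extract the numbers into integer columns and sort on those columns instead of on the label.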
Answer 1 (score: 1)
Use natsort:
from natsort import natsorted
l=df.market.tolist()
df=df.set_index('market').loc[natsorted(l)].reset_index()
df
Out[130]:
market stake profit
0 Game Winner (Set 1, Game 1) 1605.50 -1020.30
1 Game Winner (Set 1, Game 2) 5688.30 -1516.84
2 Game Winner (Set 1, Game 3) 2604.00 -1205.70
3 Game Winner (Set 1, Game 4) 4638.56 -1817.72
4 Game Winner (Set 1, Game 5) 3600.00 1488.00
5 Game Winner (Set 1, Game 6) 8851.72 -2776.65
6 Game Winner (Set 1, Game 7) 10477.00 -2097.00
7 Game Winner (Set 1, Game 10) 2825.00 85.42
8 Game Winner (Set 1, Game 11) 700.00 100.00
9 Game Winner (Set 1, Game 12) 2280.40 9.60
Answer 2 (score: 0)
You can try reindexing based on the extracted numbers:
idx = (df.market.str
.extract(r'(?P<set>\d+),\s*Game\s*(?P<game>\d+)', expand=True)
.astype(int)
.sort_values(['set','game'])
.index)
df.reindex(idx)
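For the grouped result from the question, a variant of the same extraction might look like the sketch below. It assumes every market label contains "Set N, Game M"; the names totals and nums and the helper columns "set" and "game" are illustrative, and the sample rows are a subset of the question's data.

```python
import pandas as pd

# A few rows copied from the question's table.
df = pd.DataFrame({
    "market": [
        "Game Winner (Set 1, Game 1)",
        "Game Winner (Set 1, Game 10)",
        "Game Winner (Set 1, Game 11)",
        "Game Winner (Set 1, Game 2)",
    ],
    "stake": [1605.50, 2825.00, 700.00, 5688.30],
    "profit": [-1020.30, 85.42, 100.00, -1516.84],
})

# Same aggregation as in the question, without the string sort.
totals = df.groupby("market")[["stake", "profit"]].sum().reset_index()

# Pull the set and game numbers out of the label as integer columns.
# astype(int) fails if a label does not match the pattern (assumption above).
nums = (totals["market"]
        .str.extract(r"Set\s*(?P<set>\d+),\s*Game\s*(?P<game>\d+)")
        .astype(int))

# Sort on the integer columns, then drop the helpers again.
test = (totals.join(nums)
        .sort_values(["set", "game"])
        .drop(columns=["set", "game"])
        .reset_index(drop=True))

print(test)
```

Keeping the numbers as temporary columns avoids relying on the index being unique, which the reindex approach needs.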