I need to sort a DataFrame by a column whose values mix numbers and letters.
import pandas as pd

df = pd.DataFrame([{"user": "seth",
                    "name": "1"},
                   {"user": "chris",
                    "name": "10A"},
                   {"user": "aaron",
                    "name": "4B"},
                   {"user": "dan",
                    "name": "10B"}])
My code:
df1 = df.sort_values(by=['name'])
This gives me:
df1 = [{"user": "seth",
"name": "1"},
{"user" : "chris",
"name": "10A"},
{"user" : "dan",
"name": "10B"},
{"user" : "aaron",
"name": "4B"}]
What I want:
df1 = [{"user": "seth",
"name": "1"},
{"user" : "aaron",
"name": "4B"},
{"user" : "chris",
"name": "10A"},
{"user" : "dan",
"name": "10B"}]
Edit:
This was marked as a duplicate of a similar question, whose code is:
DPRexitPoints.reindex(index=natsorted(DPRexitPoints.PageName))
It returns a sorted DataFrame, but all of the values have been replaced by NaN.
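A likely explanation for the NaN values, assuming the DataFrame has the default integer index: reindex looks up the sorted column values as index labels, and none of them exist in the index, so every row comes back empty. The sketch below reproduces this and shows one possible workaround. It replaces natsorted with a small hypothetical helper, natural_key, so that it runs with pandas alone, and it assumes every value starts with digits and that the values are unique.

```python
import re

import pandas as pd

df = pd.DataFrame({
    "user": ["seth", "chris", "aaron", "dan"],
    "name": ["1", "10A", "4B", "10B"],
})


def natural_key(value):
    # Hypothetical helper: split "10A" into [10, "A"] so digits compare as numbers.
    # Assumes every value starts with digits; mixed leading types would not compare.
    return [int(tok) if tok.isdigit() else tok
            for tok in re.findall(r"\d+|\D+", value)]


ordered = sorted(df["name"], key=natural_key)

# The labels in `ordered` are strings such as "4B", but the index is 0..3,
# so no label matches and every cell becomes NaN.
broken = df.reindex(index=ordered)

# Workaround sketch: make 'name' the index first, then reindex on it.
# Requires unique 'name' values, since reindex rejects duplicate labels.
fixed = df.set_index("name").reindex(ordered).reset_index()

print(broken)
print(fixed)
```

With natsort installed, the same idea should apply to the original call: reindex on a column that has been set as the index, or sort by position rather than by label.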
Answer 0 (score: 1)
You can use np.argsort together with iloc:
import numpy as np

# Order the rows by the integer value of the leading digits in 'name'
df.iloc[np.argsort(df['name'].str
                             .extract(r'^(\d*)')[0]
                             .astype(int))
       ]
Output:
    user name
0   seth    1
2  aaron   4B
1  chris  10A
3    dan  10B
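Note that the snippet above sorts on the numeric prefix only, so rows such as 10A and 10B compare as equal and their relative order is not guaranteed. A minimal sketch of one way to break such ties by the letter suffix, assuming every value consists of leading digits followed by an optional suffix (the names parts, order and result are illustrative only):

```python
import pandas as pd

df = pd.DataFrame({
    "user": ["seth", "chris", "aaron", "dan"],
    "name": ["1", "10A", "4B", "10B"],
})

# Split each value into its numeric prefix and the remaining suffix.
parts = df["name"].str.extract(r"^(?P<num>\d+)(?P<suffix>.*)$")
parts["num"] = parts["num"].astype(int)

# Sort by the number first, then by the suffix, and reuse that row order.
order = parts.sort_values(["num", "suffix"]).index
result = df.loc[order]

print(result)
```

Rows whose value does not start with a digit would produce NaN in parts and make the integer conversion fail, so they would need separate handling.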