Given the following data:
bit val
0 bit_3 37.7
1 bit_16 36.7
2 bit_6 40.6
3 bit_10 48.4
4 bit_2 50.5
5 bit_14 40.8
6 bit_4 52.0
7 bit_17 50.8
8 bit_7 37.8
9 bit_1 49.6
10 bit_13 46.7
11 bit_0 40.9
12 bit_19 41.3
13 bit_18 41.6
14 bit_9 51.1
15 bit_15 41.1
16 bit_8 39.2
17 bit_12 51.7
18 bit_11 49.8
19 bit_5 55.1
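I want to sort the data by the bit column according to the trailing number.
If this were a standard Python list, I could do the following:
sorted(df["bit"].to_list(), key=lambda x: int(x.split("_")[-1]))
I'm not sure how to apply this to a DataFrame.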
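To make the problem reproducible, here is a minimal sketch that rebuilds the first five rows of the table by hand (the construction code is an assumption, since the question only shows the printed frame) and contrasts plain string ordering with the list-based sort described above.

```python
import pandas as pd

# Assumed construction: only the printed table appears in the question,
# so this rebuilds a five-row subset of it by hand.
df = pd.DataFrame(
    {
        "bit": ["bit_3", "bit_16", "bit_6", "bit_10", "bit_2"],
        "val": [37.7, 36.7, 40.6, 48.4, 50.5],
    }
)

# Plain string ordering compares character by character,
# so "bit_10" and "bit_16" land before "bit_2".
lexicographic = sorted(df["bit"].to_list())

# The list-based approach: order by the integer after the last underscore.
numeric = sorted(df["bit"].to_list(), key=lambda x: int(x.split("_")[-1]))

print(lexicographic)
print(numeric)
```

The first list shows why a default sort on the column is not enough; the second is the order the DataFrame rows should end up in.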
Answer 0 (score: 2)
Try using natsort:
from natsort import index_natsorted
df = df.iloc[index_natsorted(df.bit)]
df
Out[195]:
bit val
11 bit_0 40.9
9 bit_1 49.6
4 bit_2 50.5
0 bit_3 37.7
6 bit_4 52.0
19 bit_5 55.1
2 bit_6 40.6
8 bit_7 37.8
16 bit_8 39.2
14 bit_9 51.1
3 bit_10 48.4
18 bit_11 49.8
17 bit_12 51.7
10 bit_13 46.7
5 bit_14 40.8
15 bit_15 41.1
1 bit_16 36.7
7 bit_17 50.8
13 bit_18 41.6
12 bit_19 41.3
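If installing natsort is not an option, a rough stand-in can be sketched with the standard library. The helper natural_key below is hypothetical and is not how natsort is implemented; it only splits each label into digit and non-digit runs and compares the digit runs as integers. The five-row frame is an assumed subset of the question's data.

```python
import re

import pandas as pd


def natural_key(text):
    # Hypothetical helper: split into digit and non-digit runs,
    # and compare the digit runs as integers.
    return [int(tok) if tok.isdigit() else tok for tok in re.split(r"(\d+)", text)]


df = pd.DataFrame(
    {
        "bit": ["bit_3", "bit_16", "bit_6", "bit_10", "bit_2"],
        "val": [37.7, 36.7, 40.6, 48.4, 50.5],
    }
)

# Row positions in natural order, then positional indexing with iloc,
# mirroring the df.iloc[...] pattern used with index_natsorted.
order = sorted(range(len(df)), key=lambda i: natural_key(df["bit"].iloc[i]))
result = df.iloc[order]
print(result)
```

This sketch assumes every label has the same text/number layout; labels that mix layouts could end up comparing an integer against a string, which natsort handles and this helper does not.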
Answer 1 (score: 1)
Use df.sort_values together with .str.split("_", expand=True), and cast the result to integers with .astype(int):
df.sort_values('bit',key=lambda x: x.str.split("_",expand=True)[1].astype(int))
Output:
bit val
11 bit_0 40.9
9 bit_1 49.6
4 bit_2 50.5
0 bit_3 37.7
6 bit_4 52.0
19 bit_5 55.1
2 bit_6 40.6
8 bit_7 37.8
16 bit_8 39.2
14 bit_9 51.1
3 bit_10 48.4
18 bit_11 49.8
17 bit_12 51.7
10 bit_13 46.7
5 bit_14 40.8
15 bit_15 41.1
1 bit_16 36.7
7 bit_17 50.8
13 bit_18 41.6
12 bit_19 41.3
If you need to reset the index, just add .reset_index(drop=True):
df.sort_values('bit',key=lambda x: x.str.split("_",expand=True)[1].astype(int)).reset_index(drop=True)
Output:
bit val
0 bit_0 40.9
1 bit_1 49.6
2 bit_2 50.5
3 bit_3 37.7
4 bit_4 52.0
5 bit_5 55.1
6 bit_6 40.6
7 bit_7 37.8
8 bit_8 39.2
9 bit_9 51.1
10 bit_10 48.4
11 bit_11 49.8
12 bit_12 51.7
13 bit_13 46.7
14 bit_14 40.8
15 bit_15 41.1
16 bit_16 36.7
17 bit_17 50.8
18 bit_18 41.6
19 bit_19 41.3
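One detail that differs from the built-in sorted is that the key callable passed to sort_values receives the whole column as a Series and must return a Series of the same length. A minimal sketch of that, assuming a five-row subset of the data and a hypothetical helper named trailing_number; it uses rsplit so that only the part after the last underscore is taken, which would also cope with labels that contain more than one underscore:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "bit": ["bit_3", "bit_16", "bit_6", "bit_10", "bit_2"],
        "val": [37.7, 36.7, 40.6, 48.4, 50.5],
    }
)


def trailing_number(col):
    # col is the entire "bit" Series, not a single string.
    # rsplit with n=1 keeps only the part after the last underscore.
    return col.str.rsplit("_", n=1).str[-1].astype(int)


result = df.sort_values("bit", key=trailing_number).reset_index(drop=True)
print(result)
```

A named function is used here only for readability; an inline lambda behaves the same way.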
Answer 2 (score: 1)
With pandas >= 1.1.0, you can use key just as you would in sorted.
In my solution I sort on the bit column, but for sorting purposes I strip the bit_ prefix:
df.sort_values(
by='bit',
key=lambda x: x.str.replace('bit_', '').astype(int),
)
bit val
11 bit_0 40.9
9 bit_1 49.6
4 bit_2 50.5
0 bit_3 37.7
6 bit_4 52.0
Documentation for .sort_values():
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_values.html
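A small variation on the prefix-stripping approach, as a sketch on an assumed five-row subset of the data: passing regex=False makes str.replace treat 'bit_' as a literal string regardless of the pandas version's default, and the same key works with ascending=False for the reverse order.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "bit": ["bit_3", "bit_16", "bit_6", "bit_10", "bit_2"],
        "val": [37.7, 36.7, 40.6, 48.4, 50.5],
    }
)


def strip_prefix(col):
    # regex=False treats "bit_" as a literal string rather than a pattern.
    return col.str.replace("bit_", "", regex=False).astype(int)


ascending = df.sort_values(by="bit", key=strip_prefix)
descending = df.sort_values(by="bit", key=strip_prefix, ascending=False)
print(ascending)
print(descending)
```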
Answer 3 (score: 0)
An efficient approach is to create a Series sorted the way you want, then pass its index back to the DataFrame:
# create series of bit integers, sort them
bit_vals = df.bit.str.split("_", expand=True).loc[:, 1].astype(int)
sort_series = bit_vals.sort_values()
# pass back to dataframe; sort_series.index holds labels, so select with .loc
df = df.loc[sort_series.index]
Result:
bit val
11 bit_0 40.9
9 bit_1 49.6
4 bit_2 50.5
0 bit_3 37.7
6 bit_4 52.0
19 bit_5 55.1
2 bit_6 40.6
8 bit_7 37.8
16 bit_8 39.2
14 bit_9 51.1
3 bit_10 48.4
18 bit_11 49.8
17 bit_12 51.7
10 bit_13 46.7
5 bit_14 40.8
15 bit_15 41.1
1 bit_16 36.7
7 bit_17 50.8
13 bit_18 41.6
12 bit_19 41.3
You can reset the DataFrame index afterwards if needed.
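One caveat worth sketching: the index of the sorted Series contains index labels, not row positions. The two coincide for the default 0..n-1 index in the question, but not in general. The frame below is hypothetical and uses a non-default index to make the difference visible.

```python
import pandas as pd

# Hypothetical frame with a non-default index to show the difference.
df = pd.DataFrame(
    {"bit": ["bit_3", "bit_16", "bit_2"], "val": [37.7, 36.7, 50.5]},
    index=[10, 20, 30],
)

# Series of the trailing integers, sorted ascending.
bit_vals = df["bit"].str.split("_", expand=True).loc[:, 1].astype(int)
sort_series = bit_vals.sort_values()

# sort_series.index holds the labels 30, 10, 20, so label-based .loc matches;
# .iloc would raise IndexError here because those are not valid positions.
result = df.loc[sort_series.index]
print(result)
```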
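Answer 4 (score: 0)
You can combine str.extract with Series.argsort and df.iloc (argsort returns integer positions, so positional indexing is the safe match):
In [1038]: ix = df.bit.str.extract(r'(\d+)', expand=False).astype(int).argsort().tolist()
In [1039]: df.iloc[ix]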
Out[1039]:
bit val
11 bit_0 40.9
9 bit_1 49.6
4 bit_2 50.5
0 bit_3 37.7
6 bit_4 52.0
19 bit_5 55.1
2 bit_6 40.6
8 bit_7 37.8
16 bit_8 39.2
14 bit_9 51.1
3 bit_10 48.4
18 bit_11 49.8
17 bit_12 51.7
10 bit_13 46.7
5 bit_14 40.8
15 bit_15 41.1
1 bit_16 36.7
7 bit_17 50.8
13 bit_18 41.6
12 bit_19 41.3
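The extraction step can also be sketched for data where some labels carry no number at all. The frame below is hypothetical (the "parity" row is not in the question). With expand=False, str.extract returns a Series in which non-matching rows are NaN, so this sketch converts with pd.to_numeric rather than astype(int), which would raise on the missing value.

```python
import pandas as pd

# Hypothetical data: one label has no trailing number.
df = pd.DataFrame(
    {
        "bit": ["bit_3", "bit_16", "parity", "bit_2"],
        "val": [37.7, 36.7, 1.0, 50.5],
    }
)

# expand=False returns a Series; rows without a match become NaN.
digits = df["bit"].str.extract(r"(\d+)", expand=False)

# to_numeric keeps the NaN (as a float) instead of failing on it.
numbers = pd.to_numeric(digits)

# sort_values places NaN last by default; its index holds labels,
# so the rows are selected with label-based .loc.
result = df.loc[numbers.sort_values().index]
print(result)
```

Rows without a number end up at the bottom; passing na_position="first" to sort_values would move them to the top instead.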