Question

Suppose I have a DataFrame whose rows each contain several phrases separated by commas, like this:

>>> df = pd.DataFrame({'phrase': ['dog, cat, cow', 'bird, cat', 'cow, bird', 'dog, cow', 'bird']})
>>> df
          phrase
0  dog, cat, cow
1      bird, cat
2      cow, bird
3       dog, cow
4           bird

I want to sort it so that the rows with "bird" in the phrase column come first, like this:

          phrase
0      bird, cat
1      cow, bird
2           bird
3  dog, cat, cow
4       dog, cow

How can I do this? Thanks in advance!

Answer 1

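Use sorted with a custom key. Note that this writes the sorted values into a new column rather than reordering the rows of the DataFrame.

For example: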

import pandas as pd

df = pd.DataFrame({'phrase':['dog, cat, cow','bird, cat','cow, bird','dog, cow','bird']})
df["New"] = pd.Series(sorted(df["phrase"].tolist(), key=lambda x: 0 if "bird" in x else 1))
print(df)

Output:

          phrase            New
0  dog, cat, cow      bird, cat
1      bird, cat      cow, bird
2      cow, bird           bird
3       dog, cow  dog, cat, cow
4           bird       dog, cow
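If the goal is to reorder the rows themselves instead of adding a column, the same custom-key idea can be sketched with the key argument of sort_values. This assumes pandas 1.1 or later, where sort_values accepts key; the variable name result is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'phrase': ['dog, cat, cow', 'bird, cat', 'cow, bird', 'dog, cow', 'bird']})

# key receives the whole column as a Series and must return a Series of the same length.
# False sorts before True, so rows containing "bird" (where the negated mask is False) come first.
# kind='mergesort' is a stable sort, so rows within each group keep their original order.
result = df.sort_values(
    by='phrase',
    key=lambda s: ~s.str.contains('bird'),
    kind='mergesort',
)
print(result)
```

The original index labels are kept, so the rows come out labelled 1, 2, 4, 0, 3.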

Answer 2

Add a helper "has_bird" column, sort by it, and then drop it once it is no longer needed.

(df.assign(has_bird=df.phrase.apply(lambda l: 'bird' in l))
    .sort_values(by='has_bird', ascending=False)
    .drop('has_bird', axis=1))
          phrase
1      bird, cat
2      cow, bird
4           bird
0  dog, cat, cow
3       dog, cow

You can chain assign, sort_values, and drop.

If you are using an older version of pandas, use

df['has_bird'] = df.phrase.apply(lambda l: 'bird' in l)
df.sort_values(by='has_bird', ascending=False).drop('has_bird', axis=1)

Answer 3

You can use Series.str.contains to build a boolean mask, invert the condition, call Series.argsort to get the positions, and finally reorder the rows with DataFrame.iloc:

df = df.iloc[(~df['phrase'].str.contains('bird')).argsort()]
print (df)
          phrase
1      bird, cat
2      cow, bird
4           bird
0  dog, cat, cow
3       dog, cow

Details:

a = df['phrase'].str.contains('bird')
b = (~df['phrase'].str.contains('bird'))
c = (~df['phrase'].str.contains('bird')).argsort()
print (df.assign(contains=a, invert=b, argsort=c))
          phrase  contains  invert  argsort
0  dog, cat, cow     False    True        1
1      bird, cat      True   False        2
2      cow, bird      True   False        4
3       dog, cow     False    True        0
4           bird      True   False        3
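One caveat with the argsort approach: Series.argsort defaults to kind='quicksort', which does not guarantee the relative order of tied values. A minimal sketch, assuming you want rows within each group to keep their original order, is to request a stable sort explicitly. The names mask, order and result are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'phrase': ['dog, cat, cow', 'bird, cat', 'cow, bird', 'dog, cow', 'bird']})

mask = df['phrase'].str.contains('bird')

# Work on the underlying NumPy array so that kind= goes straight to numpy.argsort.
# A stable sort keeps ties in their original order: False (has bird) first, then True.
order = (~mask).to_numpy().argsort(kind='stable')

result = df.iloc[order]
print(result)
```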

Answer 4

Simply create an additional boolean column based on the condition, then sort by that column. The code below should work.

import pandas as pd

df = pd.DataFrame({'phrase':['dog, cat, cow','bird, cat','cow, bird','dog, cow','bird']})
df['bird_exists'] = df['phrase'].apply(lambda x : 'bird' in x.lower())

df = df.sort_values('bird_exists', ascending=False)

print(df.head())
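The code above leaves the helper column in place and keeps the original index labels, whereas the desired output in the question shows only the phrase column with the index renumbered from 0. A possible sketch that produces that shape follows; the helper column name no_bird is hypothetical, and the stable sort is an assumption about wanting ties kept in their original order.

```python
import pandas as pd

df = pd.DataFrame({'phrase': ['dog, cat, cow', 'bird, cat', 'cow, bird', 'dog, cow', 'bird']})

result = (
    # Helper column is True when "bird" is absent, so an ascending sort puts bird rows first.
    df.assign(no_bird=~df['phrase'].str.lower().str.contains('bird'))
      # mergesort is stable, so rows within each group keep their original order.
      .sort_values('no_bird', kind='mergesort')
      # Remove the helper column and renumber the index 0..n-1.
      .drop(columns='no_bird')
      .reset_index(drop=True)
)
print(result)
```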

Sorting rows that contain a string first in a Pandas DataFrame