This is my dataset:
id keyword
1 transfer atm transfer atm
2 transfer transfer atm
3 atm transfer hospital
What I want is to sort the keywords alphabetically and make them unique, so that in keyword the word transfer comes after atm and hospital:
id keyword
1 atm transfer
2 atm transfer
3 atm hospital transfer
Answer 0 (score: 6)
Try this:
df['keyword'] = df['keyword'].apply(lambda x: ' '.join(sorted(set(x.split()))))
Output:
id keyword
0 1 atm transfer
1 2 atm transfer
2 3 atm hospital transfer
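Explanation: each value is split on whitespace, passed through set() to drop duplicates, sorted alphabetically, and joined back together with single spaces.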
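Here is a minimal sketch of what the one-liner does to a single value, one step at a time. It uses the keyword string from the row with id 1; the variable names are only illustrative and are not part of the answer.

```python
# One value from the question's keyword column (the row with id 1).
raw = "transfer atm transfer atm"

# Step 1: split on whitespace into a list of words.
tokens = raw.split()          # ['transfer', 'atm', 'transfer', 'atm']

# Step 2: a set keeps each word once, but has no defined order.
unique = set(tokens)

# Step 3: sorted() returns a new list in alphabetical order.
ordered = sorted(unique)      # ['atm', 'transfer']

# Step 4: join the words back into one space-separated string.
result = " ".join(ordered)    # 'atm transfer'

print(result)
```

Because a set is unordered, the sorted() call is what makes the result deterministic.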
Answer 1 (score: 5)
The idea is to split each value on whitespace, convert it to a set, sort it, and join the words back with spaces:
df['keyword'] = [' '.join(sorted(set(x.split()))) for x in df['keyword']]
# alternative using apply
# df['keyword'] = df['keyword'].apply(lambda x: ' '.join(sorted(set(x.split()))))
print(df)
id keyword
0 1 atm transfer
1 2 atm transfer
2 3 atm hospital transfer
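To reproduce this end to end, here is a small self-contained sketch. The DataFrame construction is an assumption about how the question's data is laid out (the real data may be loaded from a file), and sort_unique_words is a hypothetical helper name that simply wraps the same expression used in the answers.

```python
import pandas as pd

# Assumed reconstruction of the question's data; the real frame may be loaded differently.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "keyword": [
        "transfer atm transfer atm",
        "transfer transfer atm",
        "atm transfer hospital",
    ],
})


def sort_unique_words(text):
    """Split on whitespace, drop duplicates, sort alphabetically, rejoin with spaces."""
    return " ".join(sorted(set(text.split())))


# Same list-comprehension approach as in the answer, using the named helper.
df["keyword"] = [sort_unique_words(x) for x in df["keyword"]]
print(df)
```

Naming the helper makes it easy to reuse with apply as well, for example df["keyword"].apply(sort_unique_words).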