I want to expand the keywords so that "bale cale" means the same as "cale bale". All keywords are strings.
Here is my dataset:
Keyword        Category_1  Category_2  Category_3
ale bale cale  bale        cale        cale
bale cale      cale        cale        ale
This is what I want:
Keyword        Category_1  Category_2  Category_3
ale bale cale  bale        cale        cale
ale cale bale  bale        cale        cale
bale ale cale  bale        cale        cale
bale cale ale  bale        cale        cale
cale ale bale  bale        cale        cale
cale bale ale  bale        cale        cale
bale cale      cale        cale        ale
cale bale      cale        cale        ale
Answer 0 (score: 2)
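Use itertools.permutations on the split values in a list comprehension, join each permutation back together with spaces, and store the result together with the original index value in a helper DataFrame, df1. Finally, join it to the original DataFrame: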
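from itertools import permutations
import pandas as pd

# Pair every word-order permutation of each keyword with its original row index
L = [(' '.join(y), k) for k, v in df['Keyword'].items() for y in permutations(v.split())]
# Helper DataFrame indexed by the original row index
df1 = pd.DataFrame(L, columns=['Keyword','idx']).set_index('idx')
print(df1)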
           Keyword
idx
0    ale bale cale
0    ale cale bale
0    bale ale cale
0    bale cale ale
0    cale ale bale
0    cale bale ale
1        bale cale
1        cale bale
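To make the permutation step above easier to see in isolation, here is a minimal standard-library sketch. The helper name keyword_orderings is hypothetical and not part of the answer. It also removes duplicate orderings, which itertools.permutations produces when a keyword repeats a word, because it treats elements as distinct by position. That deduplication is an assumption about what is wanted, not something the question asks for.

```python
from itertools import permutations


def keyword_orderings(keyword):
    """Return every distinct word ordering of a space-separated keyword."""
    words = keyword.split()
    # dict.fromkeys keeps first-seen order while dropping repeated orderings,
    # which only occur when the keyword contains the same word more than once.
    return list(dict.fromkeys(' '.join(p) for p in permutations(words)))


print(keyword_orderings('bale cale'))           # ['bale cale', 'cale bale']
print(len(keyword_orderings('ale bale cale')))  # 6
print(keyword_orderings('ale ale'))             # ['ale ale']
```

Note that a keyword with n distinct words expands to n! rows, so long keywords grow the result quickly.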
Another way to build df1:
vals, idx = list(zip(*L))
df1 = pd.DataFrame({'Keyword':vals}, index=idx).rename_axis('idx')
# Join the remaining columns back on the original row index, then renumber the rows
df = df1.join(df.drop('Keyword',axis=1), on='idx').reset_index(drop=True)
print(df)
         Keyword Category_1 Category_2 Category_3
0  ale bale cale       bale       cale       cale
1  ale cale bale       bale       cale       cale
2  bale ale cale       bale       cale       cale
3  bale cale ale       bale       cale       cale
4  cale ale bale       bale       cale       cale
5  cale bale ale       bale       cale       cale
6      bale cale       cale       cale        ale
7      cale bale       cale       cale        ale
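For comparison, the same result can probably be reached without a helper DataFrame. The sketch below is an alternative, not the method used in the answer: it replaces each keyword with a list of its permutations and then calls DataFrame.explode, which assumes pandas 0.25 or later. The sample frame is rebuilt from the question's data, and the function name expand_keywords is hypothetical.

```python
from itertools import permutations

import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    'Keyword': ['ale bale cale', 'bale cale'],
    'Category_1': ['bale', 'cale'],
    'Category_2': ['cale', 'cale'],
    'Category_3': ['cale', 'ale'],
})


def expand_keywords(frame, column='Keyword'):
    """Return a copy with one row per word ordering of each keyword."""
    out = frame.copy()
    # Replace each keyword string with a list of all its word orderings
    out[column] = out[column].apply(
        lambda s: [' '.join(p) for p in permutations(s.split())]
    )
    # explode gives each list element its own row and repeats the other columns
    return out.explode(column).reset_index(drop=True)


result = expand_keywords(df)
print(result)
```

The other columns are carried along automatically, so no join is needed; the original df is left unchanged because the function works on a copy.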