I have a DataFrame and I want to group it by URL, creating a new column for each additional word wherever the url is the same.
df2 = pd.DataFrame({'url':['/pool','/refrigerators','/refrigerators','/refrigerators','/joss-and-main','/furniture','/entertainment-centers-and-tv-stands'],
'word':['pool','refrigerator','fridge','cooler','joss and main','furniture','tv stand']})
Desired output:
Url                                   word           word1   word2
/pool                                 pool
/refrigerators                        refrigerator   fridge  cooler
/joss-and-main                        joss and main
/furniture                            furniture
/entertainment-centers-and-tv-stands  tv stand
Answer 0 (score: 0)
Here is one way using groupby.
import pandas as pd

df2 = pd.DataFrame({'url':['/pool','/refrigerators','/refrigerators','/refrigerators','/joss-and-main','/furniture','/entertainment-centers-and-tv-stands'],
'word':['pool','refrigerator','fridge','cooler','joss and main','furniture','tv stand']})
# groupby to list
s = df2.groupby('url')['word'].apply(list)
lst = s.values.tolist()
# pad inner lists so they have same length
maxlen = max(map(len, lst))
lst = [i+['']*(maxlen-len(i)) for i in lst]
# build split dataframe
res = pd.DataFrame(lst, columns=['word'+str(i) for i in range(1,maxlen+1)], index=s.index)\
.reset_index()
#    url                                   word1          word2   word3
# 0  /entertainment-centers-and-tv-stands  tv stand
# 1  /furniture                            furniture
# 2  /joss-and-main                        joss and main
# 3  /pool                                 pool
# 4  /refrigerators                        refrigerator   fridge  cooler
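
Note that the result above is sorted by url and names the columns word1, word2, word3, whereas the desired output keeps the original row order and uses word, word1, word2. The following is a minimal alternative sketch, not the answerer's method, that numbers each word within its url group with cumcount and then pivots with unstack. The names pos and wide are made up for illustration.

```python
import pandas as pd

df2 = pd.DataFrame({
    'url': ['/pool', '/refrigerators', '/refrigerators', '/refrigerators',
            '/joss-and-main', '/furniture', '/entertainment-centers-and-tv-stands'],
    'word': ['pool', 'refrigerator', 'fridge', 'cooler',
             'joss and main', 'furniture', 'tv stand'],
})

# Position of each word within its url group: 0, 1, 2, ...
pos = df2.groupby('url').cumcount()

# Pivot the positions into columns; slots with no word become NaN, then ''
wide = df2.set_index(['url', pos])['word'].unstack().fillna('')

# Name the columns word, word1, word2, ... as in the desired output
wide.columns = ['word' if i == 0 else f'word{i}' for i in wide.columns]

# unstack sorts the urls, so restore their order of first appearance
wide = wide.reindex(df2['url'].unique()).rename_axis('url').reset_index()

print(wide)
```

This avoids padding the lists by hand, since unstack fills the missing positions itself.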