Question

I have a DataFrame and want to group it by url, creating new columns that hold the words from rows that share the same url.

df2 = pd.DataFrame({'url':['/pool','/refrigerators','/refrigerators','/refrigerators','/joss-and-main','/furniture','/entertainment-centers-and-tv-stands'], 
'word':['pool','refrigerator','fridge','cooler','joss and main','furniture','tv stand']})

Desired output:

Url                                     word           word1   word2
/pool                                   pool
/refrigerators                          refrigerator   fridge  cooler
/joss-and-main                          joss and main
/furniture                              furniture
/entertainment-centers-and-tv-stands    tv stand

Answer 1

Here is one way to do it using groupby.

import pandas as pd

df2 = pd.DataFrame({'url':['/pool','/refrigerators','/refrigerators','/refrigerators','/joss-and-main','/furniture','/entertainment-centers-and-tv-stands'], 
                    'word':['pool','refrigerator','fridge','cooler','joss and main','furniture','tv stand']})

# groupby to list
s = df2.groupby('url')['word'].apply(list)
lst = s.values.tolist()

# pad inner lists so they have same length
maxlen = max(map(len, lst))
lst = [i+['']*(maxlen-len(i)) for i in lst]

# build split dataframe
res = pd.DataFrame(lst, columns=['word'+str(i) for i in range(1,maxlen+1)], index=s.index)\
        .reset_index()

#                                     url          word1   word2   word3
# 0  /entertainment-centers-and-tv-stands       tv stand                
# 1                            /furniture      furniture                
# 2                        /joss-and-main  joss and main                
# 3                                 /pool           pool                
# 4                        /refrigerators   refrigerator  fridge  cooler
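
As a possible alternative to padding the lists by hand, the same reshaping can be sketched with groupby().cumcount() and pivot(). This is only a sketch, not part of the approach above: the helper column name pos is made up for illustration, the word1..wordN naming follows the answer's output rather than the question's, and the reindex step is an added assumption that the original url order should be kept.

```python
import pandas as pd

df2 = pd.DataFrame({
    'url': ['/pool', '/refrigerators', '/refrigerators', '/refrigerators',
            '/joss-and-main', '/furniture', '/entertainment-centers-and-tv-stands'],
    'word': ['pool', 'refrigerator', 'fridge', 'cooler',
             'joss and main', 'furniture', 'tv stand'],
})

# Number each word within its url group: 0, 1, 2, ...
# 'pos' is a hypothetical helper column used only for the pivot.
pos = df2.groupby('url').cumcount()

res = (
    df2.assign(pos=pos)
       .pivot(index='url', columns='pos', values='word')  # one column per position
       .reindex(df2['url'].unique())                      # restore first-seen url order
       .fillna('')                                        # blank out missing cells
       .rename_axis('url')                                # make sure the index is named 'url'
)

# Rename the positional columns 0, 1, 2 to word1, word2, word3
res.columns = ['word' + str(i + 1) for i in res.columns]
res = res.reset_index()

print(res)
```

Unlike the list-padding version, this keeps the urls in the order they first appear in df2 instead of sorting them.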
