I have this DataFrame:
Playlist    Track Name   Spotify Uri             Playlist Uri
microhouse  make a move  5nUS4bSN0cFZB0knxyM4LZ  1d4gyZxan7lK9KqYU2EJ
microhouse  mango        2f8eSlsreAHHzJ5SPkpYLf  1d4gyZxan7lK9KqYU2EJ
attlas      ryat         3McvalY1RDYczyDmixyAwQ  2CInjKguWauO29QB21Co
attlas      further      4qEUN1lON8UjnUiOZc39ID  2CInjKguWauO29QB21Co
I want it to look like this:
Playlist      microhouse                     attlas
Playlist Uri  1d4gyZxan7lK9KqY               2CInjKguWauO29Q
              Track Name   Spotify Uri       Track Name  Spotify Uri
              make a move  5nUS4bSN0cFZB0kn  ryat        3valY1RDYc
              mango        2f8eSlsreAHHzJ5S  further     4qEUN1lON
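I have used pivot to produce one column per playlist holding all of that playlist's track names, but I can't work out how to do it with a MultiIndex (Playlist and Playlist Uri), no aggregation, and two value columns (Track Name and Spotify Uri). stack didn't really do what I want. Any help is appreciated.
Answer 0 (score: 2)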
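You can use cumcount to build a per-playlist counter for the new index, pass it to set_index together with Playlist and Playlist Uri to create a 3-level MultiIndex in the index, and then unstack the two playlist levels to get a 3-level MultiIndex in the columns. If necessary, finish by sorting the second level with sort_index, changing the level order with reorder_levels, and changing the order within the last level with reindex: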
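# running count of each track within its playlist (0, 1, ...)
g = df.groupby(['Playlist', 'Playlist Uri']).cumcount()
# index levels after set_index: 0 = counter, 1 = Playlist, 2 = Playlist Uri
df = (df.set_index([g, 'Playlist', 'Playlist Uri'])
        .unstack([1, 2])
        .sort_index(axis=1, level=1)
        .reorder_levels([1, 2, 0], axis=1)
        .reindex(['Track Name', 'Spotify Uri'], axis=1, level=2))
print(df)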
Playlist                    attlas                          \
Playlist Uri  2CInjKguWauO29QB21Co
                        Track Name             Spotify Uri
0                             ryat  3McvalY1RDYczyDmixyAwQ
1                          further  4qEUN1lON8UjnUiOZc39ID

Playlist                microhouse
Playlist Uri  1d4gyZxan7lK9KqYU2EJ
                        Track Name             Spotify Uri
0                      make a move  5nUS4bSN0cFZB0knxyM4LZ
1                            mango  2f8eSlsreAHHzJ5SPkpYLf
print(df.columns)
MultiIndex(levels=[['attlas', 'microhouse'],
                   ['1d4gyZxan7lK9KqYU2EJ', '2CInjKguWauO29QB21Co'],
                   ['Track Name', 'Spotify Uri']],
           labels=[[0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 1]],
           names=['Playlist', 'Playlist Uri', None])
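
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and applies the same chain of calls. The names pos, wide and the helper tracks_side_by_side are made up for this illustration and are not part of pandas. The sketch refers to index levels by name where it can, which is only a readability choice and not something the approach requires.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Playlist": ["microhouse", "microhouse", "attlas", "attlas"],
    "Track Name": ["make a move", "mango", "ryat", "further"],
    "Spotify Uri": ["5nUS4bSN0cFZB0knxyM4LZ", "2f8eSlsreAHHzJ5SPkpYLf",
                    "3McvalY1RDYczyDmixyAwQ", "4qEUN1lON8UjnUiOZc39ID"],
    "Playlist Uri": ["1d4gyZxan7lK9KqYU2EJ", "1d4gyZxan7lK9KqYU2EJ",
                     "2CInjKguWauO29QB21Co", "2CInjKguWauO29QB21Co"],
})

# Position of each track within its playlist: 0, 1, 0, 1 for this data.
# It becomes the row index, so no aggregation is needed.
pos = df.groupby(["Playlist", "Playlist Uri"]).cumcount()


def tracks_side_by_side(frame, position):
    """Hypothetical helper that wraps the chain of calls described above."""
    return (
        frame.set_index([position, "Playlist", "Playlist Uri"])
        # move both playlist levels from the rows into the columns
        .unstack(["Playlist", "Playlist Uri"])
        # group the columns of each playlist together
        .sort_index(axis=1, level="Playlist")
        # column levels become: Playlist, Playlist Uri, field name
        .reorder_levels([1, 2, 0], axis=1)
        # put Track Name before Spotify Uri inside each playlist
        .reindex(["Track Name", "Spotify Uri"], axis=1, level=2)
    )


wide = tracks_side_by_side(df, pos)
print(wide)
```

Selecting wide[("attlas", "2CInjKguWauO29QB21Co")] then returns just the two columns for that playlist. Recent pandas versions print df.columns in a different form from the labels= repr shown above, so the exact text of that output may differ on your machine.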