My data looks like this:
2 PresentationID 12954
5 Attendees 65
6 Downloads 0
7 Questions 0
8 Likes 11
9 Tweets 0
10 Polls 0
73 PresentationID 12953
76 Attendees 64
77 Downloads 31
78 Questions 0
79 Likes 11
80 Tweets 0
81 Polls 0
143 PresentationID 12951
146 Attendees 64
147 Downloads 28
148 Questions 2
149 Likes 2
150 Tweets 0
151 Polls 0
I need to get it into this format:
PresentationID Attendees Downloads Questions Likes Tweets Polls
0 12954 65 0 0 11 0 0
1 12953 64 31 6 0 4
2 12892 204 0 0 14 0 0
I have tried several combinations of groupby, pivot and stack, but none of them worked. Any suggestions are much appreciated. Thanks.
Answer 0 (score: 5)
print (df)
A B C
0 2 PresentationID 12954
1 5 Attendees 65
2 6 Downloads 0
3 7 Questions 0
4 8 Likes 11
5 9 Tweets 0
6 10 Polls 0
7 73 PresentationID 12953
8 76 Attendees 64
9 77 Downloads 31
10 78 Questions 0
11 79 Likes 11
12 80 Tweets 0
13 81 Polls 0
14 143 PresentationID 12951
15 146 Attendees 64
16 147 Downloads 28
17 148 Questions 2
18 149 Likes 2
19 150 Tweets 0
20 151 Polls 0
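# Number each occurrence of a label in B: 0 for the first record, 1 for the second, ...
df['G'] = df.groupby('B').cumcount()
# Use that counter as the row index and the labels in B as the columns
df = df.pivot(index='G', columns='B', values='C')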
print (df)
B Attendees Downloads Likes Polls PresentationID Questions Tweets
G
0 65 0 11 0 12954 0 0
1 64 31 11 0 12953 0 0
2 64 28 2 0 12951 2 0
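One caveat with the cumcount approach above: it assumes every record contains every field. If a record can be missing a row, the counters drift apart and values land in the wrong output row. A minimal sketch of a workaround, assuming each record always starts with a PresentationID row (the names `record` and `wide`, and the gap in the sample data, are made up for illustration):

```python
import pandas as pd

# Sample data in the same long layout as the question. The second record
# deliberately has no "Downloads" row (a hypothetical gap for illustration).
df = pd.DataFrame({
    "B": ["PresentationID", "Attendees", "Downloads",
          "PresentationID", "Attendees"],
    "C": [12954, 65, 0, 12953, 64],
})

# Each "PresentationID" row opens a new record, so a running count of those
# rows gives a record number that does not rely on all fields being present.
record = (df["B"] == "PresentationID").cumsum() - 1

# Pivot on the record number; fields absent from a record become NaN.
wide = df.assign(G=record).pivot(index="G", columns="B", values="C")
print(wide)
```

Missing fields show up as NaN in the result, so the affected columns become floats; `fillna(0)` followed by `astype(int)` is one option if zeros are an acceptable default.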
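# Alternative, starting again from the original df (before the pivot above).
# Older pandas releases accepted pd.pivot(index=..., columns=..., values=...) with
# Series arguments; current releases require a DataFrame as the first argument,
# so an equivalent one-liner is:
df = df.set_index([df.groupby('B').cumcount(), 'B'])['C'].unstack()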
print (df)
B Attendees Downloads Likes Polls PresentationID Questions Tweets
0 65 0 11 0 12954 0 0
1 64 31 11 0 12953 0 0
2 64 28 2 0 12951 2 0
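Note that the pivoted columns come out in alphabetical order, whereas the question asks for PresentationID first. Below is a self-contained sketch that rebuilds the sample data and restores the requested column order with `reindex`. The names `fields`, `values`, `long_df` and `wide` are only illustrative, and the two-column layout is an assumption about how the data was loaded:

```python
import pandas as pd

# Field order as requested in the question.
fields = ["PresentationID", "Attendees", "Downloads", "Questions",
          "Likes", "Tweets", "Polls"]
values = [
    [12954, 65, 0, 0, 11, 0, 0],
    [12953, 64, 31, 0, 11, 0, 0],
    [12951, 64, 28, 2, 2, 0, 0],
]

# Rebuild the long (label, value) layout from the question's sample.
long_df = pd.DataFrame(
    [(name, val) for row in values for name, val in zip(fields, row)],
    columns=["B", "C"],
)

wide = (
    long_df.assign(G=long_df.groupby("B").cumcount())  # record number per label
           .pivot(index="G", columns="B", values="C")  # long -> wide
           .reindex(columns=fields)                    # requested column order
           .rename_axis(index=None, columns=None)      # drop the "G"/"B" axis names
)
print(wide)
```

The `rename_axis` call is purely cosmetic: it removes the leftover `G` and `B` labels so the printed frame matches the plain 0, 1, 2 index shown in the question.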