I have the following DataFrame:
import pandas as pd
df = pd.DataFrame([[1,1,1,1,1,1,1,1,2,2,2,2,3,3,3,3,3,3,3],['A','B','B','B','C','D','D','E','A','C','C','C','A','B','B','B','B','D','E'], [18,25,47,27,31,55,13,19,73,55,58,14,2,46,33,35,24,60,7]]).T
df.columns = ['Brand_ID','Category','Price']
    Brand_ID  Category  Price
0          1         A     18
1          1         B     25
2          1         B     47
3          1         B     27
4          1         C     31
5          1         D     55
6          1         D     13
7          1         E     19
8          2         A     73
9          2         C     55
10         2         C     58
11         2         C     14
12         3         A      2
13         3         B     46
14         3         B     33
15         3         B     35
16         3         B     24
17         3         D     60
18         3         E      7
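What I need to do is group by Brand_ID and Category and count the rows in each group (similar to the first part of this question). However, I need the counts written to a separate column per category, so my output should look like this: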
   Brand_ID  Category_A  Category_B  Category_C  Category_D  Category_E
0         1           1           3           1           2           1
1         2           1           0           3           0           0
2         3           1           4           0           1           1
Is it possible to do this directly in pandas?
Answer 0 (score: 3)
Try:
df.groupby(['Brand_ID','Category'])['Price'].count()\
  .unstack(fill_value=0)\
  .add_prefix('Category_')\
  .reset_index()\
  .rename_axis([None], axis=1)
Output:
   Brand_ID  Category_A  Category_B  Category_C  Category_D  Category_E
0         1           1           3           1           2           1
1         2           1           0           3           0           0
2         3           1           4           0           1           1
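As a possible alternative to the groupby/unstack chain, the same wide count table can probably also be built with pd.crosstab. This is only a minimal sketch, not taken from the original answer: it rebuilds the sample data from a dict (an assumption, so the dtypes differ slightly from the transposed frame in the question), and the name counts is just illustrative.

```python
import pandas as pd

# Sample data from the question, rebuilt from a dict (assumption: same values,
# but integer dtypes instead of the object dtype produced by .T).
df = pd.DataFrame({
    'Brand_ID': [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3],
    'Category': ['A', 'B', 'B', 'B', 'C', 'D', 'D', 'E', 'A', 'C', 'C', 'C',
                 'A', 'B', 'B', 'B', 'B', 'D', 'E'],
    'Price': [18, 25, 47, 27, 31, 55, 13, 19, 73, 55, 58, 14,
              2, 46, 33, 35, 24, 60, 7],
})

counts = (
    pd.crosstab(df['Brand_ID'], df['Category'])  # rows: brands, columns: categories, values: counts
      .add_prefix('Category_')                   # A -> Category_A, and so on
      .reset_index()                             # turn Brand_ID back into a regular column
      .rename_axis(None, axis=1)                 # drop the leftover 'Category' label on the column axis
)
print(counts)
```

crosstab fills missing brand/category combinations with 0 by itself, so no fill_value is needed here.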
Answer 1 (score: 3)
What you are describing is a pivot_table:
df.pivot_table(index='Brand_ID', columns='Category', aggfunc='size', fill_value=0)
Output:
Category  A  B  C  D  E
Brand_ID
1         1  3  1  2  1
2         1  0  3  0  0
3         1  4  0  1  1
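The pivot_table result keeps Brand_ID as the index and Category as the name of the column axis, so it does not exactly match the layout asked for in the question. One way to get there, sketched under the assumption that the sample data is rebuilt from a dict (the names wide and flat are illustrative), might be:

```python
import pandas as pd

# Sample data from the question, rebuilt from a dict (assumption: same values).
df = pd.DataFrame({
    'Brand_ID': [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3],
    'Category': ['A', 'B', 'B', 'B', 'C', 'D', 'D', 'E', 'A', 'C', 'C', 'C',
                 'A', 'B', 'B', 'B', 'B', 'D', 'E'],
    'Price': [18, 25, 47, 27, 31, 55, 13, 19, 73, 55, 58, 14,
              2, 46, 33, 35, 24, 60, 7],
})

# Count rows per (Brand_ID, Category); aggfunc='size' needs no values column.
wide = df.pivot_table(index='Brand_ID', columns='Category',
                      aggfunc='size', fill_value=0)

# Flatten to the requested layout: prefixed column names, Brand_ID as a column,
# and no name on the column axis.
flat = wide.add_prefix('Category_').reset_index().rename_axis(None, axis=1)
print(flat)
```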