Starting from data like this:
import pandas as pd

d = {'Media': {0: 'M1', 1: 'M2', 2: 'M7', 3: 'M1', 4: 'M2', 5: 'M1'},
     'pi': {0: 'p84', 1: 'p84', 2: 'p84', 3: 'p84', 4: 'p73', 5: 'p73'},
     'qj': {0: 'q82', 1: 'q4', 2: 'q5', 3: 'q2', 4: 'q23', 5: 'q9'},
     'Budget': {0: 39, 1: 15, 2: 13, 3: 53, 4: 82, 5: 70}}
dd = pd.DataFrame.from_dict(d)
dd
#    Budget Media   pi   qj
# 0      39    M1  p84  q82
# 1      15    M2  p84   q4
# 2      13    M7  p84   q5
# 3      53    M1  p84   q2
# 4      82    M2  p73  q23
# 5      70    M1  p73   q9
I need to create a new DataFrame, res, with one extra column per Media value that holds that row's Budget (and 0 elsewhere), like this:
res
#   Media   pi   qj  Budget  MediaM1  MediaM2  MediaM7
# 0    M1  p84  q82      39       39        0        0
# 1    M2  p84   q4      15        0       15        0
# 2    M7  p84   q5      13        0        0       13
# 3    M1  p84   q2      53       53        0        0
# 4    M2  p73  q23      82        0       82        0
# 5    M1  p73   q9      70       70        0        0
Answer 0 (score: 3)
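Here is one way to do it.
First, build a crosstab of dd['Budget'] against dd['Media'], using dd['Budget'] as the values and sum as the aggregation: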
In [21]: cross = pd.crosstab(dd['Budget'], dd['Media'], values=dd['Budget'], aggfunc=sum)
In [22]: cross
Out[22]:
Media    M1   M2   M7
Budget
13      NaN  NaN   13
15      NaN   15  NaN
39       39  NaN  NaN
53       53  NaN  NaN
70       70  NaN  NaN
82      NaN   82  NaN
Then merge it back into dd (the merge joins on the one shared column, Budget) and fill the NaNs with 0:
In [23]: dd.merge(cross.reset_index()).fillna(0)
Out[23]:
   Budget Media   pi   qj  M1  M2  M7
0      39    M1  p84  q82  39   0   0
1      15    M2  p84   q4   0  15   0
2      13    M7  p84   q5   0   0  13
3      53    M1  p84   q2  53   0   0
4      82    M2  p73  q23   0  82   0
5      70    M1  p73   q9  70   0   0
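One caveat with the crosstab-and-merge route: it is keyed on Budget, so it relies on every Budget value being unique, as it is in the sample data. If two rows shared a budget, the summed crosstab row would be merged back onto both of them. As a possible alternative (a different technique from the one above, not the answer's method), here is a minimal sketch that builds indicator columns with pd.get_dummies and scales them by Budget row by row. The prefix and dtype choices are assumptions made to reproduce the MediaM1 / MediaM2 / MediaM7 names from the question.

```python
import pandas as pd

d = {
    'Media': {0: 'M1', 1: 'M2', 2: 'M7', 3: 'M1', 4: 'M2', 5: 'M1'},
    'pi': {0: 'p84', 1: 'p84', 2: 'p84', 3: 'p84', 4: 'p73', 5: 'p73'},
    'qj': {0: 'q82', 1: 'q4', 2: 'q5', 3: 'q2', 4: 'q23', 5: 'q9'},
    'Budget': {0: 39, 1: 15, 2: 13, 3: 53, 4: 82, 5: 70},
}
dd = pd.DataFrame.from_dict(d)

# One 0/1 indicator column per Media value. prefix='Media' with
# prefix_sep='' yields the column names MediaM1, MediaM2, MediaM7.
dummies = pd.get_dummies(dd['Media'], prefix='Media', prefix_sep='', dtype=int)

# Multiply each indicator row by that row's Budget (aligned on the index),
# then attach the result to the original frame, again by index.
res = dd.join(dummies.mul(dd['Budget'], axis=0))
print(res)
```

Because everything is aligned on the row index rather than on Budget, repeated budget values do not interfere with each other, and no fillna step is needed since the indicator columns start out as zeros.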