Set modalities as new columns

Time: 2015-05-22 05:06:46

Tags: python pandas

Starting from data like this:

import pandas as pd

d = {'Media': {0: 'M1', 1: 'M2', 2: 'M7', 3: 'M1', 4: 'M2', 5: 'M1'},
     'pi': {0: 'p84', 1: 'p84', 2: 'p84', 3: 'p84', 4: 'p73', 5: 'p73'},
     'qj': {0: 'q82', 1: 'q4', 2: 'q5', 3: 'q2', 4: 'q23', 5: 'q9'},
     'Budget': {0: 39, 1: 15, 2: 13, 3: 53, 4: 82, 5: 70}}
dd = pd.DataFrame.from_dict(d)
dd
#    Budget Media   pi   qj
# 0      39    M1  p84  q82
# 1      15    M2  p84   q4
# 2      13    M7  p84   q5
# 3      53    M1  p84   q2
# 4      82    M2  p73  q23
# 5      70    M1  p73   q9

I need to create a new DataFrame res in which each modality of Media gets its own column holding the Budget, for example:

res
#    Media   pi   qj Budget MediaM1 MediaM2 MediaM7
# 0     M1  p84  q82     39      39       0       0
# 1     M2  p84   q4     15       0      15       0
# 2     M7  p84   q5     13       0       0      13
# 3     M1  p84   q2     53      53       0       0
# 4     M2  p73  q23     82       0      82       0
# 5     M1  p73   q9     70      70       0       0

1 answer:

Answer 0 (score: 3)

Here is one way to do it.

First, build a cross-tabulation of dd['Budget'] against dd['Media']:
In [21]: cross = pd.crosstab(dd['Budget'], dd['Media'], values=dd['Budget'], aggfunc=sum)

In [22]: cross
Out[22]:
Media   M1  M2  M7
Budget
13     NaN NaN  13
15     NaN  15 NaN
39      39 NaN NaN
53      53 NaN NaN
70      70 NaN NaN
82     NaN  82 NaN

Then merge it with dd and fill the NaNs with 0:

In [23]: dd.merge(cross.reset_index()).fillna(0)
Out[23]:
   Budget Media   pi   qj  M1  M2  M7
0      39    M1  p84  q82  39   0   0
1      15    M2  p84   q4   0  15   0
2      13    M7  p84   q5   0   0  13
3      53    M1  p84   q2  53   0   0
4      82    M2  p73  q23   0  82   0
5      70    M1  p73   q9  70   0   0
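
An alternative sketch, not part of the answer above. The crosstab is indexed by Budget and the merge joins on Budget alone, so two rows that share a budget value but differ in Media would each pick up both media columns. Assuming the goal is exactly the res layout from the question, one-hot encoding Media with pd.get_dummies and scaling each row by its Budget avoids the join. The prefix 'Media' with an empty separator is chosen here only to reproduce the MediaM1, MediaM2, MediaM7 column names.

```python
import pandas as pd

d = {'Media': {0: 'M1', 1: 'M2', 2: 'M7', 3: 'M1', 4: 'M2', 5: 'M1'},
     'pi': {0: 'p84', 1: 'p84', 2: 'p84', 3: 'p84', 4: 'p73', 5: 'p73'},
     'qj': {0: 'q82', 1: 'q4', 2: 'q5', 3: 'q2', 4: 'q23', 5: 'q9'},
     'Budget': {0: 39, 1: 15, 2: 13, 3: 53, 4: 82, 5: 70}}
dd = pd.DataFrame(d)

# One indicator column per Media value; cast to int because newer
# pandas versions return booleans from get_dummies by default.
dummies = pd.get_dummies(dd['Media'], prefix='Media', prefix_sep='').astype(int)

# Multiply each row of indicators by that row's Budget (aligned on the index),
# then attach the new columns to the original frame.
res = dd.join(dummies.mul(dd['Budget'], axis=0))
print(res)
```

Because each row is handled on its own index label, repeated Budget values do not interact with one another in this version.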