How do I convert a groupby multi-index into new columns in Pandas?

Time: 2017-05-11 02:29:00

Tags: python pandas dataframe

I have a DataFrame like the following:

>>> import pandas as pd
>>> import numpy as np
>>> df = pd.DataFrame()
>>> df["user_id"] = [1,1,1,2,2,3,4,4,4,4]
>>> df["cate"] = ["a","b","c","b","c","a","a","b","c","d"]
>>> df["prob"] = [np.random.rand() for _ in range(len(df["user_id"]))]


I want to turn the prob of each cate into a new column for each user (user_id), so that every user is one row and every category is its own column.

The only solution I have found is a for loop, which is very slow when I have tens of thousands of users!

So, does Pandas have a built-in way to solve this? Thank you very much!

1 answer:

Answer 0 (score: 2)

Pivot works very well here:

df.pivot(index='user_id', columns='cate', values='prob').reset_index().fillna(0)

You get:

cate    user_id a           b           c           d
0       1       0.853583    0.161935    0.388652    0.000000
1       2       0.000000    0.554185    0.177939    0.000000
2       3       0.700654    0.000000    0.000000    0.000000
3       4       0.781307    0.634584    0.861808    0.130701

Another way is to use set_index, which gives the same result.
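
The set_index route can be sketched as follows. This is only one plausible way to write it, not necessarily the exact code the answer had in mind: it assumes set_index is combined with unstack, and it uses a seeded random generator (an assumption added here for reproducibility) instead of the unseeded np.random.rand() from the question. The names via_pivot and via_unstack are illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the question's DataFrame; the seed is an assumption for reproducibility.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3, 4, 4, 4, 4],
    "cate": ["a", "b", "c", "b", "c", "a", "a", "b", "c", "d"],
})
df["prob"] = rng.random(len(df))

# Approach 1: pivot with keyword arguments.
via_pivot = (
    df.pivot(index="user_id", columns="cate", values="prob")
      .fillna(0)
      .reset_index()
)

# Approach 2 (assumed form): move both keys into the index,
# then unstack the inner level "cate" into columns.
via_unstack = (
    df.set_index(["user_id", "cate"])["prob"]
      .unstack(fill_value=0.0)
      .reset_index()
)

print(via_unstack)
```

Both approaches require each (user_id, cate) pair to appear at most once; with duplicate pairs they raise an error, and an aggregating method such as pivot_table would be needed instead.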