Pivot table based on groupby in Pandas

Time: 2018-07-26 15:01:30

Tags: python-3.x pandas dataframe group-by pivot-table

I have a dataframe like this:

customer_id | date | category
1 | 2017-2-1 | toys
2 | 2017-2-1 | food
1 | 2017-2-1 | drinks
3 | 2017-2-2 | computer
2 | 2017-2-1 | toys
1 | 2017-3-1 | food

I want to turn the values of the category column into new columns and one-hot encode them. I know I can use df.pivot_table(index = ['customer_id'], columns = ['category']), but I also want to group by date, so that each row only contains information from a single date. For example, in the desired output id 1 has two rows, because it has two unique dates in the date column.

2 answers:

Answer 0: (score: 2)

You are probably looking for crosstab:

pd.crosstab([df.customer_id,df.date],df.category).reset_index(level=1,drop=True)
Out[102]: 
category     computer  drinks  food  toys
customer_id                              
1                   0       1     0     1
1                   0       0     1     0
2                   0       0     1     1
3                   1       0     0     0
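
As a self-contained sketch of the same idea, the snippet below rebuilds the sample frame from the question and runs crosstab on it. Two things here are assumptions that go beyond the answer above: the counts are clipped to 1, so that a customer buying the same category twice on one day would still be encoded as 1, and date is kept as a regular column rather than dropped, so the two rows for customer 1 can be told apart.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "customer_id": [1, 2, 1, 3, 2, 1],
    "date": ["2017-2-1", "2017-2-1", "2017-2-1",
             "2017-2-2", "2017-2-1", "2017-3-1"],
    "category": ["toys", "food", "drinks", "computer", "toys", "food"],
})

# Count occurrences of each category per (customer_id, date) pair.
counts = pd.crosstab([df["customer_id"], df["date"]], df["category"])

# Assumption: a true one-hot encoding is wanted, so cap counts at 1.
one_hot = counts.clip(upper=1)

# Keep customer_id and date as ordinary columns instead of dropping date.
result = one_hot.reset_index()
result.columns.name = None  # remove the leftover "category" axis label

print(result)
```

If the date really is not needed in the output, reset_index(level=1, drop=True) as in the answer gives the same rows with only customer_id in the index.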

Answer 1: (score: 0)

Assuming your frame is called df, you can add an indicator column and then use .pivot_table directly:

df['Indicator'] = 1

pvt = df.pivot_table(index=['date', 'customer_id'],
                     columns='category',
                     values='Indicator')\
        .fillna(0)

This gives a dataframe like the following:

category              computer  drinks  food  toys
date     customer_id                              
2017-2-1 1                 0.0     1.0   0.0   1.0
         2                 0.0     0.0   1.0   1.0
2017-2-2 3                 1.0     0.0   0.0   0.0
2017-3-1 1                 0.0     0.0   1.0   0.0
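
The values above are floats because pivot_table aggregates with the mean by default and the missing combinations start out as NaN. A possible variant, sketched here on the question's sample data, passes aggfunc="max" and fill_value=0 and then casts to int to get plain 0/1 flags. The choice of max and the final cast are assumptions, not part of the answer.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "customer_id": [1, 2, 1, 3, 2, 1],
    "date": ["2017-2-1", "2017-2-1", "2017-2-1",
             "2017-2-2", "2017-2-1", "2017-3-1"],
    "category": ["toys", "food", "drinks", "computer", "toys", "food"],
})

# Indicator column: every existing row marks one category as present.
df["Indicator"] = 1

pvt = (
    df.pivot_table(index=["date", "customer_id"],
                   columns="category",
                   values="Indicator",
                   aggfunc="max",   # assumption: presence flag, not a mean
                   fill_value=0)    # absent combinations become 0
      .astype(int)                  # make sure the flags are integers
)

print(pvt)
```

Using max rather than the default mean makes the intent explicit: a cell is 1 if the customer bought that category on that date at least once.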