Joining DataFrames that have multiple indexes

Asked: 2018-08-01 18:18:23

Tags: python pandas pandas-join

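I'm working with a standard DataFrame and creating various subset DataFrames of summary data from it. Each subset will have a two-level index, and the first level is the same across all of them. I've been asked to pull all of the summary data together (they want a single JSON containing all of it). I thought combining the DataFrames would be the simplest solution, but I'm running into trouble.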

Sample of the standard DataFrame, df:

   ID   DEPT   STATUS    TYPE
0  100  5001   Active      E
1  101  5001   Active      M
2  101  5001   Active      M
3  102  5005   Expired     E
4  107  5001   Inactive    M
5  110  5002   Inactive    E
6  110  5002   Inactive    E

Then I create the summary data and rename the column:

status_df = pd.DataFrame(df.groupby(['DEPT','STATUS'])['ID'].nunique())

status_df.columns = ['Count_Status']

               Count_Status
DEPT STATUS
5001 Active    2
     Inactive  1 
5002 Inactive  1
5005 Expired   1

And then the same for another column:

type_df = pd.DataFrame(df.groupby(['DEPT','TYPE'])['ID'].nunique())

type_df.columns = ['Count_Type']

              Count_Type
DEPT TYPE
5001 E        1
     M        2 
5002 E        1
5005 E        1

What I want to create:

                     Count_Status   Count_Type
DEPT STATUS    TYPE
5001 Active          2              NaN
     Inactive        1              NaN
               E     NaN            1
               M     NaN            2
5002 Inactive        1              NaN
               E     NaN            1
5005 Expired         1              NaN
               E     NaN            1

1 Answer:

Answer 0: (score: 0)

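You can try using pd.concat together with set_index. Move the second index level of each summary into a column so that both frames are indexed by DEPT only, concatenate them, then append STATUS and TYPE back onto the index: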

d1 = (df.groupby(['DEPT','STATUS'])['ID'].nunique()
        .rename('Count Status')
        .reset_index(level=1))

d2 = (df.groupby(['DEPT','TYPE'])['ID'].nunique()
        .rename('Count Type')
        .reset_index(level=1))

df_out = (pd.concat([d1, d2], sort=False)
            .set_index(['STATUS','TYPE'], append=True)
            .sort_index())
df_out

Output:

                    Count Status  Count Type
DEPT STATUS   TYPE                          
5001 Active   NaN            2.0         NaN
     Inactive NaN            1.0         NaN
     NaN      E              NaN         1.0
              M              NaN         2.0
5002 Inactive NaN            1.0         NaN
     NaN      E              NaN         1.0
5005 Expired  NaN            1.0         NaN
     NaN      E              NaN         1.0
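
For reference, here is a minimal, self-contained sketch of the same approach that should run as-is. The sample frame is rebuilt from the table in the question. The `summarize` helper is a hypothetical name introduced only to avoid repeating the groupby, and the column names here use the underscores from the question rather than the spaces used above.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "ID": [100, 101, 101, 102, 107, 110, 110],
    "DEPT": [5001, 5001, 5001, 5005, 5001, 5002, 5002],
    "STATUS": ["Active", "Active", "Active", "Expired",
               "Inactive", "Inactive", "Inactive"],
    "TYPE": ["E", "M", "M", "E", "M", "E", "E"],
})


def summarize(frame, by, name):
    """Count unique IDs per (DEPT, by) and leave only DEPT in the index.

    Hypothetical helper, not part of pandas.
    """
    return (frame.groupby(["DEPT", by])["ID"].nunique()
                 .rename(name)            # name the resulting Series
                 .reset_index(level=1))   # move `by` from the index to a column


d1 = summarize(df, "STATUS", "Count_Status")
d2 = summarize(df, "TYPE", "Count_Type")

# Stack the two summaries; columns missing from one frame are filled with NaN.
df_out = (pd.concat([d1, d2], sort=False)
            .set_index(["STATUS", "TYPE"], append=True)
            .sort_index())
print(df_out)
```

The counts come out as floats (2.0 rather than 2) because the NaN padding that concat introduces forces both count columns to a float dtype.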