Pandas: merging columns of a pivot table

Date: 2017-10-17 19:51:57

Tags: python pandas

I generated a pandas pivot table that looks like this:

                                                                 Encounters
                                                     Code   132   133  145
  Record Number  Start_date  End_date  Service_Date
    2322            1/1/2017  1/3/2017  1/1/2017             0     1    1
                                        1/2/2017             1     0    0
                                        1/3/2017             0     1    1

I want to merge and sum some of the pivot table columns based on their codes.

Desired output:

                                                              Encounters
                                                     Code   132   133-145 
  Record Number  Start_date  End_date  Service_Date
    2322            1/1/2017  1/3/2017  1/1/2017             0      2    
                                        1/2/2017             1      0    
                                        1/3/2017             0      2    

1 answer:

Answer 0 (score: 1)

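A pivot table creates hierarchical columns (i.e. a column index with multiple levels). So consider assigning the new sum column with a tuple key that has one entry per level, then deleting the original columns: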

df[('Encounters', '133-145')] = df[('Encounters', '133')] + df[('Encounters', '145')] 

del df[('Encounters', '133')] 
del df[('Encounters', '145')] 

# sortlevel() was deprecated in pandas 0.20 and later removed; sort_index() replaces it
df = df.sort_index(axis=1)
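
As a minimal sketch of the same idea on data shaped like the question's table: the long-format `raw` frame below is hypothetical (the original data was not posted), and the codes are assumed to be strings. If they are integers in your data, use `133` and `145` in the tuples instead.

```python
import pandas as pd

# Hypothetical long-format data mirroring the question's table (assumption).
raw = pd.DataFrame({
    'Record Number': [2322] * 5,
    'Service_Date': ['1/1/2017', '1/1/2017', '1/2/2017', '1/3/2017', '1/3/2017'],
    'Code': ['133', '145', '132', '133', '145'],
    'Encounters': [1, 1, 1, 1, 1],
})

# Passing values as a list yields two-level columns: ('Encounters', <code>).
pvt = raw.pivot_table(index=['Record Number', 'Service_Date'],
                      columns='Code', values=['Encounters'],
                      aggfunc='sum', fill_value=0)

# Add the combined column with a tuple key, then drop the two source columns.
pvt[('Encounters', '133-145')] = pvt[('Encounters', '133')] + pvt[('Encounters', '145')]
pvt = pvt.drop([('Encounters', '133'), ('Encounters', '145')], axis=1)

# Re-sort the column index so the new column sits in label order.
pvt = pvt.sort_index(axis=1)
print(pvt)
```

The tuple must name every column level, which is why a plain `'133'` key would not address the column here.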

To demonstrate with random data:

Data (seeded random data, then pivoted)

import numpy as np
import pandas as pd
import datetime as dt
import time

LETTERS = list('ABCDEFGHIJKLMNOPQRSTUVWXYZ')    
epoch_time = int(time.time())

np.random.seed(555)
df = pd.DataFrame({'ID': [np.random.randint(15) for _ in range(50)],
                   'GROUP': ["".join(np.random.choice(LETTERS[0:3],1)) for _ in range(50)],
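                   # a single scalar draw, broadcast to all 50 rows
                   # (which is why every nonzero cell below shows the same value)
                   'NUM': np.random.uniform(50)/100,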
                   'DATE': [dt.datetime.fromtimestamp(np.random.randint(low=1400270738,
                            high=epoch_time)) for _ in range(50)]})

df['YEAR'] = df['DATE'].dt.year
pvtdf = df.pivot_table(index = ['ID'], columns = ['YEAR', 'GROUP'], values = ['NUM']).fillna(0)

print(pvtdf)
#             NUM                                                                                                              
# YEAR       2014                          2015                          2016                          2017                    
# GROUP         A         B         C         A         B         C         A         B         C         A         B         C
# ID                                                                                                                           
# 0      0.000000  0.000000  0.000000  0.411258  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.411258  0.411258
# 1      0.000000  0.411258  0.000000  0.000000  0.000000  0.000000  0.411258  0.411258  0.000000  0.000000  0.411258  0.411258
# 3      0.411258  0.000000  0.000000  0.000000  0.000000  0.000000  0.411258  0.411258  0.411258  0.000000  0.411258  0.000000
# 4      0.411258  0.411258  0.000000  0.000000  0.411258  0.411258  0.000000  0.000000  0.411258  0.411258  0.000000  0.000000
# 5      0.411258  0.000000  0.000000  0.411258  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.411258
# 6      0.000000  0.411258  0.000000  0.000000  0.411258  0.411258  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000
# 7      0.000000  0.000000  0.000000  0.411258  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000
# 8      0.000000  0.411258  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.411258
# 9      0.000000  0.000000  0.411258  0.411258  0.000000  0.411258  0.411258  0.000000  0.000000  0.000000  0.000000  0.000000
# 10     0.000000  0.000000  0.000000  0.411258  0.411258  0.000000  0.000000  0.411258  0.000000  0.000000  0.000000  0.000000
# 11     0.000000  0.000000  0.000000  0.411258  0.000000  0.000000  0.000000  0.000000  0.411258  0.000000  0.000000  0.000000
# 12     0.411258  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000
# 13     0.000000  0.411258  0.000000  0.000000  0.000000  0.000000  0.411258  0.000000  0.411258  0.000000  0.411258  0.000000
# 14     0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.411258  0.000000

Process (all 2017 A, B, C columns are summed into a new D column and then dropped)

pvtdf[('NUM', 2017, 'D')] = pvtdf[('NUM', 2017, 'A')] + pvtdf[('NUM', 2017, 'B')] + pvtdf[('NUM', 2017, 'C')]

pvtdf = pvtdf.drop([('NUM', 2017, 'A'), ('NUM', 2017, 'B'), ('NUM', 2017, 'C')], axis=1)    
# sortlevel() was deprecated in pandas 0.20 and later removed; sort_index() replaces it
pvtdf = pvtdf.sort_index(axis=1)

print(pvtdf)    
#             NUM                                                                                          
# YEAR       2014                          2015                          2016                          2017
# GROUP         A         B         C         A         B         C         A         B         C         D
# ID                                                                                                       
# 0      0.000000  0.000000  0.000000  0.411258  0.000000  0.000000  0.000000  0.000000  0.000000  0.822515
# 1      0.000000  0.411258  0.000000  0.000000  0.000000  0.000000  0.411258  0.411258  0.000000  0.822515
# 3      0.411258  0.000000  0.000000  0.000000  0.000000  0.000000  0.411258  0.411258  0.411258  0.411258
# 4      0.411258  0.411258  0.000000  0.000000  0.411258  0.411258  0.000000  0.000000  0.411258  0.411258
# 5      0.411258  0.000000  0.000000  0.411258  0.000000  0.000000  0.000000  0.000000  0.000000  0.411258
# 6      0.000000  0.411258  0.000000  0.000000  0.411258  0.411258  0.000000  0.000000  0.000000  0.000000
# 7      0.000000  0.000000  0.000000  0.411258  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000
# 8      0.000000  0.411258  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.411258
# 9      0.000000  0.000000  0.411258  0.411258  0.000000  0.411258  0.411258  0.000000  0.000000  0.000000
# 10     0.000000  0.000000  0.000000  0.411258  0.411258  0.000000  0.000000  0.411258  0.000000  0.000000
# 11     0.000000  0.000000  0.000000  0.411258  0.000000  0.000000  0.000000  0.000000  0.411258  0.000000
# 12     0.411258  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000
# 13     0.000000  0.411258  0.000000  0.000000  0.000000  0.000000  0.411258  0.000000  0.411258  0.411258
# 14     0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.411258
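
When many columns have to be combined, writing out each tuple by hand gets tedious. One possible variation, sketched here on a small hand-made frame (the values and the two-year layout are made up for illustration, not taken from the output above), selects the whole `('NUM', 2017)` sub-block with a partial tuple key and sums it row-wise:

```python
import pandas as pd

# Small illustrative frame with the same three column levels as pvtdf (values are made up).
cols = pd.MultiIndex.from_product([['NUM'], [2016, 2017], ['A', 'B', 'C']],
                                  names=[None, 'YEAR', 'GROUP'])
pvtdf = pd.DataFrame([[1, 2, 3, 4, 5, 6],
                      [0, 1, 0, 1, 0, 1]],
                     index=pd.Index([0, 1], name='ID'), columns=cols)

# A partial tuple key selects every column under ('NUM', 2017); sum them per row.
total_2017 = pvtdf.loc[:, ('NUM', 2017)].sum(axis=1)

# Drop the 2017 source columns, whatever their GROUP labels are.
to_drop = [c for c in pvtdf.columns if c[:2] == ('NUM', 2017)]
pvtdf = pvtdf.drop(to_drop, axis=1)

# Add the combined column and restore label order.
pvtdf[('NUM', 2017, 'D')] = total_2017
pvtdf = pvtdf.sort_index(axis=1)
print(pvtdf)
```

This avoids hard-coding the group labels, at the cost of summing everything under that year, so it only fits when the whole sub-block should be merged.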