Pandas pivot table - displaying values under the same column index

Time: 2019-05-16 08:47:41

Tags: pandas dataframe

I am having trouble constructing a Pandas pivot table. I would like the two values ['Balance', 'WAP'] to appear under the same column ['Delivery'].

Here is the DataFrame, constructed from a dictionary:

import numpy as np
import pandas as pd

dict_data = {
    'Contract' : ['Contract 1', 'Contract 2', 'Contract 3', 'Contract 4'],
    'Contract_Date': ['01/01/2019', '02/02/2019', '03/03/2019', '04/03/2019'],
    'Delivery' : ['2019-01', '2019-01', '2019-02', '2019-03'],
    'Price' : [90, 95, 100, 105],
    'Balance': [50, 100, 150, 200]
}

df = pd.DataFrame.from_dict(dict_data)

df

DataFrame:

    Contract    Contract_Date   Delivery    Price   Balance
0   Contract 1  01/01/2019       2019-01    90      50
1   Contract 2  02/02/2019       2019-01    95      100
2   Contract 3  03/03/2019       2019-02    100     150
3   Contract 4  04/03/2019       2019-03    105     200

Computing the weighted average price:

# Create WAP - Weighted Average Price
df['Value'] = df['Balance'] * df['Price'] 
df['WAP'] = df['Value'] / df['Balance']
df
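Note that, computed row by row, Value / Balance simply returns Price, so the WAP column above equals the Price column. If a balance-weighted price per delivery month is what is wanted, one possible sketch is to sum Value and Balance per group before dividing. The names `totals` and `wap_by_delivery` are illustrative and not part of the original post; only the columns needed for the calculation are reproduced here.

```python
import pandas as pd

# Same Delivery / Price / Balance data as in the question.
df = pd.DataFrame({
    'Delivery': ['2019-01', '2019-01', '2019-02', '2019-03'],
    'Price': [90, 95, 100, 105],
    'Balance': [50, 100, 150, 200],
})

# Row level: Value / Balance would just give back Price.
df['Value'] = df['Balance'] * df['Price']

# Group level: total value divided by total balance per delivery month.
totals = df.groupby('Delivery')[['Value', 'Balance']].sum()
wap_by_delivery = totals['Value'] / totals['Balance']
print(wap_by_delivery)
```

For 2019-01 this gives (50 * 90 + 100 * 95) / 150, roughly 93.33, whereas the plain mean of the two prices would be 92.5.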

Constructing the pivot table:

# Use a dictionary to apply more than 1 type of aggregate onto the data
f = {'Balance': ['sum'], 'WAP': ['mean']}

df.pivot_table(
    columns='Delivery',
    values=['Balance', 'WAP'],
    index=['Contract_Date', 'Contract'],
    aggfunc=f
).replace(np.nan, '')

I am trying to get the two values displayed under the same column so they can be compared, as in the table below (constructed by hand):

              Delivery    2019-01          2019-02          2019-03
Contract Date Contract    Balance  WAP     Balance  WAP     Balance  WAP
01/01/2019    Contract 1  50       90
02/02/2019    Contract 2  100      95
03/03/2019    Contract 3                   150      100
04/03/2019    Contract 4                                    200      105

I suspect the solution is somewhere along the lines of stack/unstack? Any help is much appreciated, as I am still new to Pandas.

1 answer:

Answer 0: (score: 1)

First, change the one-element lists in the dictionary to plain strings, to avoid a 3-level MultiIndex in the columns:

f = {'Balance': 'sum', 'WAP': 'mean'}
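To see why the lists matter, the following sketch builds the pivot table both ways and compares the number of column levels. The helper `build` and the variable names are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'Contract': ['Contract 1', 'Contract 2', 'Contract 3', 'Contract 4'],
    'Contract_Date': ['01/01/2019', '02/02/2019', '03/03/2019', '04/03/2019'],
    'Delivery': ['2019-01', '2019-01', '2019-02', '2019-03'],
    'Price': [90, 95, 100, 105],
    'Balance': [50, 100, 150, 200],
})
df['WAP'] = df['Balance'] * df['Price'] / df['Balance']

def build(aggfunc):
    # Hypothetical helper: same pivot as in the question, only aggfunc varies.
    return df.pivot_table(
        columns='Delivery',
        values=['Balance', 'WAP'],
        index=['Contract_Date', 'Contract'],
        aggfunc=aggfunc,
    )

with_lists = build({'Balance': ['sum'], 'WAP': ['mean']})
with_strings = build({'Balance': 'sum', 'WAP': 'mean'})

# Lists add an extra column level holding the function name.
print(with_lists.columns.nlevels)
print(with_strings.columns.nlevels)
```

With list values the function names ('sum', 'mean') become a column level of their own, which is the extra level the string form avoids.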

Then use DataFrame.swaplevel together with DataFrame.sort_index:

f = {'Balance': 'sum', 'WAP': 'mean'}

df = (
    df.pivot_table(
        columns='Delivery',
        values=['Balance', 'WAP'],
        index=['Contract_Date', 'Contract'],
        aggfunc=f
    )
    .replace(np.nan, '')
    .swaplevel(1, 0, axis=1)
    .sort_index(axis=1)
)
print(df)
Delivery                 2019-01     2019-02      2019-03     
                         Balance WAP Balance  WAP Balance  WAP
Contract_Date Contract                                        
01/01/2019    Contract 1      50  90                          
02/02/2019    Contract 2     100  95                          
03/03/2019    Contract 3                 150  100             
04/03/2019    Contract 4                              200  105
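
Since the question asks about stack/unstack, here is one possible sketch of the same layout built with DataFrame.set_index and DataFrame.unstack instead of pivot_table. It assumes every (Contract_Date, Contract, Delivery) combination is unique, because unstack does not aggregate; the name `wide` is illustrative. Missing cells are left as NaN here so the columns stay numeric.

```python
import pandas as pd

df = pd.DataFrame({
    'Contract': ['Contract 1', 'Contract 2', 'Contract 3', 'Contract 4'],
    'Contract_Date': ['01/01/2019', '02/02/2019', '03/03/2019', '04/03/2019'],
    'Delivery': ['2019-01', '2019-01', '2019-02', '2019-03'],
    'Price': [90, 95, 100, 105],
    'Balance': [50, 100, 150, 200],
})
df['WAP'] = df['Balance'] * df['Price'] / df['Balance']

wide = (
    df.set_index(['Contract_Date', 'Contract', 'Delivery'])[['Balance', 'WAP']]
      .unstack('Delivery')        # Delivery becomes the inner column level
      .swaplevel(0, 1, axis=1)    # move Delivery to the outer level
      .sort_index(axis=1)         # group Balance/WAP under each Delivery
)
print(wide)
```

If duplicate index combinations can occur, pivot_table with an aggregation function, as in the answer above, is the safer choice.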