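Need to add columns to a MultiIndex DataFrame based on a condition
I need to add another column, Bill3, which is the sum of Bill1 and Bill2, plus a comment column that is left blank.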
data_frame1 = pd.pivot_table(data_frame, index=['PC', 'Geo', 'Comp'], values=['Bill1', 'Bill2'], columns=['Month'], fill_value=0)
data_frame1 = data_frame1.swaplevel(0,1, axis=1).sort_index(axis=1)
tuples = [(a.strftime('%b-%y'), b) if a!= 'All' else (a,b) for a,b in data_frame1.columns]
data_frame1.columns = pd.MultiIndex.from_tuples(tuples)
Output:
            Sep-19       Oct-19       Nov-19
             Bill1 Bill2  Bill1 Bill2  Bill1 Bill2
PC Geo Comp
A  Ind OS        1  1.28      1  1.28      1  1.28
Desired output:
            Sep-19                     Oct-19                     Nov-19
             Bill1 Bill2 Bill3 comment  Bill1 Bill2 Bill3 comment  Bill1 Bill2 Bill3 comment
PC Geo Comp
A  Ind OS        1  1.28  2.28              1  1.28  2.28              1  1.28  2.28
Answer 0 (score: 2)
Use:
# sum all columns per month (first column level);
# DataFrame.sum(level=...) was removed in pandas 2.0, so group the transposed frame instead
df1 = df.T.groupby(level=0).sum().T
# sum only the Bill1 and Bill2 columns
#df1 = df.loc[:, df.columns.get_level_values(1).isin(['Bill1','Bill2'])].T.groupby(level=0).sum().T
# create an empty (all-NaN) DataFrame for the comment columns
df2 = pd.DataFrame(columns=pd.MultiIndex.from_product([df1.columns.tolist(),
                                                       ['comment']]), index=df.index)
# add a second column level so the totals are labelled Bill3
df1.columns = pd.MultiIndex.from_product([df1.columns.tolist(), ['Bill3']])
# join everything together and sort the columns
df = pd.concat([df, df1, df2], axis=1).sort_index(axis=1)
print (df)
         Nov-19                     Oct-19                     Sep-19        \
          Bill1 Bill2 Bill3 comment  Bill1 Bill2 Bill3 comment  Bill1 Bill2
A Ind OS      1  1.28  2.28     NaN      1  1.28  2.28     NaN      1  1.28

         Bill3 comment
A Ind OS  2.28     NaN
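The months in the output above come out as Nov, Oct, Sep because `sort_index` compares the first-level labels as plain strings. A minimal sketch of the difference, using a small hypothetical `labels` index rather than the real frame:

```python
import pandas as pd

# Hypothetical month labels in the same '%b-%y' format as the column level.
labels = pd.Index(['Sep-19', 'Oct-19', 'Nov-19'])

# String sorting compares characters, so 'N' < 'O' < 'S'.
alphabetical = labels.sort_values().tolist()
print(alphabetical)

# Parsing to datetimes first gives chronological order instead.
dates = pd.to_datetime(labels, format='%b-%y')
chronological = dates.sort_values().strftime('%b-%y').tolist()
print(chronological)
```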
If the column order matters, convert the first level to datetimes:
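# assumes df is the original pivoted frame, before Bill3 and comment were added
df.columns = pd.MultiIndex.from_arrays([pd.to_datetime(df.columns.get_level_values(0), format='%b-%y'),
                                        df.columns.get_level_values(1)])
# DataFrame.sum(level=...) was removed in pandas 2.0, so group the transposed frame instead
df1 = df.T.groupby(level=0).sum().T
df2 = pd.DataFrame(columns=pd.MultiIndex.from_product([df1.columns.tolist(),
                                                       ['comment']]), index=df.index)
df1.columns = pd.MultiIndex.from_product([df1.columns.tolist(), ['Bill3']])
df = pd.concat([df, df1, df2], axis=1).sort_index(axis=1)
print (df)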
         2019-09-01                     2019-10-01                      \
              Bill1 Bill2 Bill3 comment      Bill1 Bill2 Bill3 comment
A Ind OS          1  1.28  2.28     NaN          1  1.28  2.28     NaN

         2019-11-01
              Bill1 Bill2 Bill3 comment
A Ind OS          1  1.28  2.28     NaN
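To get back the `Sep-19` style labels from the desired output while keeping chronological order, the datetime level can be reformatted after sorting. The following is a rough end-to-end sketch, not the answer's exact code: the sample frame is made up to match the one-row example, the names `bill3`, `comment` and `out` are illustrative, the comment columns are filled with empty strings (an assumption, the answer uses NaN), and the per-month sum uses `groupby` on the transposed frame.

```python
import pandas as pd

# Hypothetical sample data shaped like the pivoted frame in the question.
months = pd.to_datetime(['2019-09-01', '2019-10-01', '2019-11-01'])
cols = pd.MultiIndex.from_product([months, ['Bill1', 'Bill2']])
idx = pd.MultiIndex.from_tuples([('A', 'Ind', 'OS')], names=['PC', 'Geo', 'Comp'])
df = pd.DataFrame([[1, 1.28, 1, 1.28, 1, 1.28]], index=idx, columns=cols)

# Per-month total of Bill1 and Bill2 (transpose so groupby works on rows).
bill3 = df.T.groupby(level=0).sum().T
bill3.columns = pd.MultiIndex.from_product([bill3.columns, ['Bill3']])

# Blank comment column for each month (empty string is an assumption).
comment = pd.DataFrame(
    '',
    index=df.index,
    columns=pd.MultiIndex.from_product([months, ['comment']]),
)

# Sorting happens while the first level is still datetime, so it is chronological.
out = pd.concat([df, bill3, comment], axis=1).sort_index(axis=1)

# Relabel the sorted datetime level as 'Sep-19', 'Oct-19', 'Nov-19'.
out.columns = pd.MultiIndex.from_tuples(
    [(month.strftime('%b-%y'), name) for month, name in out.columns]
)
print(out)
```

Because the relabelling happens last, no further `sort_index` call should be made on the string labels, or the alphabetical order would return.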