I have a MultiIndex DataFrame that I want to reindex. However, I get a "duplicate axis" error.
Product Date col1
A September 2019 5
October 2019 7
B September 2019 2
October 2019 4
How can I get output like this?
Product Date col1
A January 2019 0
February 2019 0
March 2019 0
April 2019 0
May 2019 0
June 2019 0
July 2019 0
August 2019 0
September 2019 5
October 2019 7
B January 2019 0
February 2019 0
March 2019 0
April 2019 0
May 2019 0
June 2019 0
July 2019 0
August 2019 0
September 2019 2
October 2019 4
First, I tried this:
nested_df = nested_df.reindex(annual_date_range, level = 1, fill_value = 0)
Second, I tried this:
nested_df = nested_df.reset_index().set_index('Date')
nested_df = nested_df.reindex(annual_date_range, fill_value = 0)
Answer 0 (score: 0)
You should do the following for each missing month:
df.loc[('A', 'January 2019'), :] = (0)
df.loc[('B', 'January 2019'), :] = (0)
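The per-month assignment above can be wrapped in a loop. The following is a minimal sketch, assuming the frame looks like the one in the question; the construction of `df` and the `months` variable are reconstructions for illustration, not part of the original answer.

```python
import pandas as pd

# Hypothetical reconstruction of the question's frame (an assumption).
df = pd.DataFrame(
    {"col1": [5, 7, 2, 4]},
    index=pd.MultiIndex.from_tuples(
        [
            ("A", "September 2019"),
            ("A", "October 2019"),
            ("B", "September 2019"),
            ("B", "October 2019"),
        ],
        names=["Product", "Date"],
    ),
)

# Month labels "January 2019" through "October 2019".
months = pd.date_range("2019-01-01", periods=10, freq="MS").strftime("%B %Y")

for product in ["A", "B"]:
    for month in months:
        # Only add rows that are missing, so existing values are kept.
        if (product, month) not in df.index:
            df.loc[(product, month), :] = 0
```

Note that each new row is appended at the end of the frame, so the result is not in chronological order, and growing a frame row by row is slow for large data.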
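Answer 1 (score: 0)
Let df1 be the first dataframe, the one with the nonzero values. The approach is to create another dataframe df filled with zeros, then concatenate the two dataframes to get the result.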
dates = ['{month}-2019'.format(month=month) for month in range(1,9)]*2
length = int(len(dates)/2)
products = ['A']*length + ['B']*length
Col1 = [0]*len(dates)
df = pd.DataFrame({'Dates': dates, 'Products': products, 'Col1':Col1}).set_index(['Products','Dates'])
Now convert the date level of the MultiIndex to datetime and format it as zero-padded %m-%Y strings:
# set_levels(..., inplace=True) was removed in pandas 2.0, so assign the result back
df.index = df.index.set_levels(pd.to_datetime(df.index.levels[1], format='%m-%Y').strftime('%m-%Y'), level=1)
In df1 you have to do the same thing, i.e. change the date level of the MultiIndex to the same format:
# Use index.levels[1] so each new label lines up with the level it replaces
df1.index = df1.index.set_levels(pd.to_datetime(df1.index.levels[1], format='%B %Y').strftime('%m-%Y'), level=1)
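This is needed because otherwise (for example, if the date format were %B %Y) sorting the MultiIndex by month gives the wrong order, since month names sort alphabetically rather than chronologically. Now it is enough to concatenate the two dataframes: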
result = pd.concat([df1,df]).sort_values(['Products','Dates'])
The last step is to change the date format back:
result.index = result.index.set_levels(pd.to_datetime(result.index.levels[1], format='%m-%Y').strftime('%B %Y'), level=1)
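As an alternative to building a zero-filled frame and concatenating, the frame can be reindexed against the full product/month grid in one step. This is a sketch under assumptions: `nested_df` and `annual_date_range` are named after the question but reconstructed here, the Date level is assumed to hold real timestamps rather than strings, and `full_index` is a name introduced for illustration. Reindexing on the complete MultiIndex also appears to avoid the duplicate-label problem that arises when Date alone is the index, because each date then occurs once per product.

```python
import pandas as pd

# Hypothetical reconstruction of the question's frame (an assumption),
# with Date parsed to timestamps so the level orders chronologically.
nested_df = pd.DataFrame(
    {
        "Product": ["A", "A", "B", "B"],
        "Date": pd.to_datetime(
            ["2019-09-01", "2019-10-01", "2019-09-01", "2019-10-01"]
        ),
        "col1": [5, 7, 2, 4],
    }
).set_index(["Product", "Date"])

# Month starts from January 2019 through October 2019.
annual_date_range = pd.date_range("2019-01-01", "2019-10-01", freq="MS")

# Every (product, month) combination that should exist in the output.
full_index = pd.MultiIndex.from_product(
    [nested_df.index.levels[0], annual_date_range],
    names=nested_df.index.names,
)

# Missing combinations are created and filled with 0.
result = nested_df.reindex(full_index, fill_value=0)

# Optional: show the dates as month names, as in the desired output.
result.index = result.index.set_levels(
    result.index.levels[1].strftime("%B %Y"), level="Date"
)
```

Keeping the dates as timestamps until the last step means the rows stay in calendar order without any intermediate string formatting.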