Reindexing a MultiIndex DataFrame

Date: 2019-12-25 09:15:23

Tags: pandas multi-index reindex

I have a MultiIndex DataFrame that I want to reindex, but I get a "duplicate axis" error.

Product  Date            col1
A        September 2019     5
         October 2019       7
B        September 2019     2
         October 2019       4

How can I get output like this?

Product  Date            col1
A        January 2019      0
         February 2019     0
         March 2019        0
         April 2019        0
         May 2019          0
         June 2019         0
         July 2019         0
         August 2019       0
         September 2019    5
         October 2019      7
B        January 2019      0
         February 2019     0
         March 2019        0
         April 2019        0
         May 2019          0
         June 2019         0
         July 2019         0
         August 2019       0
         September 2019    2
         October 2019      4 

First, I tried this:

nested_df = nested_df.reindex(annual_date_range, level = 1, fill_value = 0)

Second, I tried this:

nested_df = nested_df.reset_index().set_index('Date')
nested_df  = nested_df.reindex(annual_date_range, fill_value = 0)

2 answers:

Answer 0 (score: 0)

You would have to do the following for each month:

df.loc[('A', 'January 2019'), :] = 0
df.loc[('B', 'January 2019'), :] = 0
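
Repeating that assignment by hand for every product and month gets tedious, so it could be wrapped in a loop. Below is a minimal sketch, assuming the frame from the question is rebuilt as `df` and that the missing months are January through August 2019. The sample construction and the name `missing_months` are illustrative and not part of the original post; the month names also assume an English locale.

```python
import pandas as pd

# Rebuild the question's sample frame (illustrative data).
df = pd.DataFrame(
    {
        "Product": ["A", "A", "B", "B"],
        "Date": ["September 2019", "October 2019"] * 2,
        "col1": [5, 7, 2, 4],
    }
).set_index(["Product", "Date"])

# Hypothetical helper: labels of the months that are absent from the frame.
missing_months = pd.date_range("2019-01-01", periods=8, freq="MS").strftime("%B %Y")

# Assigning to a (product, month) key that does not exist yet appends a row.
for product in df.index.get_level_values("Product").unique():
    for month in missing_months:
        df.loc[(product, month), :] = 0
```

The new rows are appended at the end, so the frame would still need to be put into chronological order afterwards, and row-by-row enlargement is slow on large frames.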

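Answer 1 (score: 0)

df1 is the first DataFrame, the one holding the non-zero values. The approach is to create a second DataFrame, df, filled with zeros, and then combine the two DataFrames to get the result.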

import pandas as pd

dates = ['{month}-2019'.format(month=month) for month in range(1, 9)] * 2
length = len(dates) // 2
products = ['A'] * length + ['B'] * length
col1 = [0] * len(dates)
# The column is named 'col1' so that it lines up with df1 when the frames are concatenated
df = pd.DataFrame({'Dates': dates, 'Products': products, 'col1': col1}).set_index(['Products', 'Dates'])

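Next, the date level of the MultiIndex is parsed as datetime and rewritten in the zero-padded %m-%Y format:

# set_levels returns a new index (recent pandas versions no longer accept inplace=True),
# and the new labels are derived from the existing levels so each label maps to itself
df.index = df.index.set_levels(pd.to_datetime(df.index.levels[1], format='%m-%Y').strftime('%m-%Y'), level=1)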

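The same has to be done for df1, i.e. its date level must be converted to the same format:

# Levels are stored sorted ('October 2019' before 'September 2019'), so convert the levels themselves
df1.index = df1.index.set_levels(pd.to_datetime(df1.index.levels[1], format='%B %Y').strftime('%m-%Y'), level=1)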

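This is necessary because otherwise (for example, with the date format %B %Y) the MultiIndex would be sorted by month name, i.e. alphabetically, rather than in calendar order. After that, concatenating the two DataFrames and sorting is enough:

# sort_index orders by both index levels and does not depend on the level names
result = pd.concat([df1, df]).sort_index()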
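
The sorting concern above can be checked with a small sketch. It only compares how the two string formats sort and is not part of the original answer; the names `by_name` and `by_number` are illustrative, and the month names assume an English locale.

```python
import pandas as pd

months = pd.date_range("2019-01-01", periods=10, freq="MS")

# Month names sort alphabetically, not chronologically.
by_name = sorted(months.strftime("%B %Y"))
# Zero-padded month numbers sort chronologically within a single year.
by_number = sorted(months.strftime("%m-%Y"))

print(by_name[:3])    # ['April 2019', 'August 2019', 'February 2019']
print(by_number[:3])  # ['01-2019', '02-2019', '03-2019']
```

Note that %m-%Y strings only sort correctly while all dates fall in the same year; a format such as %Y-%m would be needed across years.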

The last step is to change the date format back:

result.index = result.index.set_levels(pd.to_datetime(result.index.levels[1], format='%m-%Y').strftime('%B %Y'), level=1)
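
As an alternative that is not from the answer above, the same output can probably be reached more directly by building the full MultiIndex with `pd.MultiIndex.from_product` and reindexing against it. This is a minimal sketch under stated assumptions: the question never shows how `annual_date_range` is built, so the definition below is a guess, the sample frame is reconstructed from the question, and the month names assume an English locale.

```python
import pandas as pd

# Rebuild the question's sample frame (illustrative data).
nested_df = pd.DataFrame(
    {
        "Product": ["A", "A", "B", "B"],
        "Date": ["September 2019", "October 2019"] * 2,
        "col1": [5, 7, 2, 4],
    }
).set_index(["Product", "Date"])

# Assumed definition: month labels from January to October 2019.
annual_date_range = pd.date_range("2019-01-01", "2019-10-01", freq="MS").strftime("%B %Y")

# Every (product, month) pair, in calendar order within each product.
full_index = pd.MultiIndex.from_product(
    [nested_df.index.get_level_values("Product").unique(), annual_date_range],
    names=["Product", "Date"],
)

# Reindexing against the full MultiIndex fills the missing pairs with 0.
result = nested_df.reindex(full_index, fill_value=0)
print(result)
```

Because the target index contains complete (Product, Date) tuples, every label is unique, which avoids the duplicate labels that arise when Date alone is used as the index.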