I have the following dataframe
                        01/01/2017        02/01/2017
Productid  ProductName  Sales  Discount   Sales  Discount
1          abc          100    12         234    23
2          xyz          156    13         237    13
3          pqr          300    12         198    18
I need to convert this into the following dataframe.
Productid ProductName Date Sales Discount
1 abc 01/01/2017 100 12
1 abc 02/01/2017 234 23
2 xyz 01/01/2017 156 13
2 xyz 02/01/2017 237 13
3 pqr 01/01/2017 300 12
3 pqr 02/01/2017 198 18
How can I do this in Python?
Answer 0 (score: 1)
A MultiIndex is hard to copy directly from the question, so first initialize a dataframe that matches the OP's original one.
import pandas as pd
df = pd.read_clipboard()  # read the flat part of the OP's dataframe from the clipboard
df
Productid ProductName Sales Discount Sales.1 Discount.1
0 1 abc 100 12 234 23
1 2 xyz 156 13 237 13
2 3 pqr 300 12 198 18
df.columns = ['Productid', 'ProductName', 'Sales', 'Discount', 'Sales', 'Discount']
df.set_index(keys=['Productid','ProductName'],inplace=True)
df
                       Sales  Discount  Sales  Discount
Productid ProductName
1         abc            100        12    234        23
2         xyz            156        13    237        13
3         pqr            300        12    198        18
array = [['01/01/2017','01/01/2017','02/01/2017','02/01/2017'],
['Sales', 'Discount', 'Sales', 'Discount']]
df.columns = pd.MultiIndex.from_arrays(array) #setting multi-index
Assume this is the OP's dataframe:
df
                      01/01/2017           02/01/2017
                           Sales  Discount      Sales  Discount
Productid ProductName
1         abc                100        12        234        23
2         xyz                156        13        237        13
3         pqr                300        12        198        18
Solution using stack with the level=0 argument, then reset_index() on level=[0,1] followed by another reset_index(). Finally, use rename to change the name of the index column to Date:
df = df.stack(level=0).reset_index(level=[0,1]).reset_index()
df.rename(columns={'index':'Date'},inplace=True)
df[['Productid', 'ProductName','Date','Sales','Discount']]
Productid ProductName Date Sales Discount
0 1 abc 01/01/2017 100 12
1 1 abc 02/01/2017 234 23
2 2 xyz 01/01/2017 156 13
3 2 xyz 02/01/2017 237 13
4 3 pqr 01/01/2017 300 12
5 3 pqr 02/01/2017 198 18
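For anyone who wants to try this without the clipboard step, here is a minimal self-contained sketch of the same idea. It is a variation on the steps above, not a replacement: it assumes the dataframe can be built directly with the values from the question, and it names the date level "Date" up front (my own choice) so that a single reset_index() is enough and no rename is needed. The variable names wide and tidy are only illustrative.

```python
import pandas as pd

# Rebuild the wide frame; the values are copied from the question.
# Naming the first column level "Date" is an assumption made for this sketch.
columns = pd.MultiIndex.from_product(
    [["01/01/2017", "02/01/2017"], ["Sales", "Discount"]],
    names=["Date", None],
)
index = pd.MultiIndex.from_arrays(
    [[1, 2, 3], ["abc", "xyz", "pqr"]],
    names=["Productid", "ProductName"],
)
wide = pd.DataFrame(
    [[100, 12, 234, 23], [156, 13, 237, 13], [300, 12, 198, 18]],
    index=index,
    columns=columns,
)

# Move the date level from the columns into the index, then flatten
# every index level back into ordinary columns.
tidy = wide.stack(level="Date").reset_index()

# stack may reorder the value columns, so select them explicitly and
# sort the rows to get a stable result.
tidy = tidy[["Productid", "ProductName", "Date", "Sales", "Discount"]]
tidy = tidy.sort_values(["Productid", "Date"], ignore_index=True)
print(tidy)
```

Depending on the pandas version, stack may print a warning about its newer implementation; the explicit column selection and sort are there so the result does not depend on which implementation runs.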