Python - Restructure a DataFrame: move column names into rows and reshape the data frame

Date: 2017-04-04 14:59:26

Tags: python pandas dataframe reshape reindex

I need to convert df1 into df2:

import pandas as pd

# Wide frame: one row per date, one count/dollar column pair per item (A, B)
df1 = pd.DataFrame(
    index=['date_1', 'date_2', 'date_3'],
    columns=["A_count", "A_dollar", "B_count", "B_dollar"],
    data=[[10, "$100", 7, "$786"], [3, "$43", 6, "$88"], [5, "$565", 8, "$876"]],
)
df1

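Basically, I need to put the items (A and B) as labels in a new column, and then move the data from the 3rd and 4th columns (the B columns) into new rows beneath the A rows. This gives one row per item for each date.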
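
The target layout described above can be written out by hand as a rough sketch. The original picture of df2 is not available, so the column names `item`, `count` and `dollar` are assumptions inferred from the description and from the column suffixes in df1:

```python
import pandas as pd

# Hand-built sketch of the desired long layout (column names are assumed,
# not taken from the original picture): one row per (date, item) pair.
df2 = pd.DataFrame(
    {
        "item": ["A", "B", "A", "B", "A", "B"],
        "count": [10, 7, 3, 6, 5, 8],
        "dollar": ["$100", "$786", "$43", "$88", "$565", "$876"],
    },
    index=["date_1", "date_1", "date_2", "date_2", "date_3", "date_3"],
)
print(df2)
```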

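1 Answer:

Answer 0: (score: 1)

You can turn the columns into a MultiIndex by splitting them on the underscore, then use stack to reshape the frame into long format: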

df1.columns = df1.columns.str.split("_", expand=True)
df1.stack(level=0).rename_axis((None, "item")).reset_index("item")
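
For reference, the two lines above can be run end to end as in the following sketch. It works on a copy so df1 keeps its original columns; the names `wide` and `long` are illustrative only and do not appear in the answer:

```python
import pandas as pd

df1 = pd.DataFrame(
    index=["date_1", "date_2", "date_3"],
    columns=["A_count", "A_dollar", "B_count", "B_dollar"],
    data=[[10, "$100", 7, "$786"], [3, "$43", 6, "$88"], [5, "$565", 8, "$876"]],
)

# Work on a copy so the original wide frame is left untouched.
wide = df1.copy()

# "A_count" -> ("A", "count"): the columns become a two-level MultiIndex.
wide.columns = wide.columns.str.split("_", expand=True)

# Move the outer column level (the item) into the row index, name that
# new index level "item", then turn it back into an ordinary column.
long = (
    wide.stack(level=0)
    .rename_axis([None, "item"])
    .reset_index("item")
)
print(long)
```

Each date now appears twice in the index, once for item A and once for item B, with `count` and `dollar` as the remaining columns. Some pandas 2.x releases may emit a FutureWarning about the stack implementation here; the reshaped values should be the same.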

If the column names contain more than one underscore, as in this example:

df1 = pd.DataFrame(
    index=['date_1', 'date_2', 'date_3'],
    columns=["A_x_count", "A_x_dollar", "B_y_count", "B_y_dollar"],
    data=[[10, "$100", 7, "$786"], [3, "$43", 6, "$88"], [5, "$565", 8, "$876"]],
)
df1

You can use rsplit with n=1 so that each name is split only at the last underscore:

df1.columns = df1.columns.str.rsplit("_", n=1, expand=True)
df1.stack(level=0).rename_axis((None, "item")).reset_index("item")
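
To see why rsplit with n=1 matters, the sketch below compares it with a plain split on the same column names and then runs the reshape. The names `cols`, `three_levels`, `two_levels`, `wide` and `long` are illustrative and not part of the answer:

```python
import pandas as pd

cols = pd.Index(["A_x_count", "A_x_dollar", "B_y_count", "B_y_dollar"])

# A plain split breaks on every underscore and yields three levels.
three_levels = cols.str.split("_", expand=True)

# rsplit with n=1 breaks only at the last underscore, so the item
# label ("A_x", "B_y") stays in one piece and there are two levels.
two_levels = cols.str.rsplit("_", n=1, expand=True)

print(three_levels.nlevels, two_levels.nlevels)

wide = pd.DataFrame(
    index=["date_1", "date_2", "date_3"],
    columns=two_levels,
    data=[[10, "$100", 7, "$786"], [3, "$43", 6, "$88"], [5, "$565", 8, "$876"]],
)

# Same reshape as before: stack the item level into the rows.
long = (
    wide.stack(level=0)
    .rename_axis([None, "item"])
    .reset_index("item")
)
print(long)
```

With the two-level columns, the `item` column holds `A_x` and `B_y`, and the reshaped frame has the same `count` and `dollar` columns as in the single-underscore case.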
