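My phone usage and billing data is stored in a Pandas DataFrame that holds statistics for two months. I want to pivot the data so that each month's columns become rows.
Starting point: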
       Name  Jan Minutes Used  Feb Minutes Used Jan Bill Paid Feb Bill Paid
0  Person A                10                11           Yes            No
1  Person B                12                13            No           Yes
Desired output:
       Name Month  Minutes Used Bill Paid
0  Person A   Jan            10       Yes
1  Person A   Feb            11        No
2  Person B   Jan            12        No
3  Person B   Feb            13       Yes
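I am trying to use .melt() to reshape the data, but the Bill Paid and Minutes Used values end up in the same column when they should be split into two separate columns.
My code: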
import pandas as pd

df = pd.DataFrame(
    data=[['Person A', 10, 11, 'Yes', 'No'], ['Person B', 12, 13, 'No', 'Yes']],
    columns=['Name', 'Jan Minutes Used', 'Feb Minutes Used', 'Jan Bill Paid', 'Feb Bill Paid'])

melted_df = pd.melt(df.reset_index(),
                    id_vars=['Name'],
                    value_vars=['Jan Bill Paid', 'Feb Bill Paid', 'Jan Minutes Used', 'Feb Minutes Used'])
# Strip the measure name so that only the month remains in 'variable'
melted_df['variable'] = melted_df['variable'].str.replace(' Minutes Used', '').str.replace(' Bill Paid', '')
melted_df.columns = ['Name', 'Month', 'Bill Paid']
print(melted_df)
Output of my code:
       Name Month Bill Paid
0  Person A   Jan       Yes
1  Person B   Jan        No
2  Person A   Feb        No
3  Person B   Feb       Yes
4  Person A   Jan        10
5  Person B   Jan        12
6  Person A   Feb        11
7  Person B   Feb        13
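For reference, one way to finish the melt-based attempt is to keep the measure name instead of discarding it, and then pivot that measure back into columns. The following is a minimal sketch under that assumption; the names `melted`, `Measure` and `reshaped` are illustrative and do not come from the original code, and passing a list as `index` to `pivot` assumes a reasonably recent pandas.

```python
import pandas as pd

df = pd.DataFrame(
    data=[['Person A', 10, 11, 'Yes', 'No'], ['Person B', 12, 13, 'No', 'Yes']],
    columns=['Name', 'Jan Minutes Used', 'Feb Minutes Used', 'Jan Bill Paid', 'Feb Bill Paid'],
)

# Melt every non-Name column into (variable, value) pairs.
melted = df.melt(id_vars='Name')

# Split "Jan Minutes Used" on the first space into the month ("Jan")
# and the measure ("Minutes Used"). 'Measure' is a hypothetical column name.
melted[['Month', 'Measure']] = melted['variable'].str.split(' ', n=1, expand=True)

# Pivot the measure back into columns: one row per (Name, Month).
reshaped = (
    melted.pivot(index=['Name', 'Month'], columns='Measure', values='value')
    .reset_index()
)
print(reshaped)
```

Because the melted `value` column mixes numbers and strings, the pivoted columns are likely to come back with object dtype; converting `Minutes Used` with `pd.to_numeric` afterwards is one option if numeric operations are needed.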
Answer 0 (score: 4)
You can achieve this by building a MultiIndex on the columns and then using stack:
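# Note: the frame used in this answer also has a 'Gender' column, which the df in
# the question lacks; with the question's df, use df.set_index('Name') instead.
In [31]: df = df.set_index(['Name', 'Gender'])
# split column names on the first space and create a MultiIndex (expand=True)
In [33]: df.columns = df.columns.str.split(' ', n=1, expand=True)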
In [34]: df
Out[34]:
                         Jan          Feb       Jan       Feb
                Minutes Used Minutes Used Bill Paid Bill Paid
Name     Gender
Person A Male             10           11       Yes        No
Person B Female           12           13        No       Yes
# stack (move from columns to index) the first (0) level of the columns
In [35]: df = df.stack(0)
In [36]: df
Out[36]:
                    Bill Paid  Minutes Used
Name     Gender
Person A Male   Feb        No            11
                Jan       Yes            10
Person B Female Feb       Yes            13
                Jan        No            12
To display the same output with everything in columns:
In [37]: df.reset_index()
Out[37]:
       Name  Gender level_2 Bill Paid  Minutes Used
0  Person A    Male     Feb        No            11
1  Person A    Male     Jan       Yes            10
2  Person B  Female     Feb       Yes            13
3  Person B  Female     Jan        No            12
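The steps above can be put together into one script that runs against the frame from the question, which has no Gender column. This is a sketch rather than a definitive version: naming the stacked level `Month`, the `month_order` list and the final column selection are additions made here to reproduce the desired layout, and the explicit sort is there because the row and column order produced by stack may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame(
    data=[['Person A', 10, 11, 'Yes', 'No'], ['Person B', 12, 13, 'No', 'Yes']],
    columns=['Name', 'Jan Minutes Used', 'Feb Minutes Used', 'Jan Bill Paid', 'Feb Bill Paid'],
)

# Keep the identifier out of the columns before splitting them.
wide = df.set_index('Name')

# "Jan Minutes Used" -> ("Jan", "Minutes Used"): a two-level column MultiIndex.
wide.columns = wide.columns.str.split(' ', n=1, expand=True)

# Name the month level so reset_index() yields 'Month' instead of 'level_1'.
wide.columns.names = ['Month', None]

# Move the month level from the columns into the row index, then flatten.
long_df = wide.stack(0).reset_index()

# Assumption: calendar order (Jan before Feb) is wanted, not alphabetical order.
month_order = ['Jan', 'Feb']
long_df['Month'] = pd.Categorical(long_df['Month'], categories=month_order, ordered=True)
long_df = long_df.sort_values(['Name', 'Month']).reset_index(drop=True)

# Put the columns in the order shown in the desired output.
long_df = long_df[['Name', 'Month', 'Minutes Used', 'Bill Paid']]
print(long_df)
```

For more than two months, `month_order` would need to list every month label that appears in the column names.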