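I have a Pandas dataframe showing how much people spent in January and February. I want to use the groupby function to group by person, but my code produces a DataFrameGroupBy object instead of an actual dataframe. I also have a Gender column, which I just want to keep.
Code: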
import pandas as pd
df = pd.DataFrame(data=[['Person A', 5, 21, 'Male'], ['Person B', 15, 3, 'Female']], columns=['Names', 'Jan', 'Feb', 'Gender'])
print(df.groupby(['Names', 'Jan', 'Feb']))
Output:
<pandas.core.groupby.DataFrameGroupBy object at 0x020D4470>
Starting dataframe:
      Names  Jan  Feb  Gender
0  Person A    5   21    Male
1  Person B   15    3  Female
Desired output:
            Names  Value  Gender
0  Person A - Jan      5    Male
1  Person A - Feb     21    Male
2  Person B - Jan     15  Female
3  Person B - Feb      3  Female
Answer 0 (score: 3)
You can use melt with sort_values, then append the variable column to Names and drop the variable column:
df1 = pd.melt(df, id_vars='Names').sort_values('Names')
df1['Names'] = df1['Names'] + '- ' + df1['variable']
df1 = df1.drop('variable', axis=1)
print(df1)
           Names  value
0  Person A- Jan      5
2  Person A- Feb     21
1  Person B- Jan     15
3  Person B- Feb      3
Another one-line solution using assign:
print(pd.melt(df, id_vars='Names').sort_values('Names')
        .assign(Names=lambda x: x['Names'] + '- ' + x['variable'])
        .drop('variable', axis=1))
           Names  value
0  Person A- Jan      5
2  Person A- Feb     21
1  Person B- Jan     15
3  Person B- Feb      3
Edit:
To keep Gender, add the new column to the id_vars parameter (otherwise melt unpivots it along with Jan and Feb):
df1 = pd.melt(df, id_vars=['Names', 'Gender']).sort_values('Names')
df1['Names'] = df1['Names'] + '- ' + df1['variable']
df1 = df1.drop('variable', axis=1)
df1 = df1[['Names', 'value', 'Gender']]
print(df1)
           Names  value  Gender
0  Person A- Jan      5    Male
2  Person A- Feb     21    Male
1  Person B- Jan     15  Female
3  Person B- Feb      3  Female
A one-line solution is also possible; if you need to reorder the columns, use reindex_axis (deprecated in later pandas versions in favour of reindex).
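A one-line version with reordered columns could be sketched as follows. This is only an illustration, not the answerer's original code: it assumes a recent pandas, so it uses reindex(columns=...) in place of reindex_axis, passes kind='mergesort' so the sort is stable and Jan stays ahead of Feb, and joins with ' - ' to match the spacing in the question's desired output.

```python
import pandas as pd

df = pd.DataFrame(
    data=[['Person A', 5, 21, 'Male'], ['Person B', 15, 3, 'Female']],
    columns=['Names', 'Jan', 'Feb', 'Gender'],
)

df1 = (
    pd.melt(df, id_vars=['Names', 'Gender'])      # unpivot Jan/Feb into variable/value
    .sort_values('Names', kind='mergesort')       # assumption: stable sort keeps Jan before Feb
    .assign(Names=lambda x: x['Names'] + ' - ' + x['variable'])
    .drop('variable', axis=1)
    .reindex(columns=['Names', 'value', 'Gender'])  # reorder columns (stands in for reindex_axis)
)
print(df1)
```

The row labels stay as 0, 2, 1, 3 after sorting; chaining .reset_index(drop=True) at the end would renumber them if that matters.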
Answer 1 (score: 2)
Another solution, using stack.
df_out = df.set_index(['Names']).stack().to_frame().reset_index()
df_out.columns = ['Names','month','value']
Edit:
This should also work:
stack_df = df.set_index(['Names', 'Gender']).stack().to_frame().reset_index()
stack_df.columns = ['Names','Gender','Month', 'Value']
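The stacked frame still keeps the month in its own column. A minimal sketch of getting from there to the layout in the question's desired output, assuming the same sample data; the result name and the ' - ' separator are choices made for this illustration:

```python
import pandas as pd

df = pd.DataFrame(
    data=[['Person A', 5, 21, 'Male'], ['Person B', 15, 3, 'Female']],
    columns=['Names', 'Jan', 'Feb', 'Gender'],
)

# Names and Gender become the index, so stack() only moves Jan/Feb into rows.
stack_df = df.set_index(['Names', 'Gender']).stack().to_frame().reset_index()
stack_df.columns = ['Names', 'Gender', 'Month', 'Value']

# Fold the month into Names, then keep only the columns the question asks for.
stack_df['Names'] = stack_df['Names'] + ' - ' + stack_df['Month']
result = stack_df[['Names', 'Value', 'Gender']]
print(result)
```

Because stack walks each row's columns in order, the rows come out grouped by person with Jan before Feb, so no extra sort is needed here.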