I have a DataFrame like this:
df = pd.DataFrame({'year':[2001,2002,2001,2002,2003],'1':[36984,36559,12927,12414,9731],'2':[28384,33467,11677,11258,8407],'State':["Alabama","Alabama","Alaska","Alaska","Alaska"]})
which looks like:
year 1 2 State
2001 36984 28384 Alabama
2002 36559 33467 Alabama
2001 12927 11677 Alaska
2002 12414 11258 Alaska
2003 9731 8407 Alaska
Now I want to reorganize df so that each State becomes its own column and each row is a year-month combination, as shown below:
year-month Alabama Alaska
2001-1 36984 12927
2001-2 28384 11677
2002-1 36559 12414
2002-2 33467 11258
2003-1 NaN 9731
2003-2 NaN 8407
How can I do this? Thanks.
Answer 0 (score: 3)
Use DataFrame.melt, build the combined year-month column by string concatenation, then reshape with DataFrame.pivot:
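# Stack the month columns '1' and '2' into 'variable'/'value' rows
df1 = df.melt(['year','State'])
# Combine year and month into a single label such as '2001-1'
df1['year-month'] = df1['year'].astype(str) + '-' + df1['variable'].astype(str)
# Keyword arguments are required by pivot in pandas 2.0 and later
df1 = df1.pivot(index='year-month', columns='State', values='value')
print(df1)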
State Alabama Alaska
year-month
2001-1 36984.0 12927.0
2001-2 28384.0 11677.0
2002-1 36559.0 12414.0
2002-2 33467.0 11258.0
2003-1 NaN 9731.0
2003-2 NaN 8407.0
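To see why this works, it can help to inspect the long-format frame that melt produces before pivoting. The following is a minimal self-contained sketch of the same approach, assuming a pandas version in which pivot accepts keyword arguments; the names long_df and wide are only illustrative and do not appear in the answer.

```python
import pandas as pd

df = pd.DataFrame({
    'year': [2001, 2002, 2001, 2002, 2003],
    '1': [36984, 36559, 12927, 12414, 9731],
    '2': [28384, 33467, 11677, 11258, 8407],
    'State': ["Alabama", "Alabama", "Alaska", "Alaska", "Alaska"],
})

# melt keeps 'year' and 'State' as identifier columns and stacks the
# month columns '1' and '2' into a 'variable'/'value' pair, giving one
# row per (year, State, month) combination.
long_df = df.melt(id_vars=['year', 'State'])
print(long_df)

# Build the combined row label, then spread the states into columns.
long_df['year-month'] = (
    long_df['year'].astype(str) + '-' + long_df['variable'].astype(str)
)
wide = long_df.pivot(index='year-month', columns='State', values='value')
print(wide)
```

Alabama has no 2003 row in the input, so pivot fills those cells with NaN, which is also why the values are displayed as floats.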
Answer 1 (score: 3)
Another way:
out = (df.groupby(['State', 'year'])
         .first()
         .unstack(1)
         .swaplevel(axis=1)
         .T
         .rename_axis(columns='year-month'))
out.index = out.index.map(lambda x: '-'.join(map(str, x)))
out
Output:
year-month Alabama Alaska
2001-1 36984.0 12927.0
2002-1 36559.0 12414.0
2003-1 NaN 9731.0
2001-2 28384.0 11677.0
2002-2 33467.0 11258.0
2003-2 NaN 8407.0
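Note that the rows here are grouped by month first, not in the year-then-month order the question asks for. One possible fix, sketched below as an assumption and not as part of the original answer, is to call sort_index on the result. The name out_sorted is hypothetical. Because the labels are strings, this sorts lexicographically, which is fine for months 1 and 2 but would place '2001-10' before '2001-2' if two-digit months were present.

```python
import pandas as pd

df = pd.DataFrame({
    'year': [2001, 2002, 2001, 2002, 2003],
    '1': [36984, 36559, 12927, 12414, 9731],
    '2': [28384, 33467, 11677, 11258, 8407],
    'State': ["Alabama", "Alabama", "Alaska", "Alaska", "Alaska"],
})

# Same reshaping chain as in the answer above.
out = (df.groupby(['State', 'year'])
         .first()
         .unstack(1)
         .swaplevel(axis=1)
         .T
         .rename_axis(columns='year-month'))
out.index = out.index.map(lambda x: '-'.join(map(str, x)))

# Sort the string labels so that rows run 2001-1, 2001-2, 2002-1, ...
out_sorted = out.sort_index()
print(out_sorted)
```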