I have a DataFrame that I am trying to pivot.
df
col_1  col_2   col_3    col_4
John   Method  4        White
Tom    Method  29613    White
Harry  Method  147      White
John   Method  84       Blue
Tom    Method  28       Blue
John   Method  222085   Black
Tom    Method  159459   Black
Harry  Method  2204225  Black
John   Method  600253   Green
Tom    Method  3156210  Green
Harry  Method  4343635  Green
Harry  Method  4343635  Green
Expected result:
newDf
       Black    Blue  Green    White
Harry  2204225        8687270  147
John   222085   84    600253   4
Tom    159459   28    3156210  29613
My code:
newDf = pd.pivot_table(df, values='col_3', index=['col_1'], columns=['col_4'], aggfunc={'col_3' : 'sum'})
The column types are as follows:
df.dtypes
col_1 object
col_2 object
col_3 int64
col_4 object
dtype: object
Error:
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Can anyone help me? Thanks!
Answer 0 (score: 2)
In [54]: df.pivot_table(values='col_3', index='col_1', columns='col_4', aggfunc='sum', fill_value=0)
Out[54]:
col_4 Black Blue Green White
col_1
Harry 2204225 0 8687270 147
John 222085 84 600253 4
Tom 159459 28 3156210 29613
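For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the same call. The `rows` list and the way the frame is assembled are my own scaffolding, not part of the original post.

```python
import pandas as pd

# Sample rows copied from the question as (col_1, col_3, col_4).
# The repeated Harry/Green row is intentional: it is in the original data.
rows = [
    ("John", 4, "White"),
    ("Tom", 29613, "White"),
    ("Harry", 147, "White"),
    ("John", 84, "Blue"),
    ("Tom", 28, "Blue"),
    ("John", 222085, "Black"),
    ("Tom", 159459, "Black"),
    ("Harry", 2204225, "Black"),
    ("John", 600253, "Green"),
    ("Tom", 3156210, "Green"),
    ("Harry", 4343635, "Green"),
    ("Harry", 4343635, "Green"),
]
df = pd.DataFrame(rows, columns=["col_1", "col_3", "col_4"])
df.insert(1, "col_2", "Method")  # constant column, as in the question

# Sum col_3 per (col_1, col_4) pair; combinations with no rows are filled with 0.
result = df.pivot_table(
    values="col_3", index="col_1", columns="col_4", aggfunc="sum", fill_value=0
)
print(result)
```

The two Harry/Green rows are summed into a single cell, and the Harry/Blue cell, which has no source rows, is filled with 0.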
If you want to replace NaN with an empty string instead:
In [55]: df.pivot_table(values='col_3', index='col_1', columns='col_4', aggfunc='sum', fill_value='')
Out[55]:
col_4    Black Blue    Green  White
col_1
Harry  2204225       8687270    147
John    222085   84   600253      4
Tom     159459   28  3156210  29613
However, any column that contains such empty strings ends up with string (object) dtype:
In [56]: df.pivot_table(values='col_3', index='col_1', columns='col_4', aggfunc='sum', fill_value='').dtypes
Out[56]:
col_4
Black int64
Blue object
Green int64
White int64
dtype: object
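If the object dtype is a problem, one possible alternative (not from the original answer) is to leave the gap as a missing value and cast to pandas' nullable integer dtype `"Int64"`. This is only a sketch and assumes a pandas version that supports nullable integers; the data below is a trimmed-down subset of the question's rows.

```python
import pandas as pd

# Trimmed-down subset of the question's data: Harry has no "Blue" row.
df = pd.DataFrame({
    "col_1": ["John", "Tom", "Harry", "John", "Tom"],
    "col_3": [4, 29613, 147, 84, 28],
    "col_4": ["White", "White", "White", "Blue", "Blue"],
})

# Without fill_value the missing (Harry, Blue) cell is NaN,
# so the "Blue" column cannot stay a plain int64 column.
pivoted = df.pivot_table(values="col_3", index="col_1", columns="col_4", aggfunc="sum")

# Cast to the nullable integer dtype: values stay integers, the gap becomes <NA>.
nullable = pivoted.astype("Int64")
print(nullable.dtypes)
```

This keeps every column numeric, so later arithmetic still works, at the cost of showing a missing-value marker rather than a blank cell.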
Answer 1 (score: 2)
You are already passing a dict to aggfunc, so you don't need to name that column in values as well:
pd.pivot_table(df,index=['col_1'], columns=['col_4'], aggfunc={'col_3' : 'sum'})
Out[564]:
col_3
col_4 Black Blue Green White
col_1
Harry 2204225.0 NaN 8687270.0 147.0
John 222085.0 84.0 600253.0 4.0
Tom 159459.0 28.0 3156210.0 29613.0
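Note the extra `col_3` level at the top of the columns in that output. A small sketch of how one might get back to flat column names, assuming the dict form of `aggfunc` is kept; the data is a reduced subset of the question's rows and the variable names are mine:

```python
import pandas as pd

# Reduced subset of the question's data, including the repeated Harry/Green row.
df = pd.DataFrame({
    "col_1": ["Harry", "Harry", "Harry", "John", "John"],
    "col_2": ["Method"] * 5,
    "col_3": [2204225, 4343635, 4343635, 84, 600253],
    "col_4": ["Black", "Green", "Green", "Blue", "Green"],
})

# The dict aggfunc picks the column to aggregate, so values is omitted.
result = pd.pivot_table(df, index=["col_1"], columns=["col_4"], aggfunc={"col_3": "sum"})

# The columns are a two-level MultiIndex: ("col_3", <col_4 value>).
print(result.columns.nlevels)

# Selecting the top-level key drops it and leaves one column per col_4 value.
flat = result["col_3"]
print(flat)
```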
Answer 2 (score: 1)
Before pivoting, you need to handle the missing values. df.fillna(0, inplace=True) will do that.
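One thing worth checking before relying on this: with the sample data shown in the question, the long-format frame has no missing values at all, and the NaN only appears after pivoting because some (col_1, col_4) combinations have no rows. A quick sketch on a subset of that data (variable names are mine):

```python
import pandas as pd

# Subset of the question's data: there is no (Harry, Blue) row.
df = pd.DataFrame({
    "col_1": ["John", "Tom", "Harry", "John", "Tom"],
    "col_3": [4, 29613, 147, 84, 28],
    "col_4": ["White", "White", "White", "Blue", "Blue"],
})

# Count missing values in the source frame, then fill any that exist.
print(df.isna().sum().sum())
filled = df.fillna(0)

# Pivoting still produces a gap for the combination that has no rows.
pivoted = filled.pivot_table(values="col_3", index="col_1", columns="col_4", aggfunc="sum")
print(pivoted)

# Filling after the pivot (or passing fill_value=0 to pivot_table) closes the gap.
pivoted_filled = pivoted.fillna(0)
print(pivoted_filled)
```

So for data shaped like the question's, filling before the pivot is harmless but may not be enough on its own; it matters when the source column itself contains NaN.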