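I know this should be easy with groupby and/or pivot_table and/or stack, but I can't seem to get it off the ground. My notes don't show how to do this. I thought I was getting close with pivot_table in the pandas docs, but I couldn't get it to work for even one level, let alone two, because I'm not trying to aggregate anything, and all my notes do aggregation.
Any suggestions gratefully received.
Code to create the first dataframe: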
import pandas as pd

df2 = pd.DataFrame({'CPC_qtr_root': {13: 0.13493790567404607,
14: 0.14353736611331172,
15: 0.10359919568913414,
16: 0.077153346715340618,
17: 0.066759430932458397,
39: 0.12067193385680651,
40: 0.049033000970486448,
41: 0.047640864406214359,
42: 0.040086869604689483,
43: 0.038795815932666726,
100: 0.11017683494905577,
101: 0.15510499735697988,
102: 0.16478351543691827,
103: 0.091894700285988867,
104: 0.0359603120618152},
'Country': {13: u'Afghanistan',
14: u'Afghanistan',
15: u'Afghanistan',
16: u'Afghanistan',
17: u'Afghanistan',
39: u'Albania',
40: u'Albania',
41: u'Albania',
42: u'Albania',
43: u'Albania',
100: u'Angola',
101: u'Angola',
102: u'Angola',
103: u'Angola',
104: u'Angola'},
'IncomeLevel': {13: 'Lower Income',
14: 'Lower Income',
15: 'Lower Income',
16: 'Lower Income',
17: 'Lower Income',
39: 'Upper Middle Income',
40: 'Upper Middle Income',
41: 'Upper Middle Income',
42: 'Upper Middle Income',
43: 'Upper Middle Income',
100: 'Lower Middle Income',
101: 'Lower Middle Income',
102: 'Lower Middle Income',
103: 'Lower Middle Income',
104: 'Lower Middle Income'},
'Rate': {13: 27.0,
14: 37.0,
15: 35.0,
16: 39.0,
17: 48.0,
39: 95.0,
40: 95.0,
41: 96.0,
42: 93.0,
43: 96.0,
100: 36.0,
101: 65.0,
102: 66.0,
103: 52.0,
104: 52.0},
'Year': {13: 2000,
14: 2001,
15: 2002,
16: 2003,
17: 2004,
39: 2000,
40: 2001,
41: 2002,
42: 2003,
43: 2004,
100: 2000,
101: 2001,
102: 2002,
103: 2003,
104: 2004}})
Answer 0 (score: 4)
df3 = df2.set_index(['Year','Country']).stack().unstack(1)
print (df3)
Country             Afghanistan              Albania               Angola
Year
2000 CPC_qtr_root      0.134938             0.120672             0.110177
     IncomeLevel   Lower Income  Upper Middle Income  Lower Middle Income
     Rate                    27                   95                   36
2001 CPC_qtr_root      0.143537             0.049033             0.155105
     IncomeLevel   Lower Income  Upper Middle Income  Lower Middle Income
     Rate                    37                   95                   65
2002 CPC_qtr_root      0.103599            0.0476409             0.164784
     IncomeLevel   Lower Income  Upper Middle Income  Lower Middle Income
     Rate                    35                   96                   66
2003 CPC_qtr_root     0.0771533            0.0400869            0.0918947
     IncomeLevel   Lower Income  Upper Middle Income  Lower Middle Income
     Rate                    39                   93                   52
2004 CPC_qtr_root     0.0667594            0.0387958            0.0359603
     IncomeLevel   Lower Income  Upper Middle Income  Lower Middle Income
     Rate                    48                   96                   52
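To see what each step of that one-liner does, here is a minimal sketch on a made-up four-row frame. The name `small` and its values are hypothetical stand-ins for `df2`, chosen only to keep the output short.

```python
import pandas as pd

# Hypothetical stand-in for df2: two countries, two years, made-up values.
small = pd.DataFrame({
    "Country": ["A", "A", "B", "B"],
    "Year": [2000, 2001, 2000, 2001],
    "IncomeLevel": ["Low", "Low", "High", "High"],
    "Rate": [27.0, 37.0, 95.0, 95.0],
})

# Step 1: move the two identifier columns into a MultiIndex.
indexed = small.set_index(["Year", "Country"])

# Step 2: stack() folds the remaining columns into a third index level,
# giving one long Series indexed by (Year, Country, column name).
stacked = indexed.stack()

# Step 3: unstack(1) moves index level 1 (Country) out into the columns.
wide = stacked.unstack(1)

print(wide)
```

No aggregation function is involved at any point, which is why this route fits the case where each (Year, Country) pair appears exactly once.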
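Note that each column of the result holds mixed types:
# On pandas 2.1 and later, DataFrame.map replaces the deprecated applymap.
print(df3.head().applymap(type))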
Country                Afghanistan          Albania           Angola
Year
2000 CPC_qtr_root  <class 'float'>  <class 'float'>  <class 'float'>
     IncomeLevel     <class 'str'>    <class 'str'>    <class 'str'>
     Rate          <class 'float'>  <class 'float'>  <class 'float'>
2001 CPC_qtr_root  <class 'float'>  <class 'float'>  <class 'float'>
     IncomeLevel     <class 'str'>    <class 'str'>    <class 'str'>
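Mixed types mean every column is stored as `object`, so numeric operations on the reshaped frame will not behave as they would on floats. One possible way to get numbers back is to select a single variable and cast it, as in this sketch. The frame `wide` and the index level name `variable` are assumptions made up for the example; in the output above the second index level is unnamed, so you would pass `level=1` instead.

```python
import pandas as pd

# Hypothetical reshaped frame with the same layout as df3 (made-up values).
wide = pd.DataFrame(
    {
        "A": [0.13, "Low", 27.0, 0.14, "Low", 37.0],
        "B": [0.12, "High", 95.0, 0.05, "High", 95.0],
    },
    index=pd.MultiIndex.from_product(
        [[2000, 2001], ["CPC_qtr_root", "IncomeLevel", "Rate"]],
        names=["Year", "variable"],
    ),
)

# Every column is object dtype because it mixes floats and strings.
print(wide.dtypes)

# Take the cross-section for one variable across all years,
# then cast it back to a numeric dtype.
rate = wide.xs("Rate", level="variable").astype(float)
print(rate)
```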
Answer 1 (score: 0)
You can first melt your dataframe from wide to long, using Year and Country as IDs and IncomeLevel, CPC_qtr_root and Rate as values:
df3 = pd.melt(df2, id_vars=['Year', 'Country'], value_vars=['IncomeLevel', 'CPC_qtr_root', 'Rate'])
Then you can pivot the table:
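# Each (Year, variable, Country) cell holds exactly one value, so 'first'
# simply passes it through; it also avoids summing strings with np.sum.
pd.pivot_table(df3, index=['Year', 'variable'],
               columns='Country',
               values='value',
               aggfunc='first',
               fill_value=0)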
This returns:
Country             Afghanistan              Albania               Angola
Year variable
2000 CPC_qtr_root      0.134938             0.120672             0.110177
     IncomeLevel   Lower Income  Upper Middle Income  Lower Middle Income
     Rate                    27                   95                   36
2001 CPC_qtr_root      0.143537             0.049033             0.155105
     IncomeLevel   Lower Income  Upper Middle Income  Lower Middle Income
     Rate                    37                   95                   65
2002 CPC_qtr_root      0.103599            0.0476409             0.164784
     IncomeLevel   Lower Income  Upper Middle Income  Lower Middle Income
     Rate                    35                   96                   66
2003 CPC_qtr_root     0.0771533            0.0400869            0.0918947
     IncomeLevel   Lower Income  Upper Middle Income  Lower Middle Income
     Rate                    39                   93                   52
2004 CPC_qtr_root     0.0667594            0.0387958            0.0359603
     IncomeLevel   Lower Income  Upper Middle Income  Lower Middle Income
     Rate                    48                   96                   52
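Since nothing is actually being aggregated, the melt-then-pivot route can also be sketched with `DataFrame.pivot`, which takes no aggregation function at all. This is an alternative to the answer's `pivot_table` call, not a restatement of it. The names `small` and `long_df` and the values are hypothetical, and the sketch assumes a pandas version recent enough to accept a list for `index` in `pivot`.

```python
import pandas as pd

# Hypothetical stand-in for df2: two countries, two years, made-up values.
small = pd.DataFrame({
    "Country": ["A", "A", "B", "B"],
    "Year": [2000, 2001, 2000, 2001],
    "IncomeLevel": ["Low", "Low", "High", "High"],
    "Rate": [27.0, 37.0, 95.0, 95.0],
})

# Wide to long: one row per (Year, Country, variable) with its value.
long_df = small.melt(
    id_vars=["Year", "Country"],
    value_vars=["IncomeLevel", "Rate"],
)

# Long to wide again, with countries as columns. pivot() only reshapes;
# it raises a ValueError if any (Year, variable, Country) combination
# occurs more than once.
wide = long_df.pivot(
    index=["Year", "variable"],
    columns="Country",
    values="value",
)

print(wide)
```

If duplicates are possible in the real data, `pivot_table` with an explicit aggregation is the safer choice, because `pivot` will refuse to reshape.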