I want to create two new columns (one per year), each holding the "No" / "Yes" ratio for every country in that year. Can anyone help?
    Country Jobs  2017  2018
0     Spain  Yes  3885  5331
1     Spain   No   234   593
2  Portugal  Yes  1231  2424
3  Portugal   No   241   124
Expected output:
    Country Jobs  2017  2018  Ratio2017  Ratio2018
0     Spain  Yes  3885  5331       0.06       0.11
1     Spain   No   234   593
2  Portugal  Yes  1231  2424       0.19       0.05
3  Portugal   No   241   124
Answer 0 (score: 3)
Here is one way to compute the desired ratios as a separate table:
df_rearranged = df.set_index(['Country', 'Jobs']).unstack(level=0)
(df_rearranged.loc['No'] / df_rearranged.loc['Yes']).unstack().T
#               2017      2018
# Country
# Portugal  0.195776  0.051155
# Spain     0.060232  0.111236
Adding it to the original table then takes only a small concat or join.
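The join step could look roughly like this. It is only a minimal sketch: it assumes the year columns are labelled with the strings '2017' and '2018', and the names `ratios` and `result` as well as the `Ratio` prefix are illustrative choices, not part of the answer above.

```python
import pandas as pd

# Sample data from the question (year labels assumed to be strings)
df = pd.DataFrame({
    'Country': ['Spain', 'Spain', 'Portugal', 'Portugal'],
    'Jobs': ['Yes', 'No', 'Yes', 'No'],
    '2017': [3885, 234, 1231, 241],
    '2018': [5331, 593, 2424, 124],
})

# Ratio table from the answer: one row per country, one column per year
df_rearranged = df.set_index(['Country', 'Jobs']).unstack(level=0)
ratios = (df_rearranged.loc['No'] / df_rearranged.loc['Yes']).unstack().T

# Rename the year columns and attach them to every row of the same country
result = df.join(ratios.add_prefix('Ratio'), on='Country')
print(result)
```

Joining with `on='Country'` matches the `Country` column of `df` against the index of the ratio table, so both rows of a country receive the same ratio.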
Answer 1 (score: 1)
Use:
#create MultiIndex
df1 = df.set_index(['Country','Jobs'])
#aggregate to unique Country/Jobs rows if necessary
#(sum(level=...) was removed in pandas 2.0, so group by the index levels instead)
#df1 = df1.groupby(level=[0, 1]).sum()
print (df1)
               2017  2018
Country  Jobs
Spain    Yes   3885  5331
         No     234   593
Portugal Yes   1231  2424
         No     241   124
#select values by the second level and divide
df2 = df1.xs('No', level=1).div(df1.xs('Yes', level=1)).add_prefix('ratio')
print (df2)
          ratio2017  ratio2018
Country
Spain      0.060232   0.111236
Portugal   0.195776   0.051155
#add to original DataFrame
df = df.join(df2, on='Country')
print (df)
    Country Jobs  2017  2018  ratio2017  ratio2018
0     Spain  Yes  3885  5331   0.060232   0.111236
1     Spain   No   234   593   0.060232   0.111236
2  Portugal  Yes  1231  2424   0.195776   0.051155
3  Portugal   No   241   124   0.195776   0.051155
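The expected output in the question shows the ratio only on the 'Yes' rows, whereas the join above repeats it on every row of a country. If that layout matters, one option is to blank out the other rows afterwards. This is a sketch under the assumption that NaN is an acceptable stand-in for the empty cells; the names `out` and `ratio_cols` are made up for illustration.

```python
import pandas as pd

# Sample data from the question (year labels assumed to be strings)
df = pd.DataFrame({
    'Country': ['Spain', 'Spain', 'Portugal', 'Portugal'],
    'Jobs': ['Yes', 'No', 'Yes', 'No'],
    '2017': [3885, 234, 1231, 241],
    '2018': [5331, 593, 2424, 124],
})

# Same steps as in the answer
df1 = df.set_index(['Country', 'Jobs'])
df2 = df1.xs('No', level=1).div(df1.xs('Yes', level=1)).add_prefix('ratio')
out = df.join(df2, on='Country')

# Keep the ratio on 'Yes' rows only; every other row becomes NaN
ratio_cols = list(df2.columns)
out.loc[out['Jobs'].ne('Yes'), ratio_cols] = float('nan')
print(out.round(2))
```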
Answer 2 (score: 0)
Here is a complete solution using pivot_table():
df1 = df.pivot_table(index='Country', columns='Jobs', values=['2017', '2018'])
Ratio_2017 = (df1['2017']['No'] / df1['2017']['Yes']).to_dict()
Ratio_2018 = (df1['2018']['No'] / df1['2018']['Yes']).to_dict()
df['Ratio_2017'] = df['Country'].map(Ratio_2017)
df['Ratio_2018'] = df['Country'].map(Ratio_2018)
print(df)
Output:
    Country Jobs  2017  2018  Ratio_2017  Ratio_2018
0     Spain  Yes  3885  5331    0.060232    0.111236
1     Spain   No   234   593    0.060232    0.111236
2  Portugal  Yes  1231  2424    0.195776    0.051155
3  Portugal   No   241   124    0.195776    0.051155
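One thing to watch: pivot_table() aggregates with the mean by default. With one row per Country/Jobs pair, as in the question, that makes no difference, but if a pair can appear several times you probably want totals instead. The sketch below uses a hypothetical variant of the sample data in which the Spain/'No' row is split in two, and passes aggfunc='sum'; the name `totals` and the loop are illustrative only.

```python
import pandas as pd

# Hypothetical variant of the sample data: the Spain/'No' counts are split
# across two rows (134 + 100 = 234 and 293 + 300 = 593)
df = pd.DataFrame({
    'Country': ['Spain', 'Spain', 'Spain', 'Portugal', 'Portugal'],
    'Jobs': ['Yes', 'No', 'No', 'Yes', 'No'],
    '2017': [3885, 134, 100, 1231, 241],
    '2018': [5331, 293, 300, 2424, 124],
})

# Sum duplicate Country/Jobs rows instead of averaging them
totals = df.pivot_table(index='Country', columns='Jobs',
                        values=['2017', '2018'], aggfunc='sum')

for year in ['2017', '2018']:
    # Series indexed by Country; map() looks each country up in that index
    ratio = totals[year]['No'] / totals[year]['Yes']
    df['Ratio_' + year] = df['Country'].map(ratio)

print(df)
```

With the default mean, the two Spain/'No' rows would be averaged rather than added, and the Spain ratios would come out at about half of the values shown in the output above.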