Operations on a pandas DataFrame

Time: 2018-10-08 06:45:34

Tags: python pandas

I want to create two new columns (one per year) that hold, for each country and each year, the ratio of "No" to "Yes". Can anyone help?

    Country  Jobs   2017    2018    
0   Spain    Yes    3885    5331
1   Spain    No     234     593 
2   Portugal Yes    1231    2424
3   Portugal No     241     124

Expected output:

    Country  Jobs   2017    2018  Ratio2017 Ratio2018
0   Spain    Yes    3885    5331  0.06      0.11
1   Spain    No     234     593 
2   Portugal Yes    1231    2424  0.19      0.05
3   Portugal No     241     124

3 answers:

Answer 0 (score: 3)

Here is one way to compute the desired ratios as a separate table:

df_rearranged = df.set_index(['Country', 'Jobs']).unstack(level=0)
(df_rearranged.loc['No'] / df_rearranged.loc['Yes']).unstack().T
#              2017      2018
#Country                     
#Portugal  0.195776  0.051155
#Spain     0.060232  0.111236

It takes a little concat/join work to add the result to the original table.
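The join step mentioned above could look roughly like the following. This is only a sketch: it assumes the year columns are strings ('2017', '2018'), rebuilds the sample data from the question, and uses `Ratio` as an arbitrary column prefix that does not come from the answer itself.

```python
import pandas as pd

# Sample data copied from the question; year column names are assumed to be strings.
df = pd.DataFrame({
    'Country': ['Spain', 'Spain', 'Portugal', 'Portugal'],
    'Jobs': ['Yes', 'No', 'Yes', 'No'],
    '2017': [3885, 234, 1231, 241],
    '2018': [5331, 593, 2424, 124],
})

# Ratio table as computed in the answer: one row per country, one column per year.
df_rearranged = df.set_index(['Country', 'Jobs']).unstack(level=0)
ratios = (df_rearranged.loc['No'] / df_rearranged.loc['Yes']).unstack().T

# Prefix the year columns (the prefix is an illustrative choice) and join on
# the Country column so that every row picks up its country's ratios.
result = df.join(ratios.add_prefix('Ratio'), on='Country')
print(result)
```

With `on='Country'`, `join` matches the `Country` column of the original table against the index of the ratio table, so both the "Yes" and the "No" row of a country receive the same ratio values.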

Answer 1 (score: 1)

Use:

#create MultiIndex
df1 = df.set_index(['Country','Jobs'])
#aggregate to unique Country/Jobs rows if necessary
#df1 = df1.groupby(level=[0, 1]).sum()
print (df1)
               2017  2018
Country  Jobs            
Spain    Yes   3885  5331
         No     234   593
Portugal Yes   1231  2424
         No     241   124

#select values by second level and divide
df2 = df1.xs('No', level=1).div(df1.xs('Yes', level=1)).add_prefix('ratio')
print (df2)
          ratio2017  ratio2018
Country                       
Spain      0.060232   0.111236
Portugal   0.195776   0.051155

#add to original DataFrame
df = df.join(df2, on='Country')
print (df)
    Country Jobs  2017  2018  ratio2017  ratio2018
0     Spain  Yes  3885  5331   0.060232   0.111236
1     Spain   No   234   593   0.060232   0.111236
2  Portugal  Yes  1231  2424   0.195776   0.051155
3  Portugal   No   241   124   0.195776   0.051155

Answer 2 (score: 0)

Here is a complete solution using pivot_table():

df1 = df.pivot_table(index='Country', columns='Jobs', values=['2017', '2018'])
Ratio_2017 = (df1['2017']['No'] / df1['2017']['Yes']).to_dict()
Ratio_2018 = (df1['2018']['No'] / df1['2018']['Yes']).to_dict()
df['Ratio_2017'] = df['Country'].map(Ratio_2017)
df['Ratio_2018'] = df['Country'].map(Ratio_2018)
print(df)

Output:

    Country Jobs  2017  2018  Ratio_2017  Ratio_2018
0     Spain  Yes  3885  5331    0.060232    0.111236
1     Spain   No   234   593    0.060232    0.111236
2  Portugal  Yes  1231  2424    0.195776    0.051155
3  Portugal   No   241   124    0.195776    0.051155
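The expected output in the question shows the ratios only on the "Yes" rows, while the result above repeats them on every row. If that layout matters, one possible follow-up is to blank the ratio columns on the other rows. The sketch below assumes string year columns and uses NaN as a stand-in for the empty cells; it is not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question; year column names are assumed to be strings.
df = pd.DataFrame({
    'Country': ['Spain', 'Spain', 'Portugal', 'Portugal'],
    'Jobs': ['Yes', 'No', 'Yes', 'No'],
    '2017': [3885, 234, 1231, 241],
    '2018': [5331, 593, 2424, 124],
})

# Per-country ratios, computed with pivot_table as in the answer above.
df1 = df.pivot_table(index='Country', columns='Jobs', values=['2017', '2018'])
for year in ['2017', '2018']:
    ratio = df1[year]['No'] / df1[year]['Yes']
    df['Ratio_' + year] = df['Country'].map(ratio)

# Keep the ratios only on the 'Yes' rows; all other rows get NaN.
ratio_cols = ['Ratio_2017', 'Ratio_2018']
df.loc[df['Jobs'] != 'Yes', ratio_cols] = float('nan')
print(df)
```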