How to apply value_counts across the entire index and create a new dataframe?

Time: 2019-01-06 16:12:59

Tags: python python-3.x pandas dataframe

I have the following dataframe:

import numpy as np
import pandas as pd

# np.nan marks the missing school types
df = pd.DataFrame({
    'name': {0: 'Silvers Park', 1: 'Adare Road', 2: 'Cargo Road'},
    'type_2': {0: 'Secondary', 1: 'Special', 2: 'Secondary'},
    'type_3': {0: 'Nursery', 1: np.nan, 2: np.nan},
    'type_4': {0: 'Primary', 1: np.nan, 2: np.nan},
    'type_5': {0: np.nan, 1: np.nan, 2: np.nan},
    'type_6': {0: np.nan, 1: np.nan, 2: np.nan}
})



                       name      type_2     type_3   type_4  type_5 type_6
0                 Silvers Park  Secondary   Nursery  Primary   NaN    NaN
1                  Adare Road     Special      NaN      NaN    NaN    NaN
2                  Cargo Road   Secondary      NaN      NaN    NaN    NaN

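Desired result:

I want to transform the df above so that it gives the count of each school type for every road. The unique values (the required variables) should become the columns of the new dataframe.

For example: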

                      name     Secondary     Special   Primary  Nursery 
0                 Silvers Park      1           0           1         1 
1                  Adare Road       0           1           0         0
2                  Cargo Road       1           0           0         0 

Thanks.

pandas 0.23.4

python 3.7.1

2 answers:

Answer 0: (score: 2)

First, melt your data, then pivot it with pivot_table:

    u = df.melt('name')
    u.pivot_table(index='name', columns='value', aggfunc='size', fill_value=0)

    value         Nursery  Primary  Secondary  Special
    name
    Adare Road          0        0          0        1
    Cargo Road          0        0          1        0
    Silvers Park        1        1          1        0
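The melt-then-pivot approach from this answer can be sketched as a self-contained script. The final reindex/rename_axis/reset_index step is an addition that is not part of the answer: it restores the original row order and turns name back into a regular column, as in the desired result, and it assumes the road names are unique.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'name': ['Silvers Park', 'Adare Road', 'Cargo Road'],
    'type_2': ['Secondary', 'Special', 'Secondary'],
    'type_3': ['Nursery', np.nan, np.nan],
    'type_4': ['Primary', np.nan, np.nan],
    'type_5': [np.nan, np.nan, np.nan],
    'type_6': [np.nan, np.nan, np.nan],
})

# Long format: one row per (name, type_* column) pair, missing values dropped
u = df.melt('name').dropna(subset=['value'])

# Count how often each school type occurs for each road
counts = u.pivot_table(index='name', columns='value', aggfunc='size', fill_value=0)

# Assumption: names are unique, so reindexing restores the original row order
result = (counts.reindex(df['name'])
                .rename_axis(index='name', columns=None)
                .reset_index())
print(result)
```

The columns come out in alphabetical order because pivot_table sorts them; select them explicitly afterwards if a particular column order matters.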

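Answer 1: (score: 1)

Use get_dummies after dropping the columns that contain only NaN, then add sum in case the same value can appear in more than one column: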

df = (pd.get_dummies(df.set_index('name')
                      .dropna(how='all', axis=1), prefix_sep='',prefix='')
        .sum(axis=1, level=0)
        .reset_index())
print (df)
           name  Secondary  Special  Nursery  Primary
0  Silvers Park          1        0        1        1
1    Adare Road          0        1        0        0
2    Cargo Road          1        0        0        0
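The snapshot above was written for pandas 0.23.4. The level argument of sum was later deprecated and is no longer available in pandas 2.0 and later, so on a recent release the same idea could be sketched as below. This is one possible adaptation rather than part of the original answer: it replaces sum(axis=1, level=0) with a transpose and a groupby on the column labels, and passes dtype=int so the indicator columns are integers rather than booleans.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'name': ['Silvers Park', 'Adare Road', 'Cargo Road'],
    'type_2': ['Secondary', 'Special', 'Secondary'],
    'type_3': ['Nursery', np.nan, np.nan],
    'type_4': ['Primary', np.nan, np.nan],
    'type_5': [np.nan, np.nan, np.nan],
    'type_6': [np.nan, np.nan, np.nan],
})

# One indicator column per distinct value; all-NaN columns are dropped first
dummies = pd.get_dummies(
    df.set_index('name').dropna(how='all', axis=1),
    prefix='', prefix_sep='', dtype=int,
)

# Sum indicator columns that share a label (stand-in for sum(axis=1, level=0));
# sort=False keeps the labels in order of first appearance
result = dummies.T.groupby(level=0, sort=False).sum().T.reset_index()
print(result)
```

With the sample data no label is repeated, so the groupby step changes nothing. It only matters when the same school type shows up in more than one type_* column.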