I have the following DataFrame:
import pandas as pd
from numpy import nan

df = pd.DataFrame({
    'name': {0: 'Silvers Park', 1: 'Adare Road', 2: 'Cargo Road'},
    'type_2': {0: 'Secondary', 1: 'Special', 2: 'Secondary'},
    'type_3': {0: 'Nursery', 1: nan, 2: nan},
    'type_4': {0: 'Primary', 1: nan, 2: nan},
    'type_5': {0: nan, 1: nan, 2: nan},
    'type_6': {0: nan, 1: nan, 2: nan}
})
           name     type_2   type_3   type_4  type_5  type_6
0  Silvers Park  Secondary  Nursery  Primary     NaN     NaN
1    Adare Road    Special      NaN      NaN     NaN     NaN
2    Cargo Road  Secondary      NaN      NaN     NaN     NaN
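Desired result:
I want to transform the df above so that it gives a count of each school type for every road. The unique values (the variables I need) should become the columns of the resulting DataFrame.
For example: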
           name  Secondary  Special  Primary  Nursery
0  Silvers Park          1        0        1        1
1    Adare Road          0        1        0        0
2    Cargo Road          1        0        0        0
Thanks.
pandas 0.23.4
python 3.7.1
Answer 0 (score: 2)
First, melt your data:
u = df.melt('name')
Then pivot it with pivot_table, counting the occurrences of each value per name:
u.pivot_table(index='name', columns='value', aggfunc='size', fill_value=0)
value         Nursery  Primary  Secondary  Special
name
Adare Road          0        0          0        1
Cargo Road          0        0          1        0
Silvers Park        1        1          1        0
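The melt-then-count idea in this answer can also be written as one self-contained script. The sketch below is not the answerer's exact code: it swaps pivot_table for pd.crosstab, which counts co-occurrences directly, and it assumes you want the original row order and a plain name column back, as in the desired result. The variable names u, counts and result are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'name': ['Silvers Park', 'Adare Road', 'Cargo Road'],
    'type_2': ['Secondary', 'Special', 'Secondary'],
    'type_3': ['Nursery', np.nan, np.nan],
    'type_4': ['Primary', np.nan, np.nan],
    'type_5': [np.nan, np.nan, np.nan],
    'type_6': [np.nan, np.nan, np.nan],
})

# Long format: one row per (name, type_n) pair, with empty cells dropped
u = df.melt('name').dropna(subset=['value'])

# Count how often each school type occurs for each name
counts = pd.crosstab(u['name'], u['value'])

# crosstab sorts the names, so restore the order they had in df
result = counts.reindex(df['name'].unique())
result.index.name = 'name'      # make sure reset_index() yields a 'name' column
result.columns.name = None      # drop the leftover 'value' label
result = result.reset_index()
print(result)
```

The school-type columns come out in alphabetical order; select them explicitly (for example result[['name', 'Secondary', 'Special', 'Primary', 'Nursery']]) if a specific order matters.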
Answer 1 (score: 1)
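Use get_dummies after dropping the columns that contain only NaN (if there are any), then add sum to combine the indicator columns that share the same name: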
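# Note: sum(level=...) works on pandas 0.23.4 (the version in the question);
# the level argument was removed in pandas 2.0.
df = (pd.get_dummies(df.set_index('name')
                       .dropna(how='all', axis=1), prefix_sep='', prefix='')
        .sum(axis=1, level=0)
        .reset_index())
print(df)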
           name  Secondary  Special  Nursery  Primary
0  Silvers Park          1        0        1        1
1    Adare Road          0        1        0        0
2    Cargo Road          1        0        0        0
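On recent pandas releases the level argument of sum is no longer available, so the get_dummies approach needs a small change. The following is a minimal sketch of one possible adaptation, not part of the original answer: it assumes that grouping the transposed indicator frame by column label is an acceptable stand-in for sum(axis=1, level=0), and it passes dtype=int so the counts are integers regardless of the default indicator dtype. The names dummies and result are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'name': ['Silvers Park', 'Adare Road', 'Cargo Road'],
    'type_2': ['Secondary', 'Special', 'Secondary'],
    'type_3': ['Nursery', np.nan, np.nan],
    'type_4': ['Primary', np.nan, np.nan],
    'type_5': [np.nan, np.nan, np.nan],
    'type_6': [np.nan, np.nan, np.nan],
})

# One indicator column per (type_n, value) pair; the empty prefix means the
# column label is just the value, so the same value can appear more than once
dummies = pd.get_dummies(
    df.set_index('name').dropna(how='all', axis=1),
    prefix='', prefix_sep='', dtype=int,
)

# Add up indicator columns that share a label: transpose, group the rows by
# label, sum, and transpose back
result = dummies.T.groupby(level=0).sum().T.reset_index()
print(result)
```

In the sample data every school type appears in only one type_n column, so the grouping step merely reorders the columns alphabetically; it starts to matter once the same type shows up in more than one column.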