I have a table of school application data that looks like this:
create table todel (user_id int, SchemesApplicable1 int, SchemesApplicable2 int,
SchemesApplicable3 int, SchemesApplicable4 int);
insert into todel values (1, 1, 0, 1, 0);
insert into todel values (2, 0, 0, 0, 0);
insert into todel values (3, 1, 0, 1, 0);
insert into todel values (4, 1, 0, 0, 0);
insert into todel values (5, 1, 0, 1, 1);
SELECT Count(User_Id) as No_Off_Application ,
sum(if(SchemesApplicable1 = 1, 1, 0)) as first,
sum(if(SchemesApplicable2 = 1, 1, 0)) as second,
sum(if(SchemesApplicable3 = 1, 1, 0)) as third,
sum(if(SchemesApplicable4 = 1, 1, 0)) as forth
FROM todel
The query above returns a report like this:
No_Off_Application first second third forth
5 4 0 3 1
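I want to add one more column counting the applicants who have applied for more than one scheme. The expected count is 3 (user IDs 1, 3 and 5). How do I write a query for this?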
Answer 0 (score: 1)
SELECT Count(User_Id) as No_Off_Application ,
sum(SchemesApplicable1) as first,
sum(SchemesApplicable2) as second,
sum(SchemesApplicable3) as third,
sum(SchemesApplicable4) as forth,
sum(SchemesApplicable1 + SchemesApplicable2 + SchemesApplicable3 + SchemesApplicable4 > 1) as users_with_multiple_applications
FROM todel
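As a quick way to check the aggregate logic without a MySQL server, here is a minimal sketch that runs an equivalent query against an in-memory SQLite database from Python. The question asks for applicants with more than one scheme, so the sketch compares the row total with `> 1`. The aliases `scheme1` to `scheme4` and `multi_scheme_users` are names chosen for this illustration, and `CASE WHEN` is used in place of MySQL's `if()` as a portability assumption.

```python
import sqlite3

# In-memory SQLite database, used only for illustration.
# The original question targets MySQL; this is a portability sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table todel (user_id int, SchemesApplicable1 int, SchemesApplicable2 int,
                        SchemesApplicable3 int, SchemesApplicable4 int);
    insert into todel values (1, 1, 0, 1, 0);
    insert into todel values (2, 0, 0, 0, 0);
    insert into todel values (3, 1, 0, 1, 0);
    insert into todel values (4, 1, 0, 0, 0);
    insert into todel values (5, 1, 0, 1, 1);
""")

# Per-scheme totals plus the number of users whose row total exceeds 1.
# Alias names here are hypothetical, picked for this example.
query = """
    SELECT count(user_id)           AS No_Off_Application,
           sum(SchemesApplicable1)  AS scheme1,
           sum(SchemesApplicable2)  AS scheme2,
           sum(SchemesApplicable3)  AS scheme3,
           sum(SchemesApplicable4)  AS scheme4,
           sum(CASE WHEN SchemesApplicable1 + SchemesApplicable2
                       + SchemesApplicable3 + SchemesApplicable4 > 1
                    THEN 1 ELSE 0 END) AS multi_scheme_users
    FROM todel
"""
row = conn.execute(query).fetchone()
print(row)  # (5, 4, 0, 3, 1, 3)
```

The last value, 3, corresponds to user IDs 1, 3 and 5.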
Answer 1 (score: 1)
Here is the setup in pandas:
import pandas as pd

df = pd.DataFrame([[1, 1, 0, 1, 0],
[2, 0, 0, 0, 0],
[3, 1, 0, 1, 0],
[4, 1, 0, 0, 0],
[5, 1, 0, 1, 1]],
columns=['user_id', 'Scheme1', 'Scheme2', 'Scheme3', 'Scheme4'])
print(df)
user_id Scheme1 Scheme2 Scheme3 Scheme4
0 1 1 0 1 0
1 2 0 0 0 0
2 3 1 0 1 0
3 4 1 0 0 0
4 5 1 0 1 1
To check the total number of schemes per user in pandas, you can use df.sum(axis=1) on the scheme columns:
print(df.iloc[:, 1:].sum(1))
0 2
1 0
2 2
3 1
4 3
dtype: int64
To get the user_ids, you can use boolean indexing:
user_id_ser = df.user_id[df.iloc[:, 1:].sum(1) > 1]
print(user_id_ser)
0 1
2 3
4 5
Name: user_id, dtype: int64
To add a flag/indicator column, create a mask with > 1 and convert it to integers with astype:
df['Schemes > 1'] = (df.iloc[:, 1:].sum(1) > 1).astype(int)
print(df)
user_id Scheme1 Scheme2 Scheme3 Scheme4 Schemes > 1
0 1 1 0 1 0 1
1 2 0 0 0 0 0
2 3 1 0 1 0 1
3 4 1 0 0 0 0
4 5 1 0 1 1 1
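Finally, to get the exact output, you can use df.where, which replaces every value that is not greater than 0 with NaN so that count() only counts the remaining entries: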
print(df.where(df > 0).count())
user_id 5
Scheme1 4
Scheme2 0
Scheme3 3
Scheme4 1
Schemes > 1 3
dtype: int64
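One thing to watch with the steps above: once the 'Schemes > 1' column has been added, df.iloc[:, 1:] also includes that flag column, so re-running the row sum would count it as a scheme. A minimal sketch that avoids this by selecting the scheme columns by name is shown here. The variable names (`scheme_cols`, `per_user`, `multi_ids`, `report`) and the key `more_than_one_scheme` are chosen for this illustration and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame([[1, 1, 0, 1, 0],
                   [2, 0, 0, 0, 0],
                   [3, 1, 0, 1, 0],
                   [4, 1, 0, 0, 0],
                   [5, 1, 0, 1, 1]],
                  columns=['user_id', 'Scheme1', 'Scheme2', 'Scheme3', 'Scheme4'])

# Name the scheme columns explicitly so columns added later are never summed.
scheme_cols = ['Scheme1', 'Scheme2', 'Scheme3', 'Scheme4']

# Number of schemes each user applied for.
per_user = df[scheme_cols].sum(axis=1)

# IDs of users with more than one scheme.
multi_ids = df.loc[per_user > 1, 'user_id'].tolist()

# One-row report: total applications, per-scheme totals, multi-scheme count.
report = {'No_Off_Application': len(df)}
report.update({col: int(df[col].sum()) for col in scheme_cols})
report['more_than_one_scheme'] = int((per_user > 1).sum())

print(multi_ids)  # [1, 3, 5]
print(report)
```

Because the columns are selected by name, adding the indicator column to df afterwards does not change `per_user` if the snippet is run again.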