I have written the following code, and I want to combine these tables into one large table. How can I do this with SQL in R?
user_lessthan10per <- sqldf("select count(uid) as count_of_students
from adopted_user_point
where points_scored between 0 and (1469*0.1)")
followed by
user_lessthan20per <- sqldf("select count(uid) as count_of_students
from adopted_user_point
where points_scored >(1469*0.1) and points_scored <= (1469*0.2)")
and
user_lessthan30per <- sqldf("select count(uid) as count_of_students
from adopted_user_point
where points_scored >(1469*0.2) and points_scored <= (1469*0.3)")
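Now I want to combine them into a single table that holds the count_of_students column from all three tables.
How do I use the UNION command for this in R? When I tried it, I got an error.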
Answer 0 (score: 2)
You can use conditional aggregation. This returns one row with three columns:
select sum(case when points_scored between 0 and (1469*0.1) then 1 else 0
end) as cnt1,
sum(case when points_scored >(1469*0.1) and points_scored <= (1469*0.2) then 1 else 0
end) as cnt2,
sum(case when points_scored >(1469*0.2) and points_scored <= (1469*0.3) then 1 else 0
end) as cnt3
from adopted_user_point;
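If you want three rows instead, you can aggregate with group by. Note that scores outside the three ranges end up in an extra 'Other' row: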
select (case when points_scored between 0 and (1469*0.1) then 'Group1'
when points_scored >(1469*0.1) and points_scored <= (1469*0.2) then 'Group2'
when points_scored >(1469*0.2) and points_scored <= (1469*0.3) then 'Group3'
else 'Other'
end) as grp, count(*) as count_of_students
from adopted_user_point
group by (case when points_scored between 0 and (1469*0.1) then 'Group1'
when points_scored >(1469*0.1) and points_scored <= (1469*0.2) then 'Group2'
when points_scored >(1469*0.2) and points_scored <= (1469*0.3) then 'Group3'
else 'Other'
end);
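To check what the two query shapes above return, here is a minimal sketch that runs them outside R. It relies on sqldf using SQLite as its default backend, so Python's built-in sqlite3 module should accept the same SQL. The sample rows are made up for illustration, and the label column is called grp here (a name chosen for this sketch); SQLite also lets group by refer to that alias instead of repeating the case expression.

```python
import sqlite3

# Hypothetical sample data (uid, points_scored); not from the original question.
rows = [(1, 0), (2, 100), (3, 146), (4, 150),
        (5, 290), (6, 300), (7, 440), (8, 1000)]

conn = sqlite3.connect(":memory:")
conn.execute("create table adopted_user_point (uid integer, points_scored real)")
conn.executemany("insert into adopted_user_point values (?, ?)", rows)

# Conditional aggregation: one row, one column per score band.
wide_row = conn.execute("""
    select sum(case when points_scored between 0 and (1469*0.1) then 1 else 0 end) as cnt1,
           sum(case when points_scored > (1469*0.1) and points_scored <= (1469*0.2) then 1 else 0 end) as cnt2,
           sum(case when points_scored > (1469*0.2) and points_scored <= (1469*0.3) then 1 else 0 end) as cnt3
    from adopted_user_point
""").fetchone()

# Group-by aggregation: one row per score band, plus 'Other' for everything else.
long_rows = conn.execute("""
    select (case when points_scored between 0 and (1469*0.1) then 'Group1'
                 when points_scored > (1469*0.1) and points_scored <= (1469*0.2) then 'Group2'
                 when points_scored > (1469*0.2) and points_scored <= (1469*0.3) then 'Group3'
                 else 'Other'
            end) as grp,
           count(*) as count_of_students
    from adopted_user_point
    group by grp
    order by grp
""").fetchall()

print(wide_row)
print(long_rows)
```

The wide form is convenient when the counts feed straight into a report; the long form is easier to extend with more bands, or to filter with a where clause if the 'Other' row is unwanted.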
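Answer 1 (score: 0)
I would name the original selections differently, perhaps 'u_0_10', 'u_10_20', 'u_20_30', to make it clear that "user_lessthan30per" is really "user_btwn20_30". But since they are now R data frames in the global environment, you don't need sqldf
to combine them: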
user_under30per <- rbind(user_lessthan10per,
user_lessthan20per,
user_lessthan30per)
The sqldf function does support UNION as well:
one_and_two <- sqldf("select * from user_lessthan10per union all
select * from user_lessthan20per")
all_three <- sqldf("select * from one_and_two union all
select * from user_lessthan30per")
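The union can also be written as a single query against the source table, which skips the intermediate data frames. The sketch below is one way to do that, again run through Python's sqlite3 module on the assumption that sqldf's default SQLite backend treats the SQL the same way. The sample rows and the band label column are hypothetical additions; the label is there because a plain union all of three counts gives three rows with nothing to tell them apart.

```python
import sqlite3

# Hypothetical sample data (uid, points_scored); not from the original question.
rows = [(1, 0), (2, 100), (3, 146), (4, 150),
        (5, 290), (6, 300), (7, 440), (8, 1000)]

conn = sqlite3.connect(":memory:")
conn.execute("create table adopted_user_point (uid integer, points_scored real)")
conn.executemany("insert into adopted_user_point values (?, ?)", rows)

# Stack the three counts with union all; 'band' labels each row.
combined = conn.execute("""
    select '0-10%' as band, count(uid) as count_of_students
    from adopted_user_point
    where points_scored between 0 and (1469*0.1)
    union all
    select '10-20%', count(uid)
    from adopted_user_point
    where points_scored > (1469*0.1) and points_scored <= (1469*0.2)
    union all
    select '20-30%', count(uid)
    from adopted_user_point
    where points_scored > (1469*0.2) and points_scored <= (1469*0.3)
""").fetchall()

for band, count_of_students in combined:
    print(band, count_of_students)
```

union all keeps duplicate rows, which matters here: two bands with the same count would collapse into one row under a plain union if the label column were left out.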