Question

I have written the following code, and I want to combine these tables into one big table. How can I do this using SQL in R?

user_lessthan10per  <- sqldf("select count(uid) as count_of_students
                       from adopted_user_point
                        where points_scored between 0 and (1469*0.1)")

followed by

user_lessthan20per  <- sqldf("select count(uid) as count_of_students
                         from adopted_user_point
                         where points_scored >(1469*0.1) and points_scored <= (1469*0.2)")

and

user_lessthan30per  <- sqldf("select count(uid) as count_of_students
                         from adopted_user_point
                         where points_scored >(1469*0.2) and points_scored <= (1469*0.3)")

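Now I want to combine these into one table containing the count_of_students column from all three tables.

I tried using the UNION command in R, but it shows an error.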

Answer 1

You can use conditional aggregation. This returns one row with three columns:

select sum(case when points_scored between 0 and (1469*0.1) then 1 else 0
           end) as cnt1,
       sum(case when points_scored >(1469*0.1) and points_scored <= (1469*0.2) then 1 else 0 
           end) as cnt2,
       sum(case when points_scored >(1469*0.2) and points_scored <= (1469*0.3) then 1 else 0
           end) as cnt3
from adopted_user_point;
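To try this single-row query from R, it can be wrapped in `sqldf()` the same way as in the question. The following is a minimal sketch, assuming the sqldf package is installed; the toy `adopted_user_point` data frame and its values are made up for illustration and only stand in for the real table.

```r
library(sqldf)

# Hypothetical stand-in for the real table: one row per student.
adopted_user_point <- data.frame(
  uid = 1:6,
  points_scored = c(0, 100, 140, 200, 300, 1000)
)

# One pass over the table; each SUM counts the rows that fall in its band.
band_counts <- sqldf("
  select sum(case when points_scored between 0 and (1469*0.1)
                  then 1 else 0 end) as cnt1,
         sum(case when points_scored > (1469*0.1) and points_scored <= (1469*0.2)
                  then 1 else 0 end) as cnt2,
         sum(case when points_scored > (1469*0.2) and points_scored <= (1469*0.3)
                  then 1 else 0 end) as cnt3
  from adopted_user_point")

print(band_counts)
```

The result is an ordinary one-row data frame, so the three counts are available as `band_counts$cnt1`, `band_counts$cnt2`, and `band_counts$cnt3`.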

If you want three rows instead, you can aggregate with group by:

select (case when points_scored between 0 and (1469*0.1) then 'Group1'
             when points_scored >(1469*0.1) and points_scored <= (1469*0.2) then 'Group2'
             when points_scored >(1469*0.2) and points_scored <= (1469*0.3) then 'Group3'
             else 'Other'
        end) as grp, count(*) as count_of_students
from adopted_user_point
group by (case when points_scored between 0 and (1469*0.1) then 'Group1'
               when points_scored >(1469*0.1) and points_scored <= (1469*0.2) then 'Group2'
               when points_scored >(1469*0.2) and points_scored <= (1469*0.3) then 'Group3'
               else 'Other'
          end);
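The same banding can be cross-checked in base R without SQL. This is a different technique from the query above, shown only as a sketch: the toy `points_scored` values and the variable names are invented for illustration. `cut()` with `right = TRUE` and `include.lowest = TRUE` gives the same intervals as the query (closed on the right, with 0 included in the first band), and scores outside every band come back as `NA` where the query reports 'Other'.

```r
# Hypothetical scores standing in for adopted_user_point$points_scored.
points_scored <- c(0, 100, 140, 200, 300, 1000)

# Band edges at 0%, 10%, 20% and 30% of the 1469-point maximum.
max_points <- 1469
breaks <- max_points * c(0, 0.1, 0.2, 0.3)

# Assign each score to a band; scores above the last edge become NA.
band <- cut(points_scored,
            breaks = breaks,
            labels = c("Group1", "Group2", "Group3"),
            right = TRUE,
            include.lowest = TRUE)

# One row per band with its count, plus an NA row if any score fell outside.
band_counts <- as.data.frame(table(band, useNA = "ifany"),
                             responseName = "count_of_students")
print(band_counts)
```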

Answer 2

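I would name the original selects differently, perhaps 'u_0_10', 'u_10_20', 'u_20_30', to make it clear that "user_lessthan30per" is really "user_btwn20_30". In any case, they are now R data frames in the global environment, so you don't need sqldf to combine them: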

user_under30per <- rbind(user_lessthan10per,
                        user_lessthan20per,
                        user_lessthan30per)
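Since each of the three results is a one-row data frame with the same column name, `rbind()` stacks them into three unlabeled rows. Here is a small sketch of one way to keep the rows identifiable; the counts and the `band` column are hypothetical, made up for illustration.

```r
# Hypothetical one-row results, shaped like the sqldf() outputs in the question.
user_lessthan10per <- data.frame(count_of_students = 3L)
user_lessthan20per <- data.frame(count_of_students = 1L)
user_lessthan30per <- data.frame(count_of_students = 1L)

# Prepend a label column to each result, then stack the rows.
user_under30per <- rbind(
  cbind(band = "0-10%",  user_lessthan10per),
  cbind(band = "10-20%", user_lessthan20per),
  cbind(band = "20-30%", user_lessthan30per)
)
print(user_under30per)
```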

The sqldf function does support UNIONs:

 # Table names must match the data frames created above.
 one_and_two <- sqldf("select * from user_lessthan10per union all 
                                        select * from user_lessthan20per")
 all_three <- sqldf("select * from one_and_two union all 
                                        select * from user_lessthan30per")
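The two-step union can also be written as a single statement, because `union all` chains across any number of selects. Below is a minimal sketch, assuming sqldf is installed and the three one-row data frames exist under the names used in the question; they are recreated here with made-up counts, and the `band` labels are an illustrative addition.

```r
library(sqldf)

# Hypothetical one-row results standing in for the question's sqldf() outputs.
user_lessthan10per <- data.frame(count_of_students = 3L)
user_lessthan20per <- data.frame(count_of_students = 1L)
user_lessthan30per <- data.frame(count_of_students = 1L)

# Stack all three results in one query, labeling each row with its band.
all_three <- sqldf("
  select '0-10%'  as band, count_of_students from user_lessthan10per
  union all
  select '10-20%' as band, count_of_students from user_lessthan20per
  union all
  select '20-30%' as band, count_of_students from user_lessthan30per")

print(all_three)
```

`union all` keeps every row, whereas a plain `union` would also remove duplicate rows, which matters here because two bands could have the same count when no label column is selected.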
