How to combine two or more tables into one in R using SQL

Date: 2013-08-17 11:00:18

Tags: sql r join

I have written the following code, and I want to combine the resulting tables into one larger table. How can I do this with SQL in R?
user_lessthan10per  <- sqldf("select count(uid) as count_of_students
                       from adopted_user_point
                        where points_scored between 0 and (1469*0.1)")

followed by

user_lessthan20per  <- sqldf("select count(uid) as count_of_students
                         from adopted_user_point
                         where points_scored >(1469*0.1) and points_scored <= (1469*0.2)")

user_lessthan30per  <- sqldf("select count(uid) as count_of_students
                         from adopted_user_point
                         where points_scored >(1469*0.2) and points_scored <= (1469*0.3)")

Now I want to combine these into a single table that contains the count_of_students column from all three tables.

I tried using the UNION command in R, but it gives an error. How should I do this?

2 Answers:

Answer 0: (score: 2)

You can use conditional aggregation. This returns one row with three columns:

select sum(case when points_scored between 0 and (1469*0.1) then 1 else 0
           end) as cnt1,
       sum(case when points_scored >(1469*0.1) and points_scored <= (1469*0.2) then 1 else 0 
           end) as cnt2,
       sum(case when points_scored >(1469*0.2) and points_scored <= (1469*0.3) then 1 else 0
           end) as cnt3
from adopted_user_point;
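
As a minimal sketch of running this from R, the query can be passed to sqldf as a string. The data frame below is a made-up stand-in for adopted_user_point (the real data is not shown in the question), and the name counts_wide is only illustrative; the sqldf package is assumed to be installed.

```r
library(sqldf)  # assumption: the sqldf package is installed

# Hypothetical sample data standing in for the asker's table.
adopted_user_point <- data.frame(
  uid = 1:8,
  points_scored = c(0, 50, 100, 200, 250, 300, 400, 1000)
)

# One pass over the table; each sum() counts the rows that fall in one band.
counts_wide <- sqldf("
  select sum(case when points_scored between 0 and (1469*0.1)
                  then 1 else 0 end) as cnt1,
         sum(case when points_scored > (1469*0.1) and points_scored <= (1469*0.2)
                  then 1 else 0 end) as cnt2,
         sum(case when points_scored > (1469*0.2) and points_scored <= (1469*0.3)
                  then 1 else 0 end) as cnt3
  from adopted_user_point
")

print(counts_wide)
```

With this sample data the result is a one-row data frame with columns cnt1, cnt2 and cnt3.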

If you want three rows instead, you can aggregate with group by:

select (case when points_scored between 0 and (1469*0.1) then 'Group1'
             when points_scored >(1469*0.1) and points_scored <= (1469*0.2) then 'Group2'
             when points_scored >(1469*0.2) and points_scored <= (1469*0.3) then 'Group3'
             else 'Other'
        end) as grp, count(*) as count_of_students
from adopted_user_point
group by (case when points_scored between 0 and (1469*0.1) then 'Group1'
               when points_scored >(1469*0.1) and points_scored <= (1469*0.2) then 'Group2'
               when points_scored >(1469*0.2) and points_scored <= (1469*0.3) then 'Group3'
               else 'Other'
          end);
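
For comparison, the same banding can be sketched in base R without SQL, using cut() and table(). This is a swapped-in alternative rather than part of the answer above; the sample vector and the names points_scored, bucket and bucket_counts are illustrative assumptions.

```r
# Hypothetical sample scores; the real adopted_user_point data is not shown.
points_scored <- c(0, 50, 100, 200, 250, 300, 400, 1000)

# Band edges at 0%, 10%, 20% and 30% of 1469, as in the SQL above.
breaks <- c(0, 0.1, 0.2, 0.3) * 1469

# cut() builds right-closed intervals (a, b]; include.lowest = TRUE also
# closes the first band on the left, matching "between 0 and 1469*0.1".
# Scores outside every band become NA (the 'Other' group in the SQL).
bucket <- cut(points_scored, breaks = breaks,
              labels = c("Group1", "Group2", "Group3"),
              include.lowest = TRUE)

# table() drops NA by default, so only the three named bands are counted.
bucket_counts <- as.data.frame(table(bucket),
                               responseName = "count_of_students")
print(bucket_counts)
```

Unlike the SQL version, this sketch silently drops the out-of-range rows instead of reporting them as 'Other'; table(bucket, useNA = "ifany") would keep them.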

Answer 1: (score: 0)

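I would name the original selections differently, perhaps 'u_0_10', 'u_10_20', 'u_20_30', to make it clear that "user_lessthan30per" is really "user_btwn20_30". But since they are now R data frames in the global environment, you don't need sqldf to combine them: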

user_under30per <- rbind(user_lessthan10per,
                         user_lessthan20per,
                         user_lessthan30per)
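
One caveat with stacking the rows this way is that nothing in the result says which row came from which band. A minimal sketch of adding a label column follows; the three one-row data frames hold made-up counts in place of the real query results, and the bucket column and its labels are illustrative assumptions.

```r
# Hypothetical one-row results, shaped like the output of the three queries.
user_lessthan10per <- data.frame(count_of_students = 3L)
user_lessthan20per <- data.frame(count_of_students = 2L)
user_lessthan30per <- data.frame(count_of_students = 2L)

# Stack the rows, then label them in the same order they were bound.
user_under30per <- rbind(user_lessthan10per,
                         user_lessthan20per,
                         user_lessthan30per)
user_under30per$bucket <- c("0-10%", "10-20%", "20-30%")

print(user_under30per)
```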

The sqldf function does support UNION:

one_and_two <- sqldf("select * from user_lessthan10per union all
                      select * from user_lessthan20per")
all_three <- sqldf("select * from one_and_two union all
                    select * from user_lessthan30per")
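
The intermediate data frames are not strictly needed: the three counting queries from the question can be chained with union all in a single sqldf call. The sketch below assumes the sqldf package is installed and uses made-up sample data; the bucket column and its labels are illustrative additions, not part of the original queries.

```r
library(sqldf)  # assumption: the sqldf package is installed

# Hypothetical sample data standing in for the asker's table.
adopted_user_point <- data.frame(
  uid = 1:8,
  points_scored = c(0, 50, 100, 200, 250, 300, 400, 1000)
)

# Each select yields one row; union all stacks them into one result.
# The column names are taken from the first select.
all_three <- sqldf("
  select '0-10%' as bucket, count(uid) as count_of_students
    from adopted_user_point
   where points_scored between 0 and (1469*0.1)
  union all
  select '10-20%', count(uid)
    from adopted_user_point
   where points_scored > (1469*0.1) and points_scored <= (1469*0.2)
  union all
  select '20-30%', count(uid)
    from adopted_user_point
   where points_scored > (1469*0.2) and points_scored <= (1469*0.3)
")

print(all_three)
```

The label column matters here because SQL does not promise a row order without an order by clause, so the rows should be identified by bucket rather than by position.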