Self-join in a GROUP BY query

Time: 2018-09-02 09:17:54

Tags: sql group-by hsqldb self-join

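Given a set of courses, I need to determine the total number and type of rooms required for the students attending each course. Courses can run in parallel, nested inside one another, or overlapping.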

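Logic to implement: for each course's duration, find all other courses that are active during that duration, and the total NUMBER_OF_STUDENTS on those courses, grouped by room type.
There are further complications, but a simplified version of the problem is given below.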

I currently use HSQLDB; the solution should use standard SQL syntax so that it is portable across databases.

Bookings table

BOOKING_ID| COURSE_ID| NUMBER_OF_STUDENTS| ROOM_TYPE_ID
    10    |    2     |        1          |    1
    20    |    1     |        2          |    1
    30    |    3     |        1          |    3
    40    |    1     |        3          |    4
    50    |    5     |        1          |    2
    60    |    6     |        2          |    2
    70    |    3     |        2          |    1
    80    |    4     |        1          |    3

Courses table

COURSE_ID| START_DATE |  END_DATE
    1    | 2018-05-15 |  2018-06-14    //sample course
    2    | 2018-05-11 |  2018-05-20    //starts before ends between sample course
    3    | 2018-05-18 |  2018-05-22    //starts between ends between sample course
    4    | 2018-05-20 |  2018-06-20    //starts between ends after sample course
    5    | 2018-05-10 |  2018-06-20    //starts before ends after sample course
    6    | 2018-05-10 |  2018-05-14    //starts and ends before sample course
    7    | 2018-06-15 |  2018-06-20    //starts and ends after sample course

Rooms table (included for completeness; we don't really need this table here)

ROOM_TYPE_ID| ROOM_CAPACITY| ROOM_LOCATION
    1       |    1         |  HILL
    2       |    2         |  HILL
    3       |    1         |  OCEAN
    4       |    2         |  OCEAN

Output (shown only for course_id 1; required for all courses)

COURSE_ID | ROOMTYPE | COURSE_STUDENT | OTHER_STUDENTS 
    1     |   1      |        2       |      3           //1(course 2) + 2 (course 3)
    1     |   2      |        0       |      1           //1(course 5)
    1     |   3      |        0       |      2           //1(course 3) + 1(course 4)
    1     |   4      |        3       |      0           //no students on others

So far I have only been able to work out the conditions for matching courses that overlap a given course's startDate and endDate:

Courses.START_DATE <= startDate  AND Courses.END_DATE >= endDate    OR        //matches any course spanning current course
Courses.START_DATE >= startDate  AND Courses.START_DATE <= endDate  OR        //matches any course starting during the current course
Courses.END_DATE >= startDate    AND Courses.END_DATE <= endDate              //matches any course ending during the current course
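
The three cases can also be collapsed into a single test: two date ranges with inclusive end dates overlap exactly when each one starts on or before the day the other ends. A minimal Python sketch, checked against the sample courses (the `overlaps` helper and the `courses` dict are illustrative names, not part of the question):

```python
from datetime import date

def overlaps(start_a, end_a, start_b, end_b):
    """True when two inclusive date ranges share at least one day."""
    return start_a <= end_b and start_b <= end_a

# Sample rows from the Courses table: course_id -> (start_date, end_date)
courses = {
    1: (date(2018, 5, 15), date(2018, 6, 14)),
    2: (date(2018, 5, 11), date(2018, 5, 20)),
    3: (date(2018, 5, 18), date(2018, 5, 22)),
    4: (date(2018, 5, 20), date(2018, 6, 20)),
    5: (date(2018, 5, 10), date(2018, 6, 20)),
    6: (date(2018, 5, 10), date(2018, 5, 14)),
    7: (date(2018, 6, 15), date(2018, 6, 20)),
}

sample_start, sample_end = courses[1]

# Every other course that is active at some point during course 1
overlapping = sorted(
    course_id
    for course_id, (start, end) in courses.items()
    if course_id != 1 and overlaps(sample_start, sample_end, start, end)
)
print(overlapping)  # [2, 3, 4, 5]
```

In SQL the same predicate would read `Courses.START_DATE <= endDate AND Courses.END_DATE >= startDate`.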

Beyond that, with my meager SQL skills, I have failed miserably. I could spin up some Java code to solve this... but that would just be lame && inefficient.

2 answers:

Answer 0: (score: 1)

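Thanks to Fredt for pointing me in the right direction...
Join the main table to itself, every record against every record, then filter on the overlapping course date conditions.
The query below is not efficient, but it gets the job done for now... there is probably more to optimize, and I would like to hear other opinions.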

SELECT
        THIS_COURSE.COURSE_ID, 
        OTHER_COURSE.ROOM_TYPE_ID,
        -- students booked on the course itself
        SUM(CASE  WHEN THIS_COURSE.COURSE_ID = OTHER_COURSE.COURSE_ID THEN OTHER_COURSE.NUMBER_OF_STUDENTS ELSE 0 END) AS COURSE_STUDENTS,
        -- students booked on any other course that overlaps it
        SUM(CASE  WHEN THIS_COURSE.COURSE_ID <> OTHER_COURSE.COURSE_ID THEN OTHER_COURSE.NUMBER_OF_STUDENTS ELSE 0 END) AS OTHER_STUDENTS,
        SUM(OTHER_COURSE.NUMBER_OF_STUDENTS) AS TOTAL_STUDENTS
FROM
    -- one row per course, so a course with several bookings is not counted twice
    COURSES THIS_COURSE

JOIN 
(
    -- every booking together with the dates of its course
    SELECT 
        BOOKINGS.BOOKING_ID, 
        BOOKINGS.COURSE_ID, 
        BOOKINGS.NUMBER_OF_STUDENTS, 
        BOOKINGS.ROOM_TYPE_ID, 
        COURSES.START_DATE, 
        COURSES.END_DATE 
    FROM 
        BOOKINGS JOIN COURSES 
    ON 
        BOOKINGS.COURSE_ID = COURSES.COURSE_ID
) OTHER_COURSE

ON 
    -- inclusive date ranges overlap when each starts on or before the day the other ends;
    -- this also matches the course itself and courses that span it entirely
    THIS_COURSE.START_DATE <= OTHER_COURSE.END_DATE AND
    OTHER_COURSE.START_DATE <= THIS_COURSE.END_DATE

GROUP BY 
    THIS_COURSE.COURSE_ID, OTHER_COURSE.ROOM_TYPE_ID

Below is the SQL to create the sample data:

CREATE TABLE Bookings(BOOKING_ID INTEGER NOT NULL PRIMARY KEY, COURSE_ID INTEGER NOT NULL, NUMBER_OF_STUDENTS INTEGER NOT NULL, ROOM_TYPE_ID INTEGER NOT NULL);
CREATE TABLE Courses(COURSE_ID INTEGER NOT NULL PRIMARY KEY, START_DATE DATE,  END_DATE  DATE);
CREATE TABLE Rooms(ROOM_TYPE_ID INTEGER NOT NULL PRIMARY KEY, ROOM_CAPACITY INTEGER NOT NULL, ROOM_LOCATION VARCHAR(25));

INSERT INTO Bookings VALUES(    10   ,    2    ,        1         ,    1 );
INSERT INTO Bookings VALUES(    20   ,    1    ,        2         ,    1 );
INSERT INTO Bookings VALUES(    30   ,    3    ,        1         ,    3 );
INSERT INTO Bookings VALUES(    40   ,    1    ,        3         ,    4 );
INSERT INTO Bookings VALUES(    50   ,    5    ,        1         ,    2 );
INSERT INTO Bookings VALUES(    60   ,    6    ,        2         ,    2 );
INSERT INTO Bookings VALUES(    70   ,    3    ,        2         ,    1 );
INSERT INTO Bookings VALUES(    80   ,    4    ,        1         ,    3 );
INSERT INTO Bookings VALUES(    90   ,    7    ,        1         ,    4 );


INSERT INTO Courses VALUES(    1    ,'2018-05-15', '2018-06-14' );
INSERT INTO Courses VALUES(    2    ,'2018-05-11', '2018-05-20' );
INSERT INTO Courses VALUES(    3    ,'2018-05-18', '2018-05-22' );
INSERT INTO Courses VALUES(    4    ,'2018-05-20', '2018-06-20' );
INSERT INTO Courses VALUES(    5    ,'2018-05-10', '2018-06-20' );
INSERT INTO Courses VALUES(    6    ,'2018-05-10', '2018-05-14' );
INSERT INTO Courses VALUES(    7    ,'2018-06-15', '2018-06-20' );


INSERT INTO Rooms VALUES(    1       ,    1        ,  'HILL');
INSERT INTO Rooms VALUES(    2       ,    2        ,  'HILL');
INSERT INTO Rooms VALUES(    3       ,    1        ,  'OCEAN');
INSERT INTO Rooms VALUES(    4       ,    2        ,  'OCEAN');
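
As a quick sanity check, the sample data can be loaded into an in-memory database and the rows for course 1 compared with the expected output in the question. The sketch below uses Python's built-in `sqlite3` module purely as a convenient harness (an assumption on my part; the question targets HSQLDB). Its query is a compact illustration that sticks to plain comparisons, CASE and GROUP BY, with short aliases `t`, `o` and `b` that I made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Same sample data as above; dates are stored as ISO-8601 text,
# which compares correctly as plain strings in SQLite.
conn.executescript("""
CREATE TABLE Bookings(BOOKING_ID INTEGER NOT NULL PRIMARY KEY, COURSE_ID INTEGER NOT NULL,
                      NUMBER_OF_STUDENTS INTEGER NOT NULL, ROOM_TYPE_ID INTEGER NOT NULL);
CREATE TABLE Courses(COURSE_ID INTEGER NOT NULL PRIMARY KEY, START_DATE DATE, END_DATE DATE);

INSERT INTO Bookings VALUES (10, 2, 1, 1), (20, 1, 2, 1), (30, 3, 1, 3),
                            (40, 1, 3, 4), (50, 5, 1, 2), (60, 6, 2, 2),
                            (70, 3, 2, 1), (80, 4, 1, 3), (90, 7, 1, 4);

INSERT INTO Courses VALUES (1, '2018-05-15', '2018-06-14'), (2, '2018-05-11', '2018-05-20'),
                           (3, '2018-05-18', '2018-05-22'), (4, '2018-05-20', '2018-06-20'),
                           (5, '2018-05-10', '2018-06-20'), (6, '2018-05-10', '2018-05-14'),
                           (7, '2018-06-15', '2018-06-20');
""")

# t = the course being reported on, o = any course overlapping it (including itself),
# b = the bookings of that overlapping course.
query = """
SELECT t.COURSE_ID,
       b.ROOM_TYPE_ID,
       SUM(CASE WHEN o.COURSE_ID =  t.COURSE_ID THEN b.NUMBER_OF_STUDENTS ELSE 0 END) AS COURSE_STUDENTS,
       SUM(CASE WHEN o.COURSE_ID <> t.COURSE_ID THEN b.NUMBER_OF_STUDENTS ELSE 0 END) AS OTHER_STUDENTS
FROM Courses t
JOIN Courses o  ON t.START_DATE <= o.END_DATE AND o.START_DATE <= t.END_DATE
JOIN Bookings b ON b.COURSE_ID = o.COURSE_ID
WHERE t.COURSE_ID = 1
GROUP BY t.COURSE_ID, b.ROOM_TYPE_ID
ORDER BY b.ROOM_TYPE_ID
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# (1, 1, 2, 3)
# (1, 2, 0, 1)
# (1, 3, 0, 2)
# (1, 4, 3, 0)
```

Dropping the `WHERE t.COURSE_ID = 1` line returns the same breakdown for every course.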

Answer 1: (score: 0)

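What you actually want is the number of rooms required for each course and each room type. So you need to start from the COURSES table and join it with the other two tables.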

SELECT * FROM COURSES JOIN BOOKINGS USING (COURSE_ID) JOIN ROOMS USING (ROOM_TYPE_ID)

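This gives you one long list of all the room bookings. You can then treat this result as a subquery table and join it to itself based on the date periods.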

WITH ROOM_BOOKINGS AS (
  SELECT 
    BOOKINGS.BOOKING_ID, 
    BOOKINGS.COURSE_ID, 
    BOOKINGS.NUMBER_OF_STUDENTS, 
    BOOKINGS.ROOM_TYPE_ID, 
    COURSES.START_DATE, 
    COURSES.END_DATE, 
    ROOMS.ROOM_CAPACITY
  FROM 
    COURSES JOIN BOOKINGS USING (COURSE_ID) JOIN ROOMS USING (ROOM_TYPE_ID)
 ) 
 SELECT * FROM ROOM_BOOKINGS THIS_COURSE LEFT JOIN ROOM_BOOKINGS OTHER_COURSE
 ON (THIS_COURSE.START_DATE, THIS_COURSE.END_DATE + 1 DAY) OVERLAPS (OTHER_COURSE.START_DATE, OTHER_COURSE.END_DATE + 1 DAY)
 AND THIS_COURSE.ROOM_TYPE_ID = OTHER_COURSE.ROOM_TYPE_ID 
 AND THIS_COURSE.COURSE_ID  <> OTHER_COURSE.COURSE_ID

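You need to complete the query above and add a condition to the SELECT so that it returns only one course. You also need GROUP BY THIS_COURSE.COURSE_ID, THIS_COURSE.ROOM_TYPE_ID, THIS_COURSE.NUMBER_OF_STUDENTS, ... and SUM(OTHER_COURSE.NUMBER_OF_STUDENTS) to get the output you want.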
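
The `+ 1 DAY` in the OVERLAPS join above turns each inclusive end date into an exclusive one. Assuming OVERLAPS treats non-empty periods as half-open (start included, end excluded), which is the usual reading of the standard, the join should then match the same pairs as two plain `<=` comparisons on the original dates. A small Python sketch that checks this equivalence by brute force over a handful of days (both helper functions are illustrative, not HSQLDB APIs):

```python
from datetime import date, timedelta
from itertools import product

ONE_DAY = timedelta(days=1)

def overlaps_half_open(s1, e1, s2, e2):
    """Half-open periods [s, e): overlap when each starts before the other ends."""
    return s1 < e2 and s2 < e1

def overlaps_inclusive(s1, e1, s2, e2):
    """Inclusive date ranges: overlap when each starts on or before the other's last day."""
    return s1 <= e2 and s2 <= e1

# Every inclusive range that can be built from six consecutive days
days = [date(2018, 5, 10) + timedelta(days=n) for n in range(6)]
ranges = [(start, end) for start, end in product(days, days) if start <= end]

# Pairs on which "OVERLAPS with end + 1 day" and the two-comparison test disagree
mismatches = [
    (a, b)
    for a, b in product(ranges, ranges)
    if overlaps_half_open(a[0], a[1] + ONE_DAY, b[0], b[1] + ONE_DAY)
    != overlaps_inclusive(a[0], a[1], b[0], b[1])
]
print(len(mismatches))  # 0
```

On a database without OVERLAPS, the join condition could therefore be written as `THIS_COURSE.START_DATE <= OTHER_COURSE.END_DATE AND OTHER_COURSE.START_DATE <= THIS_COURSE.END_DATE`.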

As you can see, writing advanced SQL queries is not a trivial task and requires a good understanding of the SQL language.