Optimize query: get the max date of records, then join the values for that max date

Time: 2017-03-02 19:49:40

Tags: mysql database

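I have created a query that returns the results I want, but I feel there must be a better way to do this. Any guidance would be appreciated.

I am trying to get all agenda items for a specific meeting, join each item's max meeting date < X, and join the committee acronym of the meeting held on that max date. X is the current meeting's date.

I have tried a few different queries, but none of them, other than the one below, consistently returned the expected results.

You can go to rextester to see this query.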

DROP TABLE IF EXISTS `committees`;
CREATE TABLE committees
    (`id` int, `acronym` varchar(4))
;

INSERT INTO committees
    (`id`, `acronym`)
VALUES
    (1, 'Com1'),
    (2, 'Com2'),
    (3, 'Com3')
;

DROP TABLE IF EXISTS `meetings`;
CREATE TABLE meetings
    (`id` int, `date` datetime, `committee_id` int)
;

INSERT INTO meetings
    (`id`, `date`, `committee_id`)
VALUES
    (1, '2017-01-01 00:00:00', 1),
    (2, '2017-02-02 00:00:00', 2),
    (3, '2017-03-03 00:00:00', 2)
;

DROP TABLE IF EXISTS `agenda_items`;
CREATE TABLE agenda_items
    (`id` int, `name` varchar(6))
;

INSERT INTO agenda_items
    (`id`, `name`)
VALUES
    (1, 'Item 1'),
    (2, 'Item 2'),
    (3, 'Item 3')
;

DROP TABLE IF EXISTS `join_agenda_items_meetings`;
CREATE TABLE join_agenda_items_meetings
    (`id` int, `agenda_item_id` int, `meeting_id` int)
;

INSERT INTO join_agenda_items_meetings
    (`id`, `agenda_item_id`, `meeting_id`)
VALUES
    (1, 1, 1),
    (2, 1, 2),
    (3, 2, 1),
    (4, 3, 2),
    (5, 2, 1),
    (6, 1, 3)
;




SELECT agenda_items.id, 
       meetings.id, 
       meetings.date, 
       sub_one.max_date, 
       sub_two.acronym 
FROM   agenda_items 
       LEFT JOIN (SELECT ai.id                AS ai_id, 
                         me.id                AS me_id, 
                         Max(me.date) AS max_date 
                  FROM   agenda_items AS ai 
                         JOIN join_agenda_items_meetings AS jaim 
                           ON jaim.agenda_item_id = ai.id 
                         JOIN meetings AS me 
                           ON me.id = jaim.meeting_id 
                  WHERE  me.date < '2017-02-02' 
                  GROUP  BY ai_id) sub_one 
              ON sub_one.ai_id = agenda_items.id 
       LEFT JOIN (SELECT agenda_items.id       AS age_id, 
                         meetings.date AS meet_date, 
                         committees.acronym    AS acronym 
                  FROM   agenda_items 
                         JOIN join_agenda_items_meetings 
                           ON join_agenda_items_meetings.agenda_item_id = agenda_items.id 
                         JOIN meetings 
                           ON meetings.id = join_agenda_items_meetings.meeting_id 
                         JOIN committees 
                           ON committees.id = meetings.committee_id 
                  WHERE  meetings.date) sub_two 
              ON sub_two.age_id = agenda_items.id 
                 AND sub_one.max_date = sub_two.meet_date 
       JOIN join_agenda_items_meetings 
         ON agenda_items.id = join_agenda_items_meetings.agenda_item_id 
       JOIN meetings 
         ON meetings.id = join_agenda_items_meetings.meeting_id 
WHERE  meetings.id = 2;

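Reviewing/testing the answers (revised):

I have revised my testing based on the comments that were made.

Since I put a bounty on this question, I feel I should show how I evaluated the answers and give some feedback. Overall I am very grateful for all the help, thank you.

For the testing, I looked at the following:

My original query with EXPLAIN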

+----+-------------+---------------------------+------+----------------------------------------------+
| id | select_type | table                     | rows | Extra                                        |
+----+-------------+---------------------------+------+----------------------------------------------+
|  1 | PRIMARY     | meetings                  |    1 |                                              |
|  1 | PRIMARY     | join_agenda_item_meetings | 1976 | Using where; Using index                     |
|  1 | PRIMARY     | agenda_items              |    1 | Using index                                  |
|  1 | PRIMARY     | <derived2>                | 1087 |                                              |
|  1 | PRIMARY     | <derived3>                | 2202 |                                              |
|  3 | DERIVED     | join_agenda_item_meetings | 1976 | Using index                                  |
|  3 | DERIVED     | meetings                  |    1 | Using where                                  |
|  3 | DERIVED     | committees                |    1 |                                              |
|  3 | DERIVED     | agenda_items              |    1 | Using index                                  |
|  2 | DERIVED     | jaim                      | 1976 | Using index; Using temporary; Using filesort |
|  2 | DERIVED     | me                        |    1 | Using where                                  |
|  2 | DERIVED     | ai                        |    1 | Using index                                  |
+----+-------------+---------------------------+------+----------------------------------------------+
12 rows in set (0.02 sec)
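
Plans like the one above are obtained by prefixing a query with EXPLAIN. The following is a minimal sketch against the sample schema from the question. The two index names and their column choices are hypothetical, and because the sample tables define no indexes and hold only a handful of rows, the estimates it produces will differ from the figures shown in this review.

```sql
-- Hypothetical indexes; names and column order are assumptions, not from the post.
CREATE INDEX idx_jaim_meeting_item
    ON join_agenda_items_meetings (meeting_id, agenda_item_id);
CREATE INDEX idx_meetings_date
    ON meetings (`date`);

-- EXPLAIN reports the chosen plan without running the query itself.
EXPLAIN
SELECT jaim.agenda_item_id,
       MAX(me.date) AS max_date
FROM   join_agenda_items_meetings AS jaim
       JOIN meetings AS me
         ON me.id = jaim.meeting_id
WHERE  me.date < '2017-02-02'
GROUP  BY jaim.agenda_item_id;
```

The `rows` column is an estimate of rows examined per table, which is the figure compared across the answers here.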

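Paul Spiegel's answer.

The initial answer works and appears to be the most efficient option, far more so than mine.

Paul Spiegel's first query examines the fewest rows and is shorter and more readable than mine. It also does not need a hard-coded reference date, which makes it nicer to write.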

+----+--------------------+-------+------+--------------------------+
| id | select_type        | table | rows | Extra                    |
+----+--------------------+-------+------+--------------------------+
|  1 | PRIMARY            | m1    |    1 |                          |
|  1 | PRIMARY            | am1   | 1976 | Using where; Using index |
|  1 | PRIMARY            | am2   |    1 | Using index              |
|  1 | PRIMARY            | m2    |    1 |                          |
|  2 | DEPENDENT SUBQUERY | am3   |    1 | Using index              |
|  2 | DEPENDENT SUBQUERY | m3    |    1 | Using where              |
|  2 | DEPENDENT SUBQUERY | c3    |    1 | Using where              |
+----+--------------------+-------+------+--------------------------+
7 rows in set (0.00 sec)

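His second query also returns the correct results once DISTINCT is added to the select statement. It does not perform quite as well as the first one (but it is close).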

+----+-------------+------------+------+--------------------------+
| id | select_type | table      | rows | Extra                    |
+----+-------------+------------+------+--------------------------+
|  1 | PRIMARY     | <derived2> |    5 | Using temporary          |
|  1 | PRIMARY     | am         |    1 | Using index              |
|  1 | PRIMARY     | m          |    1 |                          |
|  1 | PRIMARY     | c          |    1 | Using where              |
|  2 | DERIVED     | m1         |    1 |                          |
|  2 | DERIVED     | am1        | 1787 | Using where; Using index |
|  2 | DERIVED     | am2        |    1 | Using index              |
|  2 | DERIVED     | m2         |    1 |                          |
+----+-------------+------------+------+--------------------------+
8 rows in set (0.00 sec)

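Stefano Zanini's answer

This query does return the expected results with DISTINCT. Judging by EXPLAIN and the number of rows examined, it is more efficient than my original query, but Paul Spiegel's is slightly better.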

+----+-------------+------------+------+---------------------------------+
| id | select_type | table      | rows | Extra                           |
+----+-------------+------------+------+---------------------------------+
|  1 | PRIMARY     | me         |    1 | Using temporary; Using filesort |
|  1 | PRIMARY     | rel        | 1787 | Using where; Using index        |
|  1 | PRIMARY     | <derived2> | 1087 |                                 |
|  1 | PRIMARY     | rel2       |    1 | Using index                     |
|  1 | PRIMARY     | me2        |    1 | Using where                     |
|  1 | PRIMARY     | co         |    1 |                                 |
|  2 | DERIVED     | t1         | 1787 | Using index                     |
|  2 | DERIVED     | t2         |    1 | Using where                     |
+----+-------------+------------+------+---------------------------------+
8 rows in set (0.00 sec)

EoinS's answer

As noted in the comments, this answer works if the meeting ids are sequential, but unfortunately they may not be.

3 answers:

Answer 0 (score: 5)

This one is a bit crazy. Let's do it step by step:

The first step is a basic join

set @meeting_id = 2;

select am1.meeting_id,
       am1.agenda_item_id,
       m1.date as meeting_date
from meetings m1
join join_agenda_items_meetings am1 on am1.meeting_id = m1.id
where m1.id = @meeting_id;

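We select the meeting (id = 2) and the corresponding agenda_item_ids. This already returns the rows we need, with the first three columns.

The next step is to get the last meeting date for each agenda item. We join the first query with the join table and the corresponding meetings, excluding the meeting with id = 2 itself (am2.meeting_id <> am1.meeting_id). We only want meetings dated before the current meeting (m2.date < m1.date). Of all those meetings, we only want the latest date per agenda item, so we group by agenda item and select max(m2.date):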

select am1.meeting_id,
       am1.agenda_item_id,
       m1.date as meeting_date,
       max(m2.date) as max_date
from meetings m1
join join_agenda_items_meetings am1 on am1.meeting_id = m1.id
left join join_agenda_items_meetings am2 
    on  am2.agenda_item_id = am1.agenda_item_id
    and am2.meeting_id <> am1.meeting_id
left join meetings m2 
    on  m2.id = am2.meeting_id
    and m2.date < m1.date
where m1.id = @meeting_id
group by m1.id, am1.agenda_item_id;

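This gives us the fourth column (max_date).

The last step is to select the acronym of the meeting held on the last date (max_date). This is the crazy part: we can use a correlated subquery in the SELECT clause, and we can correlate it on max(m2.date):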

select c3.acronym
from meetings m3
join join_agenda_items_meetings am3 on am3.meeting_id = m3.id
join committees c3 on c3.id = m3.committee_id
where am3.agenda_item_id = am2.agenda_item_id
  and m3.date = max(m2.date)

The final query would be:

select am1.meeting_id,
       am1.agenda_item_id,
       m1.date as meeting_date,
       max(m2.date) as max_date,
       (   select c3.acronym
           from meetings m3
           join join_agenda_items_meetings am3 on am3.meeting_id = m3.id
           join committees c3 on c3.id = m3.committee_id
           where am3.agenda_item_id = am2.agenda_item_id
             and m3.date = max(m2.date)
       ) as acronym
from meetings m1
join join_agenda_items_meetings am1 on am1.meeting_id = m1.id
left join join_agenda_items_meetings am2 
    on  am2.agenda_item_id = am1.agenda_item_id
    and am2.meeting_id <> am1.meeting_id
left join meetings m2 
    on  m2.id = am2.meeting_id
    and m2.date < m1.date
where m1.id = @meeting_id
group by m1.id, am1.agenda_item_id;

http://rextester.com/JKK60222

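To be honest, I am surprised that you can use max(m2.date) in the subquery.

Another solution: use the second query as a subquery (derived table). Join the meetings and committees through the join table, matching on max_date. Keep only the rows that have an acronym and the rows that have no max_date.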

select t.*, c.acronym
from (
    select am1.meeting_id,
           am1.agenda_item_id,
           m1.date as meeting_date,
           max(m2.date) as max_date
    from meetings m1
    join join_agenda_items_meetings am1 on am1.meeting_id = m1.id
    left join join_agenda_items_meetings am2 
        on  am2.agenda_item_id = am1.agenda_item_id
        and am2.meeting_id <> am1.meeting_id
    left join meetings m2 
        on  m2.id = am2.meeting_id
        and m2.date < m1.date
    where m1.id = @meeting_id
    group by m1.id, am1.agenda_item_id
) t
left join join_agenda_items_meetings am
    on  am.agenda_item_id = t.agenda_item_id
    and t.max_date is not null
left join meetings m
    on  m.id   = am.meeting_id
    and m.date = t.max_date
left join committees c on c.id = m.committee_id
where t.max_date is null or c.acronym is not null;

http://rextester.com/BBMDFL23101
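
If MySQL 8.0 or later is available (an assumption; window functions do not exist in earlier versions), the same result can probably be expressed with ROW_NUMBER() instead of a correlated subquery. This is only a sketch against the sample schema and is not part of the original answer; the aliases cur, prev and rn are made up for illustration.

```sql
set @meeting_id = 2;

select cur.meeting_id,
       cur.agenda_item_id,
       cur.meeting_date,
       prev.date as max_date,
       prev.acronym
from (
    -- agenda items of the current meeting; DISTINCT guards against
    -- duplicate rows in the join table
    select distinct am1.meeting_id,
                    am1.agenda_item_id,
                    m1.date as meeting_date
    from meetings m1
    join join_agenda_items_meetings am1 on am1.meeting_id = m1.id
    where m1.id = @meeting_id
) cur
left join (
    -- rank every earlier meeting of each agenda item, newest first
    select am2.agenda_item_id,
           m2.date,
           c2.acronym,
           row_number() over (partition by am2.agenda_item_id
                              order by m2.date desc) as rn
    from join_agenda_items_meetings am2
    join meetings m2 on m2.id = am2.meeting_id
    join committees c2 on c2.id = m2.committee_id
    where m2.date < (select m.date from meetings m where m.id = @meeting_id)
) prev
    on  prev.agenda_item_id = cur.agenda_item_id
    and prev.rn = 1;
```

Taking only rn = 1 keeps one earlier meeting per agenda item. If two earlier meetings share the same date, which of them is kept is arbitrary unless a tie-breaker is added to the ORDER BY.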

Answer 1 (score: 3)

Using your schema, I used the query below, assuming that all meeting ids are sequential:

 set @mymeeting = 2;

 select j.agenda_item_id, m.id, m.date, mp.date, c.acronym
 from meetings m 
 left join join_agenda_items_meetings j on j.meeting_id = m.id
 left join join_agenda_items_meetings jp on jp.meeting_id = m.id -1 and jp.agenda_item_id = j.agenda_item_id
 left join meetings mp on mp.id = jp.meeting_id
 left join committees c on mp.committee_id = c.id
 where m.id = @mymeeting;

I create a variable just to make it easy to change meetings on the fly.

Here is a functional example in Rextester

Thanks for making your schema so easy to reproduce!
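
The dependence on sequential ids can be checked with a small sketch. The extra rows below are hypothetical test data, not part of the question: they add a meeting with id 5 while id 4 stays unused, and put agenda item 1 on it.

```sql
-- Hypothetical rows: meeting 5 follows meeting 3, id 4 does not exist.
INSERT INTO meetings (`id`, `date`, `committee_id`)
VALUES (5, '2017-04-04 00:00:00', 3);

INSERT INTO join_agenda_items_meetings (`id`, `agenda_item_id`, `meeting_id`)
VALUES (7, 1, 5);

set @mymeeting = 5;

-- Same query as above. jp looks for meeting_id = 5 - 1 = 4, which has no rows,
-- so mp.date and c.acronym come back NULL even though agenda item 1
-- was also on meeting 3, which has an earlier date.
select j.agenda_item_id, m.id, m.date, mp.date, c.acronym
from meetings m
left join join_agenda_items_meetings j on j.meeting_id = m.id
left join join_agenda_items_meetings jp
       on jp.meeting_id = m.id - 1
      and jp.agenda_item_id = j.agenda_item_id
left join meetings mp on mp.id = jp.meeting_id
left join committees c on mp.committee_id = c.id
where m.id = @mymeeting;
```

The same gap appears when ids are sequential but an agenda item was not on the immediately preceding meeting.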

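Answer 2 (score: 3)

I found this question quite challenging, and what I came up with is nothing spectacular, but I managed to get rid of one of the subqueries and possibly a few joins. This is the result: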

select    distinct me.ID, me.DATE, rel.AGENDA_ITEM_ID, sub.MAX_DATE, co.ACRONYM
from      MEETINGS me
join      JOIN_AGENDA_ITEMS_MEETINGS rel /* Note 1*/
  on      me.ID = rel.MEETING_ID
left join (   
              select  t1.AGENDA_ITEM_ID, max(t2.DATE) MAX_DATE
              from    JOIN_AGENDA_ITEMS_MEETINGS t1
              join    MEETINGS t2
                on    t2.ID = t1.MEETING_ID
              where   t2.DATE < '2017-02-02'
              group by t1.AGENDA_ITEM_ID
          ) sub
  on      rel.AGENDA_ITEM_ID = sub.AGENDA_ITEM_ID /* Note 2 */
left join JOIN_AGENDA_ITEMS_MEETINGS rel2
  on      rel2.AGENDA_ITEM_ID = rel.AGENDA_ITEM_ID /* Note 3 */
left join MEETINGS me2
  on      rel2.MEETING_ID = me2.ID and
          sub.MAX_DATE = me2.DATE /* Note 4 */
left join COMMITTEES co
  on      co.ID = me2.COMMITTEE_ID
where     me.ID = 2 and
          (sub.MAX_DATE is null or me2.DATE is not null) /* Note 5 */
order by  rel.AGENDA_ITEM_ID, rel2.MEETING_ID;

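Notes

  1. You don't need to join AGENDA_ITEMS, because the id is already available in the relationship table

  2. Up to this point we have the current meeting, its agenda items and their "calculated" max date

  3. We get all the meetings of each agenda item...

  4. ...so that we can pick the meeting whose date matches the max date we calculated earlier

  5. This condition is necessary because all the joins from rel2 onward have to be left joins (some agenda items may have no previous meeting, hence MAX_DATE = null), but without it those left joins would return unwanted meeting rows for some agenda items.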
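
The derived table in this query hard-codes the cutoff date '2017-02-02'. One possible variation, sketched here under the assumption that the cutoff should always be the date of the selected meeting, looks the date up from the meeting id instead. The user variable @meeting_id and the alias cur are hypothetical, and the table names are written in lower case to match the schema in the question.

```sql
set @meeting_id = 2;  -- hypothetical variable, not part of the original answer

-- Stand-alone version of the "sub" derived table with a looked-up cutoff date.
select   t1.agenda_item_id,
         max(t2.date) as max_date
from     join_agenda_items_meetings t1
join     meetings t2
  on     t2.id = t1.meeting_id
where    t2.date < (select cur.date
                    from   meetings cur
                    where  cur.id = @meeting_id)
group by t1.agenda_item_id;
```

Used in place of the original derived table, with `me.ID = @meeting_id` in the outer WHERE, this would leave a single place to change when testing against a different meeting.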