Include only the first and last groups in query results

Time: 2015-07-13 19:17:57

Tags: mysql sql

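Given this schema (the original diagram is not available; the tables involved are assessment and assessment_item, joined on assessment.id = assessment_item.assessment_id), the following query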

SELECT a.user_id,
  a.id,
  a.date_created,
  avg(ai.level) level
FROM assessment a
  JOIN assessment_item ai ON a.id = ai.assessment_id
GROUP BY a.user_id, a.id;

returns these results

user_id, a.id, a.date_created,        level
1,       99,   "2015-07-13 18:26:00", 4.0000  
1,       98,   "2015-07-13 19:04:58", 6.0000  
13,      9,    "2015-07-13 18:26:00", 2.0000  
13,      11,   "2015-07-13 19:04:58", 3.0000  

I want to change the query so that it returns only the earliest result for each user. In other words, it should return the following

user_id, a.id, a.date_created,        level
1,       99,   "2015-07-13 18:26:00", 4.0000
13,      9,    "2015-07-13 18:26:00", 2.0000

I think I need to add a HAVING clause, but I'm struggling to work out the exact syntax.

2 answers:

Answer 0 (score: 0)

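I did something similar, with the small difference that I wanted the first 5 rows of each group. The use case was reporting, which means the time taken to run the queries and create a temporary table was not a constraint.

My solution: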

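  • Create a new table (tbl1) with a single column id (a reference to the original table); id can be unique / primary
  • INSERT IGNORE INTO tbl1(id) select min(id) from original_tbl where id not in (select id from tbl1) group by user_id
  • You need to repeat step 2 several times (in my case, 5 times). The new table will then contain only the ids you want to display
  • Now run a join between tbl1 and the original table, which gives you the required result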

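Note: this may not be the best solution, but it worked for me when I had to share a report within 2-3 hours over a weekend. My data size was about 1M records.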
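The steps above can be sketched in MySQL roughly as follows. This is an untested sketch under assumptions: tbl1 and original_tbl are the names used in this answer, and original_tbl is assumed to have integer id and user_id columns. The filter is written as a LEFT JOIN ... IS NULL anti-join rather than NOT IN, because MySQL documents a restriction on selecting from the insert target inside a subquery. Also note that the smallest id is the earliest row only if ids grow with date_created; in the question's sample data, user 1's earliest assessment has the larger id (99 versus 98).

```sql
-- Step 1: helper table holding the ids to keep (name taken from the answer).
CREATE TABLE tbl1 (
  id INT NOT NULL PRIMARY KEY
);

-- Step 2: for each user, add the smallest id that is not yet in tbl1.
-- Run once for the first row per user, or N times for the first N rows.
INSERT IGNORE INTO tbl1 (id)
SELECT MIN(o.id)
FROM original_tbl AS o
  LEFT JOIN tbl1 AS t ON t.id = o.id
WHERE t.id IS NULL            -- keep only rows not selected in earlier passes
GROUP BY o.user_id;

-- Step 3: join back to the original table to get the full rows.
SELECT o.*
FROM original_tbl AS o
  JOIN tbl1 AS t ON t.id = o.id
ORDER BY o.user_id, o.id;
```

For the question as asked (one row per user), a single pass of step 2 would be enough.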

Answer 1 (score: 0)

Disclaimer: I am in a bit of a hurry, and have not tested this fully

-- Create a CTE that holds the first and last date for each user_id.
-- Note: the WITH clause requires MySQL 8.0 or later.
with first_and_last as (
    -- Get the first date (min) for each user_id
    select a.user_id, min(a.date_created) as date_created
    from assessment as a
    group by a.user_id

    -- Combine the first and last, so each user_id should have two entries, even if they are the same one.
    union all

    -- Get the last date (max) for each user_id
    select a.user_id, max(a.date_created) as date_created
    from assessment as a
    group by a.user_id
)
select a.user_id,
        a.id,
        a.date_created,
        avg(ai.level) as level
from assessment as a
    inner join assessment_item as ai on a.id = ai.assessment_id
    -- Join with the CTE to only keep records that have either the min or max date_created for each user_id.
    inner join first_and_last as fnl on a.user_id = fnl.user_id and a.date_created = fnl.date_created
group by a.user_id, a.id, a.date_created;
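If only the earliest assessment per user is needed, as the body of the question asks, the same join idea can be written without a CTE, which should also work on MySQL versions before 8.0. This is an untested sketch that assumes only the tables and columns shown in the question; the alias f and the column name first_created are made up for illustration.

```sql
SELECT a.user_id,
       a.id,
       a.date_created,
       AVG(ai.level) AS level
FROM assessment AS a
  JOIN assessment_item AS ai ON ai.assessment_id = a.id
  -- Derived table: the earliest date_created for each user.
  JOIN (
    SELECT user_id, MIN(date_created) AS first_created
    FROM assessment
    GROUP BY user_id
  ) AS f ON f.user_id = a.user_id
        AND f.first_created = a.date_created
GROUP BY a.user_id, a.id, a.date_created;
```

For the sample rows in the question this should keep assessments 99 and 9. If a user has two assessments with the same earliest date_created, both are returned; a tie-breaker on id would be needed to force a single row.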