Question

My question is similar to "MySQL: Select rows with more than one occurrence", but I am using PostgreSQL. I have a query like this:

select d.user_id, d.recorded_at, d.glucose_value, d.unit
from diary as d
join (
    select u.id
    from health_user as u
    join (
        select distinct user_id
        from care_connect
        where clinic_id = 217
            and role = 'user'
            and status = 'active'
    ) as c
    on u.id = c.user_id
    where u.is_tester is false
) as cu
on d.user_id = cu.id
where d.created_at >= d.recorded_at
    and d.recorded_at < current_date and d.recorded_at >= current_date - interval '30 days'
    and d.glucose_value > 0
    and (d.state = 'wakeup' or (d.state = 'before_meal' and d.meal_type = 'breakfast'))

The result looks like this:

+---------+---------------------+---------------+--------+
| user_id |     recorded_at     | glucose_value |  unit  |
+---------+---------------------+---------------+--------+
|   12041 | 2018-06-26 01:10:12 |           100 | mg/dL  |
|   12041 | 2018-06-30 02:10:11 |            90 | mg/dL  |
|   12214 | 2018-06-25 12:40:13 |            10 | mmol/L |
|   12214 | 2018-06-26 12:41:13 |            12 | mmol/L |
|   12214 | 2018-06-29 00:21:14 |            11 | mmol/L |
|   12214 | 2018-06-29 12:59:32 |            10 | mmol/L |
+---------+---------------------+---------------+--------+

As you can see, this is already a long query. Now I only want the records of users who have at least four records (rows) in this result, so I tried:

select d.user_id, d.recorded_at, d.glucose_value, d.unit, count(d.*)
from diary as d
join (
    select u.id
    from health_user as u
    join (
        select distinct user_id
        from care_connect
        where clinic_id = 217
            and role = 'user'
            and status = 'active'
    ) as c
    on u.id = c.user_id
    where u.is_tester is false
) as cu
on d.user_id = cu.id
where d.created_at >= d.recorded_at
    and d.recorded_at < current_date and d.recorded_at >= current_date - interval '30 days'
    and d.glucose_value > 0
    and (d.state = 'wakeup' or (d.state = 'before_meal' and d.meal_type = 'breakfast'))
group by d.user_id
having count(d.*) >= 4

My expected output is:

+---------+---------------------+---------------+--------+
| user_id |     recorded_at     | glucose_value |  unit  |
+---------+---------------------+---------------+--------+
|   12214 | 2018-06-25 12:40:13 |            10 | mmol/L |
|   12214 | 2018-06-26 12:41:13 |            12 | mmol/L |
|   12214 | 2018-06-29 00:21:14 |            11 | mmol/L |
|   12214 | 2018-06-29 12:59:32 |            10 | mmol/L |
+---------+---------------------+---------------+--------+

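However, it throws an error saying that d.recorded_at must also be added to the group by, which is not what I want. Besides, grouping by the raw timestamp makes no sense.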
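For illustration, here is a minimal sketch of what the grouped form can and cannot return. It inlines the six sample rows from the result table above as a CTE; the name sample_result and the alias row_count are hypothetical and not part of the real schema. Grouping by user_id collapses each user to a single row, which is why per-row columns such as recorded_at cannot be selected alongside it.

```sql
-- sample_result is a hypothetical stand-in for the original query's output.
with sample_result (user_id, recorded_at, glucose_value, unit) as (
    values
        (12041, timestamp '2018-06-26 01:10:12', 100, 'mg/dL'),
        (12041, timestamp '2018-06-30 02:10:11',  90, 'mg/dL'),
        (12214, timestamp '2018-06-25 12:40:13',  10, 'mmol/L'),
        (12214, timestamp '2018-06-26 12:41:13',  12, 'mmol/L'),
        (12214, timestamp '2018-06-29 00:21:14',  11, 'mmol/L'),
        (12214, timestamp '2018-06-29 12:59:32',  10, 'mmol/L')
)
-- Valid: every selected column is either grouped or aggregated,
-- so the result has one row per user_id, not one row per diary entry.
select user_id, count(*) as row_count
from sample_result
group by user_id
having count(*) >= 4;
-- Returns a single row: user_id 12214 with row_count 4.
```

This identifies the qualifying users but drops the individual readings, so the per-row detail has to be recovered some other way.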

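I know I could probably join against another table generated by the same query, with only select d.user_id, count(d.*) in the first line, but the whole query would then look crazy.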

Could someone please help me achieve this in a better way? Sorry for not including the table structure here; I can edit and clarify if needed.

Answer 1

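Try this. The window function count(1) over (partition by d.user_id) attaches each user's row count to every row, and the outer query then filters on that count: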

select user_id, recorded_at, glucose_value, unit
from (
    select d.user_id, d.recorded_at, d.glucose_value, d.unit,
           count(1) over (partition by d.user_id) as rcnt
    from diary as d
    join (
        select u.id
        from health_user as u
        join (
            select distinct user_id
            from care_connect
            where clinic_id = 217
                and role = 'user'
                and status = 'active'
        ) as c
        on u.id = c.user_id
        where u.is_tester is false
    ) as cu
    on d.user_id = cu.id
    where d.created_at >= d.recorded_at
        and d.recorded_at < current_date and d.recorded_at >= current_date - interval '30 days'
        and d.glucose_value > 0
        and (d.state = 'wakeup' or (d.state = 'before_meal' and d.meal_type = 'breakfast'))
) as x
where rcnt >= 4
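To see the window-count approach in isolation, here is a small self-contained sketch. It replaces the long inner query with the six sample rows from the question, inlined as a CTE; the name sample_result is hypothetical. The subquery is needed because PostgreSQL does not allow window functions directly in a where clause.

```sql
-- sample_result is a hypothetical stand-in for the original query's output.
with sample_result (user_id, recorded_at, glucose_value, unit) as (
    values
        (12041, timestamp '2018-06-26 01:10:12', 100, 'mg/dL'),
        (12041, timestamp '2018-06-30 02:10:11',  90, 'mg/dL'),
        (12214, timestamp '2018-06-25 12:40:13',  10, 'mmol/L'),
        (12214, timestamp '2018-06-26 12:41:13',  12, 'mmol/L'),
        (12214, timestamp '2018-06-29 00:21:14',  11, 'mmol/L'),
        (12214, timestamp '2018-06-29 12:59:32',  10, 'mmol/L')
)
select user_id, recorded_at, glucose_value, unit
from (
    -- Every row keeps its own columns and also carries the total
    -- number of rows that share its user_id.
    select s.*, count(*) over (partition by s.user_id) as rcnt
    from sample_result as s
) as x
where rcnt >= 4
order by user_id, recorded_at;
-- Returns the four rows of user 12214; user 12041 (two rows) is filtered out.
```

Unlike group by, the window function does not collapse rows, so no join back to the original data is needed.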

Answer 2

Try this:

Replace your_query with your actual query.

It uses a with clause and an exists clause.

with original_query as ( your_query )
select *
from original_query q1
where exists (
    select q2.user_id
    from original_query q2
    where q1.user_id = q2.user_id
    group by q2.user_id
    having count(q2.user_id) >= 4
)
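As a rough illustration of the same pattern, the sketch below substitutes the six sample rows from the question for your_query, so it can be run on its own. The inlined values are only a stand-in for the real query's output.

```sql
-- The values list is a stand-in for your_query, using the question's sample rows.
with original_query (user_id, recorded_at, glucose_value, unit) as (
    values
        (12041, timestamp '2018-06-26 01:10:12', 100, 'mg/dL'),
        (12041, timestamp '2018-06-30 02:10:11',  90, 'mg/dL'),
        (12214, timestamp '2018-06-25 12:40:13',  10, 'mmol/L'),
        (12214, timestamp '2018-06-26 12:41:13',  12, 'mmol/L'),
        (12214, timestamp '2018-06-29 00:21:14',  11, 'mmol/L'),
        (12214, timestamp '2018-06-29 12:59:32',  10, 'mmol/L')
)
select q1.*
from original_query as q1
where exists (
    -- Correlated subquery: produces a row only when the current
    -- user has at least four rows in original_query.
    select 1
    from original_query as q2
    where q2.user_id = q1.user_id
    group by q2.user_id
    having count(*) >= 4
)
order by q1.user_id, q1.recorded_at;
-- Returns the four rows of user 12214.
```

The with clause lets the long query be written once and referenced twice, which avoids the duplicated self-join the question was worried about.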

Select rows with more than n occurrences from a joined table
