Postgres: get unique records per day from a date range select

Time: 2019-01-16 11:42:55

Tags: json postgresql distinct-values

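I need to build a report of logged-in users by date range, but without duplicates within the same day (if someone logged in twice on the same day, we should not list them twice). Unfortunately we keep login information as JSON (yes, I cannot change that to a separate table, and I don't know who designed this database). Query to view all logged-in users: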

select a.id, username, email, ah.modified as login_date
from accounts a join
     account_history ah
     on modified_acc_id = a.id
 where ah.data::jsonb->>'message' = 'Logon';

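modified is a timestamp with time zone and is used as the login date.

I have only found examples that count distinct ids per day, but I don't know how to modify them to return the distinct rows per day.

Sample data: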

 id  |        username  |              email       |         login_date
-----+------------------+--------------------------+----------------------------
 102 | example          | example@example.com      | 2018-12-06 09:30:10.573+00
 102 | example          | example@example.com      | 2018-12-06 09:32:34.235+00
  42 | rafal            | rafal@example.com        | 2018-12-06 09:45:24.884+00
 576 | john             | john@example.com         | 2018-12-06 09:35:24.922+00
 576 | john             | john@example.com         | 2018-12-07 09:58:04.253+00

Wanted data:

 id  |        username  |              email       |         login_date
-----+------------------+--------------------------+----------------------------
 102 | example          | example@example.com      | 2018-12-06 09:30:10.573+00
  42 | rafal            | rafal@example.com        | 2018-12-06 09:45:24.884+00
 576 | john             | john@example.com         | 2018-12-06 09:35:24.922+00
 576 | john             | john@example.com         | 2018-12-07 09:58:04.253+00

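As you can see, the second row (the repeated login of id 102 on 2018-12-06) is gone.

4 answers:

Answer 0: (score: 2)

DISTINCT ON gives you exactly the first row of each ordered group. In your example, the group is the id together with the date part of the login_date timestamp.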

SELECT DISTINCT ON (id, login_date::date)
    *
FROM (
    -- <your query>
) s
ORDER BY id, login_date::date, login_date

demo:db<>fiddle

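Explanation of the ORDER BY clause:

You have to order by the DISTINCT ON columns first. But in your case you do not want to order by the date alone; you also want to order by the time, so that the earliest login of each day comes first in its group. So after ordering by the date (which is required because it is one of your DISTINCT ON columns) you have to order by the full timestamp as well.

So the whole query can be simplified to (without a subquery):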

SELECT DISTINCT ON (a.id, ah.modified::date) 
    a.id, 
    username, 
    email, 
    ah.modified as login_date
FROM accounts a 
JOIN account_history ah
    ON modified_acc_id = a.id
WHERE ah.data::jsonb->>'message' = 'Logon'
ORDER BY a.id, ah.modified::date, ah.modified 
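
The question also asks for a date range, which the query above does not filter on. The following is only a minimal sketch: it inlines the question's sample rows as a CTE named logins (a stand-in for the join, not a table from the original schema) and uses hypothetical range bounds with a half-open interval.

```sql
-- Sample rows from the question, inlined so the query is self-contained.
-- "logins" is a hypothetical stand-in for the accounts/account_history join.
WITH logins (id, username, email, login_date) AS (
    VALUES
        (102, 'example', 'example@example.com', '2018-12-06 09:30:10.573+00'::timestamptz),
        (102, 'example', 'example@example.com', '2018-12-06 09:32:34.235+00'::timestamptz),
        ( 42, 'rafal',   'rafal@example.com',   '2018-12-06 09:45:24.884+00'::timestamptz),
        (576, 'john',    'john@example.com',    '2018-12-06 09:35:24.922+00'::timestamptz),
        (576, 'john',    'john@example.com',    '2018-12-07 09:58:04.253+00'::timestamptz)
)
SELECT DISTINCT ON (id, login_date::date)
    id, username, email, login_date
FROM logins
-- Hypothetical bounds: start inclusive, end exclusive.
WHERE login_date >= TIMESTAMPTZ '2018-12-06 00:00:00+00'
  AND login_date <  TIMESTAMPTZ '2018-12-08 00:00:00+00'
-- The DISTINCT ON expressions come first; the full timestamp then
-- decides which row of each (id, day) group is kept (the earliest).
ORDER BY id, login_date::date, login_date;
```

On the sample data this should keep one row per user per day, dropping the second login of id 102 on 2018-12-06. A half-open range avoids having to guess the last representable instant of the end day.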

Answer 1: (score: 0)

You seem to want the number of user-days over a period of time. If I understand correctly:

select count(*) as num_user_days_in_range
from (select a.username, date_trunc('day', ah.modified) as login_date
      from accounts a join
           account_history ah
           on modified_acc_id = a.id
      where ah.data::jsonb->>'message' = 'Logon'
      group by a.username, login_date
     ) u
where login_date >= $date1 and login_date < $date2

Answer 2: (score: 0)

Use the window function row_number()

select id, username, email, login_date
from (
  select a.id, username, email, ah.modified as login_date,
         -- the calendar day must be part of the partition; without it
         -- only the very first login per user is returned, not one per day
         row_number() over (partition by a.id, ah.modified::date
                            order by ah.modified) as rn
  from accounts a join
       account_history ah
       on modified_acc_id = a.id
  where ah.data::jsonb->>'message' = 'Logon'
) t
where t.rn = 1
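
For one row per user per day, the partition has to be the user id plus the calendar day. Here is a rough sketch of that on the question's sample rows, again inlined as a hypothetical logins CTE rather than the real tables:

```sql
-- "logins" is a hypothetical stand-in for the accounts/account_history join.
WITH logins (id, username, email, login_date) AS (
    VALUES
        (102, 'example', 'example@example.com', '2018-12-06 09:30:10.573+00'::timestamptz),
        (102, 'example', 'example@example.com', '2018-12-06 09:32:34.235+00'::timestamptz),
        ( 42, 'rafal',   'rafal@example.com',   '2018-12-06 09:45:24.884+00'::timestamptz),
        (576, 'john',    'john@example.com',    '2018-12-06 09:35:24.922+00'::timestamptz),
        (576, 'john',    'john@example.com',    '2018-12-07 09:58:04.253+00'::timestamptz)
)
SELECT id, username, email, login_date
FROM (
    SELECT l.*,
           -- number the logins of each user within each calendar day,
           -- earliest first
           row_number() OVER (PARTITION BY id, login_date::date
                              ORDER BY login_date) AS rn
    FROM logins l
) t
WHERE rn = 1
ORDER BY id, login_date;
```

Changing the window's ORDER BY to login_date DESC would keep the last login of each day instead of the first.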

Answer 3: (score: 0)

It looks like, when there is a dupe, you are taking the earliest date. If so, would this work?

select
  a.id, username, email, min (ah.modified) as login_date
from accounts a join
     account_history ah
     on modified_acc_id = a.id
 where ah.data::jsonb->>'message' = 'Logon'
group by a.id, username, email, ah.modified::date
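
One thing to keep in mind: casting a timestamp with time zone to date uses the session's TimeZone setting, so the day boundary can differ between sessions. If the report should use a fixed zone, the day can be pinned explicitly. This sketch assumes UTC is the intended zone and uses the question's sample rows as a hypothetical logins CTE:

```sql
-- "logins" is a hypothetical stand-in for the accounts/account_history join.
WITH logins (id, username, email, login_date) AS (
    VALUES
        (102, 'example', 'example@example.com', '2018-12-06 09:30:10.573+00'::timestamptz),
        (102, 'example', 'example@example.com', '2018-12-06 09:32:34.235+00'::timestamptz),
        ( 42, 'rafal',   'rafal@example.com',   '2018-12-06 09:45:24.884+00'::timestamptz),
        (576, 'john',    'john@example.com',    '2018-12-06 09:35:24.922+00'::timestamptz),
        (576, 'john',    'john@example.com',    '2018-12-07 09:58:04.253+00'::timestamptz)
)
SELECT id, username, email, min(login_date) AS login_date
FROM logins
-- AT TIME ZONE 'UTC' converts to a plain timestamp in UTC, so the
-- resulting date no longer depends on the session's TimeZone setting.
GROUP BY id, username, email, (login_date AT TIME ZONE 'UTC')::date
ORDER BY id, login_date;
```

The same expression can replace the ::date cast in the DISTINCT ON or row_number() variants if a fixed day boundary is wanted there too.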