Query in BigQuery?

Time: 2016-02-09 03:26:36

Tags: sql database google-bigquery user-defined-functions

I have a Package table in BigQuery that looks like this:

 Packageid  Scanid  dispatchid  timestamp   status
   p1         s1       null        t1        'in'
   p2         s1       xxx         t2        'in'
   p1         s2       yyy         t3        'pkin'
   p1         s3       sss         t4        'iwi'
   p1         s4       eee         t5        'lhp'
   p2         s2       uuuu        t6        'uio'
   p2         s3       null        t7        'jsk'

I want to retrieve the following details:

Packageid   Latest-Scanid   First-Dispatch-time  Last-Dispatch-time   latest-status

 p1            s4                 t3                 t5                 'lhp'
 p2            s3                 t2                 t6                 'jsk'  

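First-Dispatch-time is the time of the first scan of the package in which a dispatch ID appears. Last-Dispatch-time is the time of the last scan of the package in which a dispatch ID appears.

Is there a way to get the table above in BigQuery, using either a query or a user-defined function?

3 answers:

Answer 0: (score: 2)

One approach uses window functions and conditional aggregation: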

select packageid,
       -- scan id and status are taken from the most recent row (seqnum = 1)
       max(case when seqnum = 1 then scanid end) as latest_scanid,
       -- only rows that carry a dispatchid count toward the dispatch times
       min(case when dispatchid is not null then timestamp end) as first_dispatch_time,
       max(case when dispatchid is not null then timestamp end) as last_dispatch_time,
       max(case when seqnum = 1 then status end) as latest_status
from (select t.*,
             row_number() over (partition by packageid order by timestamp desc) as seqnum
      from t
     ) t
group by packageid;
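To check the window-function and conditional-aggregation approach against the sample data from the question, here is a minimal sketch that runs the same query shape in an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in here, not BigQuery; the table name `package`, the column name `ts` (used instead of `timestamp`), and the helper `summarize` are assumptions made for the illustration, and window functions require SQLite 3.25 or later.

```python
import sqlite3

# Sample rows from the question. t1..t7 are placeholder timestamps that
# happen to sort correctly as strings.
ROWS = [
    ("p1", "s1", None, "t1", "in"),
    ("p2", "s1", "xxx", "t2", "in"),
    ("p1", "s2", "yyy", "t3", "pkin"),
    ("p1", "s3", "sss", "t4", "iwi"),
    ("p1", "s4", "eee", "t5", "lhp"),
    ("p2", "s2", "uuuu", "t6", "uio"),
    ("p2", "s3", None, "t7", "jsk"),
]

QUERY = """
SELECT packageid,
       MAX(CASE WHEN seqnum = 1 THEN scanid END) AS latest_scanid,
       MIN(CASE WHEN dispatchid IS NOT NULL THEN ts END) AS first_dispatch_time,
       MAX(CASE WHEN dispatchid IS NOT NULL THEN ts END) AS last_dispatch_time,
       MAX(CASE WHEN seqnum = 1 THEN status END) AS latest_status
FROM (SELECT p.*,
             ROW_NUMBER() OVER (PARTITION BY packageid ORDER BY ts DESC) AS seqnum
      FROM package AS p) AS ranked
GROUP BY packageid
ORDER BY packageid
"""


def summarize(rows):
    """Load the rows into an in-memory table and return one summary row per package."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE package "
        "(packageid TEXT, scanid TEXT, dispatchid TEXT, ts TEXT, status TEXT)"
    )
    conn.executemany("INSERT INTO package VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in summarize(ROWS):
        print(row)
```

The inner query numbers each package's scans from newest to oldest, so `seqnum = 1` picks the latest scan, while the `CASE` expressions make `MIN` and `MAX` ignore scans without a dispatch ID.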

Answer 1: (score: 0)

I'll note that this is written for SQL Server and may or may not work in MySQL.

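SELECT Packageid,
    -- assumes scan ids sort in the same order as the scans happened
    MAX(Scanid) AS [Latest-Scanid],
    -- only rows with a dispatchid count toward the dispatch times
    MIN(CASE WHEN dispatchid IS NOT NULL THEN timestamp END) AS [First-Dispatch-time],
    MAX(CASE WHEN dispatchid IS NOT NULL THEN timestamp END) AS [Last-Dispatch-time],
    -- status of the most recent scan for this package
    (SELECT TOP 1 p.status FROM Package p WHERE p.Packageid = Package.Packageid ORDER BY p.timestamp DESC) AS [latest-status]
FROM Package
GROUP BY Packageid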

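Answer 2: (score: 0)

The query below uses a "dirty" trick (see not_null_ts) that makes it possible to drop the outer GROUP BY and compute everything in the inner SELECT instead.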

SELECT packageid, latest_scanid, first_dispatch_time, last_dispatch_time, latest_status
FROM (
  SELECT packageid, 
    IF(dispatchid IS NULL, NULL, ts) AS not_null_ts,
    FIRST_VALUE(scanid) OVER(PARTITION BY packageid ORDER BY ts DESC) AS latest_scanid,
    MIN(not_null_ts) OVER(PARTITION BY packageid) AS first_dispatch_time,
    MAX(not_null_ts) OVER(PARTITION BY packageid) AS last_dispatch_time,
    FIRST_VALUE(status) OVER(PARTITION BY packageid ORDER BY ts DESC) AS latest_status,
    ROW_NUMBER() OVER(PARTITION BY packageid ORDER BY not_null_ts DESC) AS line
  FROM YourTable 
)
WHERE line = 1

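I found some time ago that this kind of trick works for me, but I don't think I have ever seen it explicitly documented, perhaps because it is considered obvious. I never gave it much thought.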
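For engines that do not let a window function refer to an alias defined in the same SELECT, a rough equivalent of the window-only approach is to compute not_null_ts one level deeper. The sketch below is an assumption-laden illustration, not the answer's BigQuery query: it uses SQLite (3.25 or later) through Python's sqlite3 module, a hypothetical table named `scans`, a `CASE` expression in place of `IF`, and a hypothetical helper `latest_per_package`.

```python
import sqlite3

# Sample rows from the question; t1..t7 are placeholder timestamps.
ROWS = [
    ("p1", "s1", None, "t1", "in"),
    ("p2", "s1", "xxx", "t2", "in"),
    ("p1", "s2", "yyy", "t3", "pkin"),
    ("p1", "s3", "sss", "t4", "iwi"),
    ("p1", "s4", "eee", "t5", "lhp"),
    ("p2", "s2", "uuuu", "t6", "uio"),
    ("p2", "s3", None, "t7", "jsk"),
]

QUERY = """
SELECT packageid, latest_scanid, first_dispatch_time, last_dispatch_time, latest_status
FROM (
  SELECT packageid,
    FIRST_VALUE(scanid) OVER (PARTITION BY packageid ORDER BY ts DESC) AS latest_scanid,
    MIN(not_null_ts) OVER (PARTITION BY packageid) AS first_dispatch_time,
    MAX(not_null_ts) OVER (PARTITION BY packageid) AS last_dispatch_time,
    FIRST_VALUE(status) OVER (PARTITION BY packageid ORDER BY ts DESC) AS latest_status,
    ROW_NUMBER() OVER (PARTITION BY packageid ORDER BY ts DESC) AS line
  FROM (
    -- not_null_ts is NULL for scans that carry no dispatch id
    SELECT s.*, CASE WHEN dispatchid IS NOT NULL THEN ts END AS not_null_ts
    FROM scans AS s
  ) AS flagged
) AS windowed
WHERE line = 1
ORDER BY packageid
"""


def latest_per_package(rows):
    """Return one summary row per package using window functions only (no GROUP BY)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE scans "
        "(packageid TEXT, scanid TEXT, dispatchid TEXT, ts TEXT, status TEXT)"
    )
    conn.executemany("INSERT INTO scans VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in latest_per_package(ROWS):
        print(row)
```

Every row of a package carries the same window results, so keeping only `line = 1` leaves one row per package without an outer GROUP BY.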