Netezza SQL: compare all values of a field within a partition and find the N records with the smallest differences

Asked: 2014-10-01 17:14:09

Tags: netezza

Caveat: I don't have permission to write or run PROCs, and I can't create or modify temp tables.

Given a large table 'table1' with the fields 'unique_primary_key', 'cust_num', and 'sale_date'. For each distinct value of 'cust_num' there are multiple records, with varying values of 'sale_date'.

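What I need: after partitioning by 'cust_num', find the smallest difference between the 'sale_date' of the record in question and the 'sale_date' of each of the other records in the same partition. I also need the second-smallest and third-smallest of those differences.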

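Each distinct 'cust_num' has between 3 and 75 records in the table (all with different dates), so simply ordering by 'sale_date' within the partition and then explicitly comparing each 'sale_date' value against every other record in the partition is not feasible.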

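I know how to do this easily in Excel, using the INDEX function with MATCH as one of its arguments and so on, but I don't know the equivalent process in SQL.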

1 Answer:

Answer 0: (score: 0)

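If you only want a forward-looking or a backward-looking measure of the date difference, I think you can do this with a simple lag/lead approach. Given this data: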

 select * from table1 order by cust_num, sale_date;
 UNIQUE_PRIMARY_KEY | CUST_NUM | SALE_DATE
--------------------+----------+------------
                  7 |        1 | 2014-10-06
                  2 |        1 | 2014-10-12
                  5 |        1 | 2014-10-14
                  1 |        1 | 2014-10-17
                  4 |        1 | 2014-10-19
                  3 |        1 | 2014-10-22
                  6 |        1 | 2014-10-25
             100008 |        5 | 2014-10-07
             300002 |        5 | 2014-10-13
             100006 |        5 | 2014-10-15
             200003 |        5 | 2014-10-18
             200004 |        5 | 2014-10-20
             100005 |        5 | 2014-10-23
             100007 |        5 | 2014-10-26

Then...

SELECT UNIQUE_PRIMARY_KEY pk,
   CUST_NUM,
   SALE_DATE,
   lead(sale_date,1) over (partition BY cust_num ORDER BY sale_date) lead1,
   lead(sale_date,2) over (partition BY cust_num ORDER BY sale_date) lead2,
   lead(sale_date,3) over (partition BY cust_num ORDER BY sale_date) lead3
FROM table1
ORDER BY CUST_NUM,
   SALE_DATE

   PK   | CUST_NUM | SALE_DATE  |   LEAD1    |   LEAD2    |   LEAD3
--------+----------+------------+------------+------------+------------
      7 |        1 | 2014-10-06 | 2014-10-12 | 2014-10-14 | 2014-10-17
      2 |        1 | 2014-10-12 | 2014-10-14 | 2014-10-17 | 2014-10-19
      5 |        1 | 2014-10-14 | 2014-10-17 | 2014-10-19 | 2014-10-22
      1 |        1 | 2014-10-17 | 2014-10-19 | 2014-10-22 | 2014-10-25
      4 |        1 | 2014-10-19 | 2014-10-22 | 2014-10-25 |
      3 |        1 | 2014-10-22 | 2014-10-25 |            |
      6 |        1 | 2014-10-25 |            |            |
 100008 |        5 | 2014-10-07 | 2014-10-13 | 2014-10-15 | 2014-10-18
 300002 |        5 | 2014-10-13 | 2014-10-15 | 2014-10-18 | 2014-10-20
 100006 |        5 | 2014-10-15 | 2014-10-18 | 2014-10-20 | 2014-10-23
 200003 |        5 | 2014-10-18 | 2014-10-20 | 2014-10-23 | 2014-10-26
 200004 |        5 | 2014-10-20 | 2014-10-23 | 2014-10-26 |
 100005 |        5 | 2014-10-23 | 2014-10-26 |            |
 100007 |        5 | 2014-10-26 |            |            |
(14 rows)
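
The query above returns the neighbouring dates rather than the gaps themselves. A minimal sketch of the same forward-looking idea that returns the gaps in days is shown below. It assumes, as the later query in this answer does, that subtracting two DATE values yields a whole number of days; the column aliases gap1, gap2 and gap3 are made up for illustration.

```sql
-- Sketch only: forward-looking gaps in days, per customer.
-- Assumes DATE - DATE yields an integer number of days.
-- gap1/gap2/gap3 are illustrative aliases, not part of the original answer.
SELECT unique_primary_key AS pk,
   cust_num,
   sale_date,
   lead(sale_date,1) over (partition BY cust_num ORDER BY sale_date) - sale_date AS gap1,
   lead(sale_date,2) over (partition BY cust_num ORDER BY sale_date) - sale_date AS gap2,
   lead(sale_date,3) over (partition BY cust_num ORDER BY sale_date) - sale_date AS gap3
FROM table1
ORDER BY cust_num,
   sale_date;
```

Because the rows are ordered by sale_date within each customer, gap1 <= gap2 <= gap3 wherever all three exist, and the last three rows of each partition get NULLs where there is no later sale.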

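In the more likely case that you want the ABS of the date difference, so that earlier and later sales both count, my brain still wants to use lag and lead. It struggled to do that elegantly, though, and had to cobble it together in a more convoluted way. Since the three closest dates to any record must be among the three sales before it and the three sales after it, the query stacks all six candidates with UNION ALL and then ranks them by distance: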

WITH foo AS
   (
      SELECT UNIQUE_PRIMARY_KEY pk,
         cust_num,
         sale_date,
         lead(sale_date,1) over (partition BY cust_num ORDER BY sale_date)                  rel_sales_date,
         ABS(sale_date - lead(sale_date,1) over (partition BY cust_num ORDER BY sale_date)) day_delta
      FROM table1

      UNION ALL

      SELECT UNIQUE_PRIMARY_KEY pk,
         cust_num,
         sale_date,
         lead(sale_date,2) over (partition BY cust_num ORDER BY sale_date)                  rel_sales_date,
         ABS(sale_date - lead(sale_date,2) over (partition BY cust_num ORDER BY sale_date)) day_delta
      FROM table1

      UNION ALL

      SELECT UNIQUE_PRIMARY_KEY pk,
         cust_num,
         sale_date,
         lead(sale_date,3) over (partition BY cust_num ORDER BY sale_date)                  rel_sales_date,
         ABS(sale_date - lead(sale_date,3) over (partition BY cust_num ORDER BY sale_date)) day_delta
      FROM table1

      UNION ALL

      SELECT UNIQUE_PRIMARY_KEY pk,
         cust_num,
         sale_date,
         lag(sale_date,1) over (partition BY cust_num ORDER BY sale_date)                  rel_sales_date,
         ABS(sale_date - lag(sale_date,1) over (partition BY cust_num ORDER BY sale_date)) day_delta
      FROM table1

      UNION ALL

      SELECT UNIQUE_PRIMARY_KEY pk,
         cust_num,
         sale_date,
         lag(sale_date,2) over (partition BY cust_num ORDER BY sale_date)                  rel_sales_date,
         ABS(sale_date - lag(sale_date,2) over (partition BY cust_num ORDER BY sale_date)) day_delta
      FROM table1

      UNION ALL

      SELECT UNIQUE_PRIMARY_KEY pk,
         cust_num,
         sale_date,
         lag(sale_date,3) over (partition BY cust_num ORDER BY sale_date)                  rel_sales_date,
         ABS(sale_date - lag(sale_date,3) over (partition BY cust_num ORDER BY sale_date)) day_delta
      FROM table1
   )
SELECT pk,
   cust_num,
   sale_date,
   rel_sales_date,
   day_delta,
   day_rank
FROM (
      SELECT pk ,
         cust_num,
         sale_date,
         rel_sales_date,
         day_delta,
         dense_rank() over (partition BY pk ORDER BY day_delta nulls last) day_rank
      FROM foo
   )
   foob
WHERE day_rank <= 3
ORDER BY cust_num,
   sale_date,
   day_rank


   PK   | CUST_NUM | SALE_DATE  | REL_SALES_DATE | DAY_DELTA | DAY_RANK
--------+----------+------------+----------------+-----------+----------
      7 |        1 | 2014-10-06 | 2014-10-12     |         6 |        1
      7 |        1 | 2014-10-06 | 2014-10-14     |         8 |        2
      7 |        1 | 2014-10-06 | 2014-10-17     |        11 |        3
      2 |        1 | 2014-10-12 | 2014-10-14     |         2 |        1
      2 |        1 | 2014-10-12 | 2014-10-17     |         5 |        2
      2 |        1 | 2014-10-12 | 2014-10-06     |         6 |        3
      5 |        1 | 2014-10-14 | 2014-10-12     |         2 |        1
      5 |        1 | 2014-10-14 | 2014-10-17     |         3 |        2
      5 |        1 | 2014-10-14 | 2014-10-19     |         5 |        3
      1 |        1 | 2014-10-17 | 2014-10-19     |         2 |        1
      1 |        1 | 2014-10-17 | 2014-10-14     |         3 |        2
      1 |        1 | 2014-10-17 | 2014-10-12     |         5 |        3
      1 |        1 | 2014-10-17 | 2014-10-22     |         5 |        3
      4 |        1 | 2014-10-19 | 2014-10-17     |         2 |        1
      4 |        1 | 2014-10-19 | 2014-10-22     |         3 |        2
      4 |        1 | 2014-10-19 | 2014-10-14     |         5 |        3
      3 |        1 | 2014-10-22 | 2014-10-25     |         3 |        1
      3 |        1 | 2014-10-22 | 2014-10-19     |         3 |        1
      3 |        1 | 2014-10-22 | 2014-10-17     |         5 |        2
      3 |        1 | 2014-10-22 | 2014-10-14     |         8 |        3
      6 |        1 | 2014-10-25 | 2014-10-22     |         3 |        1
      6 |        1 | 2014-10-25 | 2014-10-19     |         6 |        2
      6 |        1 | 2014-10-25 | 2014-10-17     |         8 |        3
 100008 |        5 | 2014-10-07 | 2014-10-13     |         6 |        1
 100008 |        5 | 2014-10-07 | 2014-10-15     |         8 |        2
 100008 |        5 | 2014-10-07 | 2014-10-18     |        11 |        3
 300002 |        5 | 2014-10-13 | 2014-10-15     |         2 |        1
 300002 |        5 | 2014-10-13 | 2014-10-18     |         5 |        2
 300002 |        5 | 2014-10-13 | 2014-10-07     |         6 |        3
 100006 |        5 | 2014-10-15 | 2014-10-13     |         2 |        1
 100006 |        5 | 2014-10-15 | 2014-10-18     |         3 |        2
 100006 |        5 | 2014-10-15 | 2014-10-20     |         5 |        3
 200003 |        5 | 2014-10-18 | 2014-10-20     |         2 |        1
 200003 |        5 | 2014-10-18 | 2014-10-15     |         3 |        2
 200003 |        5 | 2014-10-18 | 2014-10-13     |         5 |        3
 200003 |        5 | 2014-10-18 | 2014-10-23     |         5 |        3
 200004 |        5 | 2014-10-20 | 2014-10-18     |         2 |        1
 200004 |        5 | 2014-10-20 | 2014-10-23     |         3 |        2
 200004 |        5 | 2014-10-20 | 2014-10-15     |         5 |        3
 100005 |        5 | 2014-10-23 | 2014-10-20     |         3 |        1
 100005 |        5 | 2014-10-23 | 2014-10-26     |         3 |        1
 100005 |        5 | 2014-10-23 | 2014-10-18     |         5 |        2
 100005 |        5 | 2014-10-23 | 2014-10-15     |         8 |        3
 100007 |        5 | 2014-10-26 | 2014-10-23     |         3 |        1
 100007 |        5 | 2014-10-26 | 2014-10-20     |         6 |        2
 100007 |        5 | 2014-10-26 | 2014-10-18     |         8 |        3
(46 rows)
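
Two things are worth noting about this output. dense_rank() keeps ties, so some records return more than three rows (pk 1 and pk 3 above). And because NULL deltas are ranked last rather than removed, a customer with only three records would presumably return NULL rows at rank 3. If exactly up to three real neighbours per record are wanted, one alternative is a plain self-join instead of lag/lead. This is a different technique from the query above and only a sketch: the names neighbours and ranked are hypothetical, and on a very large table the join may be heavier than the lag/lead version, although each customer contributes at most 75 x 74 pairs.

```sql
-- Sketch only: self-join alternative, not the approach used above.
-- Pairs every record with every other record of the same customer,
-- then keeps the three closest by absolute day difference.
-- "neighbours" and "ranked" are illustrative names.
WITH neighbours AS
   (
      SELECT a.unique_primary_key AS pk,
         a.cust_num,
         a.sale_date,
         b.sale_date                    AS rel_sales_date,
         ABS(a.sale_date - b.sale_date) AS day_delta
      FROM table1 a
      JOIN table1 b
         ON a.cust_num = b.cust_num
         AND a.unique_primary_key <> b.unique_primary_key
   )
SELECT pk,
   cust_num,
   sale_date,
   rel_sales_date,
   day_delta,
   day_rank
FROM (
      SELECT pk,
         cust_num,
         sale_date,
         rel_sales_date,
         day_delta,
         -- row_number() breaks ties by the earlier related date,
         -- so each pk yields at most three rows
         row_number() over (partition BY pk ORDER BY day_delta, rel_sales_date) AS day_rank
      FROM neighbours
   )
   ranked
WHERE day_rank <= 3
ORDER BY cust_num,
   sale_date,
   day_rank;
```

Swapping row_number() back to dense_rank() would restore the tie-keeping behaviour of the original query while still avoiding the NULL rows.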

Hope this helps. If not, at least I got some mental exercise this Friday afternoon.