SQL: How to "aggregate sequences"?

Asked: 2018-06-21 09:16:26

Tags: sql teradata

My problem is a bit hard to explain. I have a table that looks like this:

column1 | column2 | date
------------------------------------------                                      
  u01     test       2001-01-01
  u01     test       2001-02-01
  u01     test2      2001-03-01
  u01     test2      2001-04-01
  u01     test3      2001-05-01
  u01     test       2001-06-01

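In the target table I want to aggregate identical values, but only if they directly follow each other. That means my target table would look like this: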

column1 | column2 | validfrom  | validto
------------------------------------------------
  u01     test       2001-01-01   2001-03-01
  u01     test2      2001-03-01   2001-05-01
  u01     test3      2001-05-01   2001-06-01
  u01     test       2001-06-01

I tried using row numbers, so at the moment I get some numbered rows, but the problem remains: I don't know how to "aggregate the sequences".


Any ideas or approaches are appreciated!

3 answers:

Answer 0 (score: 3)

Teradata has a nice extension for normalizing periods:

SELECT
   column1
  ,column2
  -- split the Period into separate columns again
  ,Begin(pd)
  ,NullIf(End(pd), DATE '9999-12-31')
FROM
 (
   SELECT NORMALIZE -- normalize overlapping periods
      column1
     ,column2
      -- NORMALIZE only works with periods, so create a Period based on current & next row
     ,PERIOD(date
            ,Coalesce(Lead(date) 
                      Over (PARTITION BY column1 
                            ORDER BY date)
                     ,DATE '9999-12-31')
            ) AS pd
   FROM tab
 ) AS dt
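
For readers without a Teradata system at hand, the effect of the query above can be imitated in plain Python. This is only a sketch of the idea, not of how Teradata implements NORMALIZE: the helper name `normalize_periods` and the in-memory row list are assumptions made for illustration. It builds a half-open period from each row to the next row of the same column1 (what the LEAD expression does), then merges periods that meet and carry the same column2.

```python
from datetime import date

OPEN_END = date(9999, 12, 31)  # same sentinel the query uses for "no end yet"

# sample rows from the question: (column1, column2, date)
rows = [
    ("u01", "test", date(2001, 1, 1)),
    ("u01", "test", date(2001, 2, 1)),
    ("u01", "test2", date(2001, 3, 1)),
    ("u01", "test2", date(2001, 4, 1)),
    ("u01", "test3", date(2001, 5, 1)),
    ("u01", "test", date(2001, 6, 1)),
]


def normalize_periods(rows):
    """Hypothetical helper: one period per row, then merge meeting periods."""
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))
    result = []
    for i, (c1, c2, start) in enumerate(ordered):
        # period end = date of the next row with the same column1, else the sentinel
        nxt = ordered[i + 1] if i + 1 < len(ordered) else None
        end = nxt[2] if nxt is not None and nxt[0] == c1 else OPEN_END
        last = result[-1] if result else None
        if last and last[0] == c1 and last[1] == c2 and last[3] == start:
            # previous period meets this one and has the same values: extend it
            result[-1] = (c1, c2, last[2], end)
        else:
            result.append((c1, c2, start, end))
    # mirror NullIf(End(pd), DATE '9999-12-31'): open-ended periods get no end
    return [(c1, c2, s, None if e == OPEN_END else e) for c1, c2, s, e in result]


periods = normalize_periods(rows)
for p in periods:
    print(p)
```

With the sample data this yields four periods, and the second run of `test` stays separate from the first because a `test2` and a `test3` period lie between them.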

If your Teradata version does not support LEAD, you can use this instead:

Min(date) 
Over  (PARTITION BY column1 
       ORDER BY date
       ROWS BETWEEN 1 Following and 1 Following)
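
One way to convince yourself that the windowed MIN behaves like LEAD is to run both side by side. The sketch below uses Python's built-in sqlite3 module instead of Teradata, which is an assumption on my part: it needs SQLite 3.25 or later for window functions, and the date column is renamed to `dt` to avoid any clash with a keyword.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tab (column1 TEXT, column2 TEXT, dt TEXT)")
con.executemany(
    "INSERT INTO tab VALUES (?, ?, ?)",
    [
        ("u01", "test", "2001-01-01"),
        ("u01", "test", "2001-02-01"),
        ("u01", "test2", "2001-03-01"),
        ("u01", "test2", "2001-04-01"),
        ("u01", "test3", "2001-05-01"),
        ("u01", "test", "2001-06-01"),
    ],
)

# A frame of exactly one following row contains only the next row,
# so MIN over that frame returns the next row's value (NULL on the last row).
rows = con.execute(
    """
    SELECT dt,
           LEAD(dt) OVER (PARTITION BY column1 ORDER BY dt) AS via_lead,
           MIN(dt)  OVER (PARTITION BY column1 ORDER BY dt
                          ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS via_min
    FROM tab
    ORDER BY dt
    """
).fetchall()

for dt, via_lead, via_min in rows:
    print(dt, via_lead, via_min)
```

Both columns agree on every row, including the last one, where the frame is empty and the result is NULL.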

Answer 1 (score: 1)

This is a gaps-and-islands problem. Here is a solution that uses row numbers:

select column1, column2, min(date), max(date)
from (select t.*,
             row_number() over (partition by column1 order by date) as seqnum_1,
             row_number() over (partition by column1, column2 order by date) as seqnum_2
      from t
     ) t
group by column1, column2, (seqnum_1 - seqnum_2);

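Why this works is a little hard to explain. I find it becomes obvious once you look at the results of the subquery: within a run of identical values both row numbers grow by one per row, so their difference stays constant, and you will see how that difference defines the groups you are looking for.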
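
To actually look at the subquery, here is a small sketch that runs the same logic on the question's sample data. It uses Python's sqlite3 module rather than Teradata (an assumption; SQLite 3.25 or later is needed for ROW_NUMBER), and the date column is called `dt` here.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (column1 TEXT, column2 TEXT, dt TEXT)")
con.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("u01", "test", "2001-01-01"),
        ("u01", "test", "2001-02-01"),
        ("u01", "test2", "2001-03-01"),
        ("u01", "test2", "2001-04-01"),
        ("u01", "test3", "2001-05-01"),
        ("u01", "test", "2001-06-01"),
    ],
)

# the subquery from the answer, with the date column renamed to dt
numbered = """
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY column1 ORDER BY dt) AS seqnum_1,
           ROW_NUMBER() OVER (PARTITION BY column1, column2 ORDER BY dt) AS seqnum_2
    FROM t
"""

# step 1: inspect both row numbers and their difference per row
inner = con.execute(
    f"SELECT column2, dt, seqnum_1, seqnum_2, seqnum_1 - seqnum_2 AS diff "
    f"FROM ({numbered}) ORDER BY dt"
).fetchall()
for row in inner:
    print(row)

# step 2: group by the difference, as the answer's outer query does
islands = con.execute(
    f"SELECT column1, column2, MIN(dt), MAX(dt) FROM ({numbered}) "
    f"GROUP BY column1, column2, (seqnum_1 - seqnum_2) ORDER BY MIN(dt)"
).fetchall()
for row in islands:
    print(row)
```

The differences come out as 0, 0, 2, 2, 4, 3, so the two `test` runs land in different groups even though they share the same column2. Note that max(date) is the last date inside each group (2001-02-01 for the first one), not the start of the next group, so producing the question's validto column would still take a further step such as LEAD over the grouped rows.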

Answer 2 (score: 0)

This should work for your case:

        -- Assumes MySQL-style session variables and a source table test(column1, column2, DATE_1).
        SET @rnum = 0;
        SET @col1 = '';
        SET @col2 = '';

        SELECT g.col1,
               g.col2,
               g.valid_from,
               -- valid_to is the first date after the run ends, i.e. the start of the next run
               (SELECT MIN(n.DATE_1)
                FROM test n
                WHERE n.column1 = g.col1
                  AND n.DATE_1 > g.last_date) AS valid_to
        FROM (
            SELECT col1, col2, num,
                   MIN(DATE_1) AS valid_from,
                   MAX(DATE_1) AS last_date
            FROM (
                SELECT
                    -- keep the run number while column1 and column2 repeat, otherwise start a new run
                    @rnum := CASE
                                 WHEN @col1 = o.column1 AND @col2 = o.column2 THEN @rnum
                                 ELSE @rnum + 1
                             END AS num,
                    @col1 := o.column1 AS col1,
                    @col2 := o.column2 AS col2,
                    o.DATE_1
                -- the variables must see the rows in date order; MySQL does not strictly guarantee this
                FROM (SELECT column1, column2, DATE_1
                      FROM test
                      ORDER BY column1, DATE_1) o
            ) numbered
            GROUP BY col1, col2, num
        ) g
        ORDER BY g.col1, g.valid_from;