Self-join with QUALIFY RANK() OVER (PARTITION BY ...) causes a spool space error

Date: 2017-06-19 12:26:10

Tags: sql teradata

I am performing a self-join on a simplified table with two columns, SomeId and Measure1 (there is a primary index on SomeId). This is the simplified query:

SELECT
    One.SomeId AS SomeIdOne
    ,Two.SomeId AS SomeIdTwo
FROM SomeTable AS One
INNER JOIN SomeTable AS Two
ON 
    One.SomeId <> Two.SomeId
QUALIFY RANK() OVER (PARTITION BY One.SomeId ORDER BY 
    (One.Measure1 - Two.Measure1) ASC) = 1

Is there anything I can do to avoid the spool space error?

PS:

Simplified example:

DROP TABLE SomeTable;
CREATE VOLATILE TABLE SomeTable
(
        SomeId INT,
        Measure1 DECIMAL(14,4)
) ON COMMIT PRESERVE ROWS;
INSERT INTO SomeTable (SomeId, Measure1) VALUES (1, 3.0);
INSERT INTO SomeTable (SomeId, Measure1) VALUES (2, 4.0);
INSERT INTO SomeTable (SomeId, Measure1) VALUES (3, 5.0);

Desired result:

SomeIdOne   SomeIdTwo   Distance
1   2   1.0000
2   1   1.0000
3   2   2.0000

A possible but inefficient query (see the question):

SELECT
    One.SomeId AS SomeIdOne
    ,Two.SomeId AS SomeIdTwo
    ,ABS(One.Measure1 - Two.Measure1) AS Distance
FROM SomeTable AS One
INNER JOIN SomeTable AS Two
ON 
    One.SomeId <> Two.SomeId
QUALIFY ROW_NUMBER() OVER (PARTITION BY One.SomeId ORDER BY 
    (ABS(One.Measure1 - Two.Measure1)) ASC) = 1; 

1 answer:

Answer 0 (score: 2)

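When the data is ordered by the measure, the minimum distance is easy to find: the nearest row is either the previous row or the next row.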

SELECT 
   dt.SomeId AS SomeIdOne
  ,CASE 
     WHEN prevDist IS NULL    THEN nextID
     WHEN nextDist IS NULL    THEN prevID
     WHEN prevDist < nextDist THEN prevID
     ELSE nextID
   END AS SomeIdTwo
  ,CASE 
     WHEN prevDist IS NULL    THEN nextDist -- first row
     WHEN nextDist IS NULL    THEN prevDist -- last row
     WHEN prevDist < nextDist THEN prevDist -- choose the smaller value
     ELSE nextDist
   END AS Distance
FROM 
 (
   SELECT SomeId
     ,Measure1 -
      Min(Measure1) -- previous row
      Over (ORDER BY Measure1 
            ROWS BETWEEN 1 Preceding AND 1 Preceding) AS prevDist
     ,Min(Measure1) -- next row 
      Over (ORDER BY Measure1 
            ROWS BETWEEN 1 Following AND 1 Following)
      - Measure1                                      AS nextDist
     ,Min(SomeId) -- previous row 
      Over (ORDER BY Measure1 
            ROWS BETWEEN 1 Preceding AND 1 Preceding) AS prevID
     ,Min(SomeId) -- next row  
      Over (ORDER BY Measure1 
            ROWS BETWEEN 1 Following AND 1 Following) AS nextID
   FROM SomeTable AS t
 ) AS dt
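If your Teradata release supports the ANSI LAG and LEAD window functions (an assumption; older releases do not have them), the derived table can probably be written more compactly. This is only a sketch of the same previous-row/next-row lookup, using the same column aliases as the query above:

```sql
-- Sketch: same neighbour lookup as the derived table above,
-- assuming LAG/LEAD are available on your release.
SELECT SomeId
  ,Measure1 - LAG(Measure1)  OVER (ORDER BY Measure1) AS prevDist -- NULL on the first row
  ,LEAD(Measure1) OVER (ORDER BY Measure1) - Measure1 AS nextDist -- NULL on the last row
  ,LAG(SomeId)    OVER (ORDER BY Measure1)            AS prevID
  ,LEAD(SomeId)   OVER (ORDER BY Measure1)            AS nextID
FROM SomeTable AS t;
```

Either way the table is sorted once instead of being joined to itself, so the intermediate result stays at one row per input row.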

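In case of a tie, this currently returns an arbitrary row. You can add another column to the ORDER BY and change the CASE logic to get a specific row, for example the one with the higher or lower Id.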
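A minimal sketch of such a tie-break, assuming you want the neighbour with the lower SomeId when both distances are equal (that preference is my assumption, not something stated in the question). SomeId is added to every ORDER BY so that rows with the same Measure1 are sorted deterministically, and the CASE compares the Ids when the distances match:

```sql
SELECT
   dt.SomeId AS SomeIdOne
  ,CASE
     WHEN prevDist IS NULL    THEN nextID   -- first row
     WHEN nextDist IS NULL    THEN prevID   -- last row
     WHEN prevDist < nextDist THEN prevID
     WHEN prevDist > nextDist THEN nextID
     WHEN prevID < nextID     THEN prevID   -- tie: prefer the lower Id
     ELSE nextID
   END AS SomeIdTwo
  ,CASE
     WHEN prevDist IS NULL    THEN nextDist
     WHEN nextDist IS NULL    THEN prevDist
     WHEN prevDist < nextDist THEN prevDist
     ELSE nextDist                          -- equal distances: either value is correct
   END AS Distance
FROM
 (
   SELECT SomeId
     ,Measure1 -
      MIN(Measure1)
      OVER (ORDER BY Measure1, SomeId
            ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS prevDist
     ,MIN(Measure1)
      OVER (ORDER BY Measure1, SomeId
            ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING)
      - Measure1                                      AS nextDist
     ,MIN(SomeId)
      OVER (ORDER BY Measure1, SomeId
            ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS prevID
     ,MIN(SomeId)
      OVER (ORDER BY Measure1, SomeId
            ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS nextID
   FROM SomeTable AS t
 ) AS dt;
```

To prefer the higher Id instead, flip the comparison in the tie branch to prevID > nextID.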

By the way, granted that the data is sensitive, but can you explain the actual business problem behind this?