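vsql / Vertica: select the top 5 rows per group

Time: 2019-01-25 14:51:09

Tags: sql vertica

I have this query, which is supposed to fetch the top n rows of grouped data. I use RANK() OVER (PARTITION BY ...) to determine the top n rows of each group: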

SELECT X.USERID, X.ARTID, X.AVGTIMEONPAGE,EDP.ARTDSC,
RANK() OVER (PARTITION BY X.USERID ORDER BY X.AVGTIMEONPAGE DESC) as rank
FROM
(SELECT GANG.userID AS USERID,GANG.avgTimeOnPage AS AVGTIMEONPAGE,   
split_part(GANG.pageTitle,' -',1) as ARTID
FROM GoogleAnalytics.navigazioneG AS GANG
WHERE GANG.pagePath LIKE '%DataSheets%' ) AS X
LEFT JOIN ESPDDS.ESP_DPRODUCT AS EDP
ON EDP.ARTID=X.ARTID AND EDP.SCD_IS_CURRENT=1
AND EDP.COMPANYID=1
WHERE X.ARTID NOT LIKE '%Company%' AND rank in (1,2,3,4,5)

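It gives me an error stating that the rank column does not exist. If I comment out the last part of the WHERE clause, I can see that the rank column is calculated correctly.

Thanks

2 answers:

Answer 0 (score: 2)

The WHERE clause is evaluated before the SELECT clause, so the alias rank is not yet known at that point. You can wrap the query in an additional subquery to get access to it: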

SELECT *
FROM
(
  SELECT 
    X.USERID, 
    X.ARTID, 
    X.AVGTIMEONPAGE,
    EDP.ARTDSC,
    RANK() OVER (PARTITION BY X.USERID ORDER BY X.AVGTIMEONPAGE DESC) as rank
  FROM
  (
    SELECT 
      GANG.userID AS USERID,
      GANG.avgTimeOnPage AS AVGTIMEONPAGE,   
      split_part(GANG.pageTitle,' -',1) as ARTID
    FROM GoogleAnalytics.navigazioneG AS GANG
    WHERE GANG.pagePath LIKE '%DataSheets%' 
  ) AS X
  LEFT JOIN ESPDDS.ESP_DPRODUCT AS EDP ON EDP.ARTID = X.ARTID
                                      AND EDP.SCD_IS_CURRENT = 1 
                                      AND EDP.COMPANYID = 1
  WHERE X.ARTID NOT LIKE '%Company%' 
) ranked
WHERE rank in (1,2,3,4,5);
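
To try the pattern without the original tables, here is a minimal sketch on made-up inline data. The derived table visits, its column values, and the alias rnk are hypothetical, and the limit is 2 instead of 5 only to keep the result short. The window function is computed in the inner query and filtered in the outer one:

```sql
-- Hypothetical sample data; names and values are made up for illustration.
SELECT userid, artid, avgtimeonpage, rnk
FROM (
  SELECT userid, artid, avgtimeonpage,
         RANK() OVER (PARTITION BY userid ORDER BY avgtimeonpage DESC) AS rnk
  FROM (
    SELECT 1 AS userid, 'A1' AS artid, 30 AS avgtimeonpage
    UNION ALL SELECT 1, 'A2', 50
    UNION ALL SELECT 1, 'A3', 10
    UNION ALL SELECT 2, 'B1', 20
  ) AS visits
) AS ranked
WHERE rnk <= 2   -- allowed here: rnk is an ordinary column of the derived table
ORDER BY userid, rnk;
```

For user 1 this should keep A2 and A1 and drop A3; user 2 has a single row, which is kept.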

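Answer 1 (score: 1)

The reason for the error is that the rank alias is not available at the same query level where it is defined. As a side note, consider using the dense_rank function, because it does not skip numbers when there are ties.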

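SELECT USERID, ARTID, AVGTIMEONPAGE, ARTDSC, rank
FROM
(SELECT GANG.userID AS USERID
       ,GANG.avgTimeOnPage AS AVGTIMEONPAGE
       ,split_part(GANG.pageTitle,' -',1) AS ARTID
       ,EDP.ARTDSC
       -- DENSE_RANK does not skip numbers after ties
       ,DENSE_RANK() OVER (PARTITION BY GANG.userID ORDER BY GANG.avgTimeOnPage DESC) AS rank
FROM GoogleAnalytics.navigazioneG AS GANG
-- the ARTID alias is not visible at this level, so the expression is repeated
LEFT JOIN ESPDDS.ESP_DPRODUCT AS EDP ON EDP.ARTID = split_part(GANG.pageTitle,' -',1)
AND EDP.SCD_IS_CURRENT=1
AND EDP.COMPANYID=1
WHERE GANG.pagePath LIKE '%DataSheets%'
-- filter before ranking so excluded rows do not take up rank positions
AND split_part(GANG.pageTitle,' -',1) NOT LIKE '%Company%'
) T
WHERE rank <= 5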
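
To see how the ranking functions differ on ties, here is a small sketch on made-up inline data (the derived table visits and its values are hypothetical). It computes RANK, DENSE_RANK and ROW_NUMBER side by side over the same ordering:

```sql
-- Hypothetical sample data with a tie on the two highest values.
SELECT artid, avgtimeonpage,
       RANK()       OVER (ORDER BY avgtimeonpage DESC) AS rnk,
       DENSE_RANK() OVER (ORDER BY avgtimeonpage DESC) AS dense_rnk,
       ROW_NUMBER() OVER (ORDER BY avgtimeonpage DESC) AS row_num
FROM (
  SELECT 'A1' AS artid, 50 AS avgtimeonpage
  UNION ALL SELECT 'A2', 50
  UNION ALL SELECT 'A3', 30
  UNION ALL SELECT 'A4', 10
) AS visits
ORDER BY avgtimeonpage DESC;
-- avgtimeonpage: 50, 50, 30, 10
-- rnk:            1,  1,  3,  4   (2 is skipped after the tie)
-- dense_rnk:      1,  1,  2,  3   (no gap)
-- row_num:        1,  2,  3,  4   (order within the tie is not guaranteed)
```

Note that with ties both rank <= 5 and dense_rank <= 5 can return more than five rows per group. If exactly five rows per group are required, ROW_NUMBER is the usual choice, at the cost of an arbitrary pick among tied rows.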