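I'm using PostgreSQL in SQLWorkbenchJ and I'm struggling.
I have a WITH statement that selects dates by row number. If the statement can't find a given row number, I want NULL in that date field. That doesn't happen at the moment: the query only returns records where all fields are non-null. I assume it has something to do with the joins, but I'm not sure.
The current statement is below. It should return about 50,000 records, but it currently returns fewer than 2,000.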
WITH FifthEnquiry AS
(
SELECT emailaddress,
SentDate,
ROW_NUMBER() OVER (PARTITION BY emailaddress ORDER BY COUNT(*) DESC) AS rk
FROM SentEmails
GROUP BY emailaddress,
SentDate
),
TenthEnquiry AS
(
SELECT emailaddress,
SentDate,
ROW_NUMBER() OVER (PARTITION BY emailaddress ORDER BY COUNT(*) DESC) AS rk
FROM SentEmails
GROUP BY emailaddress,
SentDate
),
TwentiethEnquiry AS
(
SELECT emailaddress,
SentDate,
ROW_NUMBER() OVER (PARTITION BY emailaddress ORDER BY COUNT(*) DESC) AS rk
FROM SentEmails
GROUP BY emailaddress,
SentDate
)
SELECT FifthEnquiry.emailaddress,
FifthEnquiry.SentDate AS Fifth,
TenthEnquiry.SentDate AS Tenth,
TwentiethEnquiry.SentDate AS Twentieth
FROM FifthEnquiry
JOIN TenthEnquiry ON FifthEnquiry.emailaddress = TenthEnquiry.emailaddress
JOIN TwentiethEnquiry ON FifthEnquiry.emailaddress = TwentiethEnquiry.emailaddress
WHERE (FifthEnquiry.rk = 5)
AND (TenthEnquiry.rk = 10)
AND (TwentiethEnquiry.rk = 20)
Answer 0 (score: 3)
You can simplify this a lot. Use LEFT JOIN to keep every email address that has at least 5 rows after the GROUP BY, even if there is no 10th or 20th row:
WITH cte AS (
SELECT emailaddress, SentDate,
ROW_NUMBER() OVER (PARTITION BY emailaddress
ORDER BY COUNT(*) DESC, SentDate) AS rn
FROM SentEmails
GROUP BY 1,2
)
SELECT enq05.emailaddress,
enq05.SentDate AS fifth,
enq10.SentDate AS tenth,
enq20.SentDate AS twentieth
FROM cte AS enq05
LEFT JOIN cte AS enq10 ON enq10.emailaddress = enq05.emailaddress
AND enq10.rn = 10
LEFT JOIN cte AS enq20 ON enq20.emailaddress = enq05.emailaddress
AND enq20.rn = 20
WHERE enq05.rn = 5;
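You don't need separate CTEs, since all three do the same thing. A single CTE is enough and should be faster. Use self-joins with different table aliases in the outer query instead.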
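Since we are now using LEFT JOIN, it matters whether additional conditions go in the JOIN clause or in the WHERE clause. A condition on the right-hand table in the WHERE clause effectively forces Postgres to treat the join as a plain [INNER] JOIN, because rows without a match have NULL in those columns and fail the filter. I moved the conditions to the JOIN clause accordingly.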
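The difference between filtering in ON and filtering in WHERE can be seen in a minimal sketch. The two inline tables and their values are made up for illustration; they stand in for the enq05 and enq10 aliases of the query above.

```sql
-- Hypothetical sample data: two addresses, only the first has a row with rn = 10.
-- Filter in the ON clause: both addresses are returned,
-- b@example.com with rn = NULL.
WITH enq05(emailaddress) AS (
   VALUES ('a@example.com'), ('b@example.com')
), enq10(emailaddress, rn) AS (
   VALUES ('a@example.com', 10)
)
SELECT enq05.emailaddress, enq10.rn
FROM   enq05
LEFT   JOIN enq10 ON enq10.emailaddress = enq05.emailaddress
                 AND enq10.rn = 10;

-- Same data, filter in the WHERE clause: only a@example.com is returned,
-- because rn IS NULL for the unmatched row and "NULL = 10" is not true.
WITH enq05(emailaddress) AS (
   VALUES ('a@example.com'), ('b@example.com')
), enq10(emailaddress, rn) AS (
   VALUES ('a@example.com', 10)
)
SELECT enq05.emailaddress, enq10.rn
FROM   enq05
LEFT   JOIN enq10 ON enq10.emailaddress = enq05.emailaddress
WHERE  enq10.rn = 10;
```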
Use rn rather than rk as the column alias. It is a "row number", not a "rank". Note the important difference in behavior between row_number() and rank().
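A small sketch of that difference, using made-up values with one tie: row_number() always hands out distinct consecutive numbers, while rank() gives tied rows the same number and then leaves a gap.

```sql
-- Hypothetical values; 10 appears twice to produce a tie.
SELECT v,
       row_number() OVER (ORDER BY v) AS rn,  -- 1, 2, 3
       rank()       OVER (ORDER BY v) AS rk   -- 1, 1, 3
FROM  (VALUES (10), (10), (20)) AS t(v)
ORDER  BY v, rn;
```

With rank(), a filter such as rk = 5 can match no row or several rows for one email address, which is why row_number() is the better fit for picking "the fifth row".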
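Add SentDate to the ORDER BY as a tiebreaker among (emailaddress, SentDate) groups with the same count, to get a stable sort order. The way I have it, rows where SentDate IS NULL sort last within each group. If you sort descending instead, you may want to add NULLS LAST (this does not apply to COUNT(*), which is never NULL).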
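As a sketch of the descending variant: in Postgres, NULLs sort first by default under DESC, so NULLS LAST has to be spelled out. The inline SentEmails data below is invented for illustration and shadows the real table only inside this statement.

```sql
-- Hypothetical sample rows standing in for the real SentEmails table.
WITH SentEmails(emailaddress, SentDate) AS (
   VALUES ('a@example.com', DATE '2020-01-01')
        , ('a@example.com', DATE '2020-02-01')
        , ('a@example.com', NULL)
)
SELECT emailaddress, SentDate,
       ROW_NUMBER() OVER (PARTITION BY emailaddress
                          ORDER BY COUNT(*) DESC, SentDate DESC NULLS LAST) AS rn
FROM   SentEmails
GROUP  BY 1, 2
ORDER  BY rn;
-- Every group has COUNT(*) = 1, so the tiebreaker decides:
-- 2020-02-01 gets rn 1, 2020-01-01 gets rn 2, the NULL date gets rn 3.
```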
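Another subtle detail to be aware of: tenth and twentieth can be NULL in the result for two distinct reasons if SentDate can be NULL in the underlying table. A NULL in tenth can mean that the email address has fewer than 10 rows, or it can mean that a NULL SentDate sorts to the 10th position according to your sort order.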
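If the two cases need to be told apart, one possible approach (not part of the original answer) is to also select a flag that checks whether the joined row exists at all. The has_tenth column and the generated sample data below are assumptions made for this sketch.

```sql
-- Hypothetical data: a@example.com has 9 dated rows plus one NULL date (10 rows),
-- b@example.com has only 5 rows.
WITH SentEmails(emailaddress, SentDate) AS (
   SELECT 'a@example.com', DATE '2020-01-01' + g FROM generate_series(0, 8) g
   UNION ALL
   SELECT 'a@example.com', NULL
   UNION ALL
   SELECT 'b@example.com', DATE '2020-01-01' + g FROM generate_series(0, 4) g
), cte AS (
   SELECT emailaddress, SentDate,
          ROW_NUMBER() OVER (PARTITION BY emailaddress
                             ORDER BY COUNT(*) DESC, SentDate) AS rn
   FROM   SentEmails
   GROUP  BY 1, 2
)
SELECT enq05.emailaddress,
       enq10.SentDate                  AS tenth,
       enq10.emailaddress IS NOT NULL  AS has_tenth  -- true only if a 10th row exists
FROM   cte AS enq05
LEFT   JOIN cte AS enq10 ON enq10.emailaddress = enq05.emailaddress
                        AND enq10.rn = 10
WHERE  enq05.rn = 5
ORDER  BY 1;
-- a@example.com: tenth is NULL, has_tenth is true  (the NULL date sorts 10th)
-- b@example.com: tenth is NULL, has_tenth is false (no 10th row)
```

The flag relies on emailaddress being non-null in the joined row, which the join condition itself guarantees.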