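I'm trying to get a six-month trend of the entities in our database that meet certain criteria, but the problem is that I have to go several levels deep to determine whether an entity qualifies.
The entities are "members", each of which may have multiple "accounts", and I need to make sure that none of a member's accounts has any flags set before I include that member.
If I wanted a count for one specific date (we keep historical data), I would do something like this: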
SELECT COUNT(sup.SSN)
FROM MemberSuppTable as sup
WHERE (
    sup.ProcessDate = @PROCESSDATE
    AND sup.MemberSuppID IN (
        SELECT summ.MemberSuppID
        FROM MemberSummaryTable as summ
        WHERE (
            summ.ProcessDate = @PROCESSDATE
            AND summ.AccountNumber IN (
                SELECT acct.AccountNumber
                FROM AccountTable as acct
                WHERE (
                    acct.ProcessDate = @PROCESSDATE
                    --other criteria for account exclusion go here.
                )
            )
        )
    )
)
MemberSuppTable holds high-level information about each member: (ID, FirstAccountOpenDate, status, etc)
MemberSummaryTable links accounts to MemberSuppTable: (AccountNumber, MemberSuppID, ...)
Now I'm trying to get the counts for the month-end process dates, grouped by process date, in a single query.
So where the query above returns
ssn count
----------
1,000,000
I want:
process date | ssn count
------------------------
20160430 | 8,000,000
20160531 | 8,500,000
... | ...
20160331 | 1,000,000
So far I have come up with the following (see below for why it doesn't work):
WITH valid_dates AS (
    SELECT d.ProcessDate
    FROM arcu.vwARCUProcessDates AS d
    WHERE d.FullDate = d.MonthEndDate
        AND d.ProcessDate >= @SDATE
)
SELECT sup.ProcessDate, COUNT(DISTINCT sup.SSN)
FROM MemberSuppTable as sup
WHERE (
    sup.ProcessDate IN (SELECT * FROM valid_dates)
    AND sup.MemberSuppID IN (
        SELECT summ.MemberSuppID
        FROM MemberSummaryTable as summ
        WHERE (
            summ.ProcessDate IN (SELECT * FROM valid_dates)
            AND summ.AccountNumber IN (
                SELECT acct.AccountNumber
                FROM AccountTable as acct
                WHERE (
                    acct.ProcessDate IN (SELECT * FROM valid_dates)
                    ...
                )
            )
        )
    )
)
GROUP BY (sup.ProcessDate)
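However, with the query above, I believe that a member who meets the criteria on any one of the process dates in valid_dates will be included in every group.
Can anyone help me? (I'm new to SQL, so forgive me if I've missed something simple.)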
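The suspected problem can be reproduced on a toy dataset. The sketch below is only an illustration under assumptions: it uses Python's built-in sqlite3 instead of the real database, a plain `valid_dates` table instead of the CTE, and a hypothetical `Flagged` column standing in for the elided account criteria. Account 200 is clean in April and flagged in May, so member B should drop out of the May count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE valid_dates (ProcessDate INTEGER);
CREATE TABLE MemberSuppTable (ProcessDate INTEGER, MemberSuppID INTEGER, SSN TEXT);
CREATE TABLE MemberSummaryTable (ProcessDate INTEGER, MemberSuppID INTEGER, AccountNumber INTEGER);
-- Flagged is a hypothetical stand-in for the real exclusion criteria.
CREATE TABLE AccountTable (ProcessDate INTEGER, AccountNumber INTEGER, Flagged INTEGER);

INSERT INTO valid_dates VALUES (20160430), (20160531);
INSERT INTO MemberSuppTable VALUES
  (20160430, 1, 'A'), (20160430, 2, 'B'),
  (20160531, 1, 'A'), (20160531, 2, 'B');
INSERT INTO MemberSummaryTable VALUES
  (20160430, 1, 100), (20160430, 2, 200),
  (20160531, 1, 100), (20160531, 2, 200);
-- Account 200 is clean in April but flagged in May.
INSERT INTO AccountTable VALUES
  (20160430, 100, 0), (20160430, 200, 0),
  (20160531, 100, 0), (20160531, 200, 1);
""")

# Same shape as the query above: each level filters on "any valid date".
LEAKY_SQL = """
SELECT sup.ProcessDate, COUNT(DISTINCT sup.SSN)
FROM MemberSuppTable AS sup
WHERE sup.ProcessDate IN (SELECT ProcessDate FROM valid_dates)
  AND sup.MemberSuppID IN (
    SELECT summ.MemberSuppID
    FROM MemberSummaryTable AS summ
    WHERE summ.ProcessDate IN (SELECT ProcessDate FROM valid_dates)
      AND summ.AccountNumber IN (
        SELECT acct.AccountNumber
        FROM AccountTable AS acct
        WHERE acct.ProcessDate IN (SELECT ProcessDate FROM valid_dates)
          AND acct.Flagged = 0))
GROUP BY sup.ProcessDate
ORDER BY sup.ProcessDate
"""

# Each inner level is tied to the process date of the row one level up.
CORRELATED_SQL = """
SELECT sup.ProcessDate, COUNT(DISTINCT sup.SSN)
FROM MemberSuppTable AS sup
WHERE sup.ProcessDate IN (SELECT ProcessDate FROM valid_dates)
  AND sup.MemberSuppID IN (
    SELECT summ.MemberSuppID
    FROM MemberSummaryTable AS summ
    WHERE summ.ProcessDate = sup.ProcessDate
      AND summ.AccountNumber IN (
        SELECT acct.AccountNumber
        FROM AccountTable AS acct
        WHERE acct.ProcessDate = summ.ProcessDate
          AND acct.Flagged = 0))
GROUP BY sup.ProcessDate
ORDER BY sup.ProcessDate
"""

leaky = conn.execute(LEAKY_SQL).fetchall()
correlated = conn.execute(CORRELATED_SQL).fetchall()
print(leaky)       # [(20160430, 2), (20160531, 2)]  member B leaks into May
print(correlated)  # [(20160430, 2), (20160531, 1)]
```

On this data the uncorrelated version counts member B in May because account 200 qualified in April, which matches the concern described above.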
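Answer 0 (score: 1)
IN clauses are perfectly fine for a query like this. They are arguably more readable than joins, because they show clearly which table you select data from and which tables you only access to check whether a record exists. The structure is good and shows that you have put some thought into the query.
However, the query would be more readable without the unnecessary aliases and parentheses.
In any case, I guess you want each subquery to use the same process date as the row it is checked against, so extend your IN clauses accordingly: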
select processdate, count(distinct ssn)
from membersupptable
where (processdate, membersuppid) in
(
    select processdate, membersuppid
    from membersummarytable
    where (processdate, accountnumber) in
    (
        select processdate, accountnumber
        from accounttable
        where processdate in
        (
            select processdate
            from vwarcuprocessdates
            where fulldate = monthenddate
            and processdate >= @sdate
        )
    )
)
group by processdate;
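Not every engine accepts a multi-column IN such as `(processdate, membersuppid) in (...)`; as far as I know SQL Server, which the `@` variables in the question suggest, is one that does not. A correlated EXISTS expresses the same check. The sketch below is an illustration only: it runs on Python's built-in sqlite3, `ProcessDates` is a toy table standing in for the process-date view, and the `Flagged` column is a hypothetical exclusion criterion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ProcessDates (ProcessDate INTEGER, FullDate TEXT, MonthEndDate TEXT);
CREATE TABLE MemberSuppTable (ProcessDate INTEGER, MemberSuppID INTEGER, SSN TEXT);
CREATE TABLE MemberSummaryTable (ProcessDate INTEGER, MemberSuppID INTEGER, AccountNumber INTEGER);
-- Flagged is a hypothetical stand-in for the real exclusion criteria.
CREATE TABLE AccountTable (ProcessDate INTEGER, AccountNumber INTEGER, Flagged INTEGER);

-- 20160515 is not a month end, so it must not show up in the result.
INSERT INTO ProcessDates VALUES
  (20160430, '2016-04-30', '2016-04-30'),
  (20160515, '2016-05-15', '2016-05-31'),
  (20160531, '2016-05-31', '2016-05-31');
""")

for d in (20160430, 20160515, 20160531):
    conn.executemany("INSERT INTO MemberSuppTable VALUES (?, ?, ?)",
                     [(d, 1, "A"), (d, 2, "B")])
    # Member 1 owns two accounts, member 2 owns one.
    conn.executemany("INSERT INTO MemberSummaryTable VALUES (?, ?, ?)",
                     [(d, 1, 100), (d, 1, 101), (d, 2, 200)])
    # Account 200 becomes flagged at the end of May.
    flagged_200 = 1 if d == 20160531 else 0
    conn.executemany("INSERT INTO AccountTable VALUES (?, ?, ?)",
                     [(d, 100, 0), (d, 101, 0), (d, 200, flagged_200)])

EXISTS_SQL = """
SELECT sup.ProcessDate, COUNT(DISTINCT sup.SSN)
FROM MemberSuppTable AS sup
WHERE EXISTS (
    SELECT 1
    FROM MemberSummaryTable AS summ
    WHERE summ.ProcessDate = sup.ProcessDate
      AND summ.MemberSuppID = sup.MemberSuppID
      AND EXISTS (
        SELECT 1
        FROM AccountTable AS acct
        WHERE acct.ProcessDate = summ.ProcessDate
          AND acct.AccountNumber = summ.AccountNumber
          AND acct.Flagged = 0
          AND acct.ProcessDate IN (
            SELECT d.ProcessDate
            FROM ProcessDates AS d
            WHERE d.FullDate = d.MonthEndDate
              AND d.ProcessDate >= ?)))
GROUP BY sup.ProcessDate
ORDER BY sup.ProcessDate
"""

counts = conn.execute(EXISTS_SQL, (20160401,)).fetchall()
print(counts)  # [(20160430, 2), (20160531, 1)]
```

Because EXISTS only tests for a matching row, member 1's second account does not multiply any rows, and the mid-month date is filtered out by the month-end condition.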
Answer 1 (score: 1)
First, I would rewrite your first query using INNER JOIN instead of WHERE .. IN:
SELECT COUNT(DISTINCT sup.SSN)
FROM MemberSuppTable as sup
INNER JOIN MemberSummaryTable AS summ
    ON summ.MemberSuppID = sup.MemberSuppID
INNER JOIN AccountTable AS acct
    ON acct.AccountNumber = summ.AccountNumber
WHERE sup.ProcessDate = @PROCESSDATE
    AND summ.ProcessDate = @PROCESSDATE
    AND acct.ProcessDate = @PROCESSDATE
    -- other criteria for account exclusion go here.
This looks more compact and (IMHO) more readable.
Now I would change the query so that @PROCESSDATE appears only once:
SELECT COUNT(DISTINCT sup.SSN)
FROM MemberSuppTable as sup
INNER JOIN MemberSummaryTable AS summ
    ON summ.MemberSuppID = sup.MemberSuppID
INNER JOIN AccountTable AS acct
    ON acct.AccountNumber = summ.AccountNumber
WHERE sup.ProcessDate = @PROCESSDATE
    AND summ.ProcessDate = sup.ProcessDate
    AND acct.ProcessDate = sup.ProcessDate
    -- other criteria for account exclusion go here.
You can leave the conditions in the WHERE clause, but I prefer them in the ON clause:
SELECT COUNT(DISTINCT sup.SSN)
FROM MemberSuppTable AS sup
INNER JOIN MemberSummaryTable AS summ
    ON summ.MemberSuppID = sup.MemberSuppID
    AND summ.ProcessDate = sup.ProcessDate
INNER JOIN AccountTable AS acct
    ON acct.AccountNumber = summ.AccountNumber
    AND acct.ProcessDate = sup.ProcessDate
WHERE sup.ProcessDate = @PROCESSDATE
    -- other criteria for account exclusion go here.
Now it is easy to get the COUNT for every ProcessDate:
SELECT sup.ProcessDate, COUNT(DISTINCT sup.SSN)
FROM MemberSuppTable as sup
INNER JOIN MemberSummaryTable AS summ
    ON summ.MemberSuppID = sup.MemberSuppID
    AND summ.ProcessDate = sup.ProcessDate
INNER JOIN AccountTable AS acct
    ON acct.AccountNumber = summ.AccountNumber
    AND acct.ProcessDate = sup.ProcessDate
-- WHERE criteria for account exclusion go here.
GROUP BY sup.ProcessDate
To also filter on the "valid_dates", all it takes is one additional JOIN and a couple of WHERE conditions:
SELECT sup.ProcessDate, COUNT(DISTINCT sup.SSN)
FROM MemberSuppTable as sup
INNER JOIN MemberSummaryTable AS summ
    ON summ.MemberSuppID = sup.MemberSuppID
    AND summ.ProcessDate = sup.ProcessDate
INNER JOIN AccountTable AS acct
    ON acct.AccountNumber = summ.AccountNumber
    AND acct.ProcessDate = sup.ProcessDate
INNER JOIN arcu.vwARCUProcessDates AS d
    ON d.ProcessDate = sup.ProcessDate
WHERE d.FullDate = d.MonthEndDate
    AND d.ProcessDate >= @SDATE
    -- AND criteria for account exclusion go here.
GROUP BY sup.ProcessDate
For better performance, it might be preferable to GROUP BY d.ProcessDate, but don't forget to adjust the SELECT part accordingly.
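Edit
As noted in the comments, the DISTINCT keyword is required if each SSN should be counted only once, so I have edited the solution.
It must also be noted that, even with DISTINCT, the first query is not fully equivalent to the original one: the original counts every qualifying MemberSuppTable row, whereas COUNT(DISTINCT sup.SSN) counts each SSN once. If sup.SSN is not unique, the two queries may return different results.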
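The difference between the three counts can be seen on a small made-up dataset. This is only a sketch, run on Python's built-in sqlite3 rather than the real database: member 1 has two accounts, and members 1 and 3 share SSN 'A' (an assumed case of a non-unique SSN). The account exclusion criteria are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MemberSuppTable (ProcessDate INTEGER, MemberSuppID INTEGER, SSN TEXT);
CREATE TABLE MemberSummaryTable (ProcessDate INTEGER, MemberSuppID INTEGER, AccountNumber INTEGER);
CREATE TABLE AccountTable (ProcessDate INTEGER, AccountNumber INTEGER);

-- Members 1 and 3 share SSN 'A'; member 1 owns two accounts.
INSERT INTO MemberSuppTable VALUES
  (20160430, 1, 'A'), (20160430, 2, 'B'), (20160430, 3, 'A');
INSERT INTO MemberSummaryTable VALUES
  (20160430, 1, 100), (20160430, 1, 101), (20160430, 2, 200), (20160430, 3, 300);
INSERT INTO AccountTable VALUES
  (20160430, 100), (20160430, 101), (20160430, 200), (20160430, 300);
""")

# Original shape: nested IN, counts one per qualifying MemberSuppTable row.
IN_SQL = """
SELECT COUNT(sup.SSN)
FROM MemberSuppTable AS sup
WHERE sup.ProcessDate = :d
  AND sup.MemberSuppID IN (
    SELECT summ.MemberSuppID
    FROM MemberSummaryTable AS summ
    WHERE summ.ProcessDate = :d
      AND summ.AccountNumber IN (
        SELECT acct.AccountNumber
        FROM AccountTable AS acct
        WHERE acct.ProcessDate = :d))
"""

# Join shape; {count_expr} is filled in with or without DISTINCT below.
JOIN_SQL = """
SELECT COUNT({count_expr})
FROM MemberSuppTable AS sup
INNER JOIN MemberSummaryTable AS summ
    ON summ.MemberSuppID = sup.MemberSuppID
    AND summ.ProcessDate = sup.ProcessDate
INNER JOIN AccountTable AS acct
    ON acct.AccountNumber = summ.AccountNumber
    AND acct.ProcessDate = sup.ProcessDate
WHERE sup.ProcessDate = :d
"""

params = {"d": 20160430}
in_count = conn.execute(IN_SQL, params).fetchone()[0]
join_count = conn.execute(
    JOIN_SQL.format(count_expr="sup.SSN"), params).fetchone()[0]
join_distinct_count = conn.execute(
    JOIN_SQL.format(count_expr="DISTINCT sup.SSN"), params).fetchone()[0]

print(in_count)             # 3: one per member row
print(join_count)           # 4: member 1 appears once per account
print(join_distinct_count)  # 2: each SSN once
```

Which of the three numbers is the right one depends on whether the trend should count member records or distinct SSNs.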