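I have the following query, in which I get counts of USRID from tables joined with LEFT OUTER JOIN. The count from the PS_HS_AUD table is off by 1 record, and the count from the PS_HS_PRE table is also off by 1 (so the total is off by 2).
I think the counts are off because the USRID exists both in PS_HS_AUD and in another table named PS_HS_ANN, and that USRID has 2 rows in PS_HS_ANN (each with a unique exam date). In the query below I added criteria to select the MAX EXAM_DT, hoping that would give the correct totals, but I get the same (incorrect) results as before I added the MAX exam date criteria to the WHERE clause.
Current SQL: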
SELECT 'ZTOTAL', '', COUNT(G.USRID), COUNT(H.USRID), COUNT(J.USRID),
       COUNT(M.USRID), COUNT(P.USRID), COUNT(S.USRID), COUNT(V.USRID),
       COUNT(Y.USRID)
FROM PS_JOB F
LEFT OUTER JOIN PS_HS_ANN G ON F.USRID = G.USRID AND G.EMPL_RCD = F.EMPL_RCD
LEFT OUTER JOIN PS_HS_ANT H ON F.USRID = H.USRID AND H.EMPL_RCD = F.EMPL_RCD
LEFT OUTER JOIN PS_HS_AUD J ON F.USRID = J.USRID AND J.EMPL_RCD = F.EMPL_RCD
LEFT OUTER JOIN PS_HS_DOT M ON F.USRID = M.USRID AND M.EMPL_RCD = F.EMPL_RCD
LEFT OUTER JOIN PS_HS_HAZ P ON F.USRID = P.USRID AND P.EMPL_RCD = F.EMPL_RCD
LEFT OUTER JOIN PS_HS_PRE S ON F.USRID = S.USRID AND S.EMPL_RCD = F.EMPL_RCD
LEFT OUTER JOIN PS_HS_RES V ON F.USRID = V.USRID AND V.EMPL_RCD = F.EMPL_RCD
LEFT OUTER JOIN PS_HS_ASB Y ON F.USRID = Y.USRID AND Y.EMPL_RCD = F.EMPL_RCD
WHERE ( ( F.EFFDT =
            (SELECT MAX(F_ED.EFFDT) FROM PS_JOB F_ED
             WHERE F.USRID = F_ED.USRID
               AND F.EMPL_RCD = F_ED.EMPL_RCD
               AND F_ED.EFFDT <= SUBSTRING(CONVERT(CHAR,GETDATE(),121), 1, 10))
          AND F.EFFSEQ =
            (SELECT MAX(F_ES.EFFSEQ) FROM PS_JOB F_ES
             WHERE F.USRID = F_ES.USRID
               AND F.EMPL_RCD = F_ES.EMPL_RCD
               AND F.EFFDT = F_ES.EFFDT) )
    AND (G.EXAM_DT = (SELECT MAX(GG.EXAM_DT) FROM PS_HS_ANN GG
                      WHERE GG.USRID = G.USRID
                        AND GG.EMPL_RCD = G.EMPL_RCD
                        AND GG.EXAM_DT = G.EXAM_DT)
      OR H.EXAM_DT = (SELECT MAX(HH.EXAM_DT) FROM PS_HS_ANT HH
                      WHERE HH.USRID = H.USRID
                        AND HH.EMPL_RCD = H.EMPL_RCD
                        AND HH.EXAM_DT = H.EXAM_DT)
      OR J.EXAM_DT = (SELECT MAX(JJ.EXAM_DT) FROM PS_HS_AUD JJ
                      WHERE JJ.USRID = J.USRID
                        AND JJ.EMPL_RCD = J.EMPL_RCD
                        AND JJ.EXAM_DT = J.EXAM_DT)
      OR M.EXAM_DT = (SELECT MAX(MM.EXAM_DT) FROM PS_GHS_HS_DOT MM
                      WHERE MM.USRID = M.USRID
                        AND MM.EMPL_RCD = M.EMPL_RCD
                        AND MM.EXAM_DT = M.EXAM_DT)
      OR P.EXAM_DT = (SELECT MAX(PP.EXAM_DT) FROM PS_GHS_HS_HAZMAT PP
                      WHERE PP.USRID = P.USRID
                        AND PP.EMPL_RCD = P.EMPL_RCD
                        AND PP.EXAM_DT = P.EXAM_DT)
      OR S.EXAM_DT = (SELECT MAX(SS.EXAM_DT) FROM PS_HS_PRE SS
                      WHERE SS.USRID = S.USRID
                        AND SS.EMPL_RCD = S.EMPL_RCD
                        AND SS.EXAM_DT = S.EXAM_DT)
      OR V.EXAM_DT = (SELECT MAX(VV.EXAM_DT) FROM PS_GH_RESP_FIT VV
                      WHERE VV.USRID = V.USRID
                        AND VV.EMPL_RCD = V.EMPL_RCD
                        AND VV.EXAM_DT = V.EXAM_DT)
      OR Y.EXAM_DT = (SELECT MAX(YY.EXAM_DT) FROM PS_HS_ASB YY
                      WHERE YY.USRID = Y.USRID
                        AND YY.EMPL_RCD = Y.EMPL_RCD
                        AND YY.EXAM_DT = Y.EXAM_DT) ))
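One detail in the WHERE clause may explain why the MAX(EXAM_DT) criteria changed nothing: each subquery also requires its EXAM_DT to equal the outer row's EXAM_DT (for example GG.EXAM_DT = G.EXAM_DT), so MAX() only ever sees the outer row's own date and the comparison holds for every non-NULL row. The sketch below illustrates this in SQLite through Python's sqlite3 module. The lowercase table `ann` is a stand-in for PS_HS_ANN and the exam dates are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for PS_HS_ANN; the dates are made up.
    CREATE TABLE ann (usrid TEXT, empl_rcd INTEGER, exam_dt TEXT);
    INSERT INTO ann VALUES ('SD3925', 0, '2017-03-01'),
                           ('SD3925', 0, '2018-03-01');
""")

# Filter shaped like the one in the question: the subquery also correlates
# on exam_dt, so MAX() returns the outer row's own date and every row passes.
as_written = conn.execute("""
    SELECT COUNT(*) FROM ann g
    WHERE g.exam_dt = (SELECT MAX(gg.exam_dt) FROM ann gg
                       WHERE gg.usrid = g.usrid
                         AND gg.empl_rcd = g.empl_rcd
                         AND gg.exam_dt = g.exam_dt)
""").fetchone()[0]

# Same filter without the exam_dt correlation: only the latest row remains.
latest_only = conn.execute("""
    SELECT COUNT(*) FROM ann g
    WHERE g.exam_dt = (SELECT MAX(gg.exam_dt) FROM ann gg
                       WHERE gg.usrid = g.usrid
                         AND gg.empl_rcd = g.empl_rcd)
""").fetchone()[0]

print(as_written, latest_only)  # 2 1
```

Dropping that extra predicate is probably not sufficient on its own in the full query, because the eight conditions are combined with OR: a joined row passes as soon as any one table's condition holds, so the row carrying the older PS_HS_ANN date can still survive through another table's condition.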
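Column 5 above (J.USRID) shows a count of 5 records, even though the following query against the PS_HS_AUD J table shows that there are only 4 records (see table below):
If I query the PS_HS_ANN table, you can see that USRID SD3925 (which also has a record in PS_HS_AUD) has 2 rows in that table. I believe this is what causes the extra row to be counted for PS_HS_AUD, because if I comment out the join to PS_HS_ANN, my count correctly shows 4 records.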
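The suspected cause can be reproduced in isolation. The following is a minimal sketch in SQLite through Python's sqlite3 module, not the real PeopleSoft schema: the lowercase tables `job`, `ann`, and `aud` are simplified stand-ins, the users U2 to U4 and all dates are invented, and `job` is assumed to hold one current row per user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-ins for PS_JOB, PS_HS_ANN, PS_HS_AUD.
    CREATE TABLE job (usrid TEXT, empl_rcd INTEGER);
    CREATE TABLE ann (usrid TEXT, empl_rcd INTEGER, exam_dt TEXT);
    CREATE TABLE aud (usrid TEXT, empl_rcd INTEGER, exam_dt TEXT);

    INSERT INTO job VALUES ('SD3925', 0), ('U2', 0), ('U3', 0), ('U4', 0);
    -- One AUD row per user: 4 rows in total.
    INSERT INTO aud VALUES ('SD3925', 0, '2018-01-10'), ('U2', 0, '2018-01-11'),
                           ('U3', 0, '2018-01-12'), ('U4', 0, '2018-01-13');
    -- SD3925 has two ANN rows with different exam dates.
    INSERT INTO ann VALUES ('SD3925', 0, '2017-03-01'),
                           ('SD3925', 0, '2018-03-01');
""")

def count_aud_rows(with_ann_join: bool) -> int:
    """Count AUD matches, optionally joining ANN first as the question does."""
    ann_join = (
        "LEFT OUTER JOIN ann g ON f.usrid = g.usrid AND g.empl_rcd = f.empl_rcd"
        if with_ann_join else ""
    )
    sql = f"""
        SELECT COUNT(j.usrid)
        FROM job f
        {ann_join}
        LEFT OUTER JOIN aud j ON f.usrid = j.usrid AND j.empl_rcd = f.empl_rcd
    """
    return conn.execute(sql).fetchone()[0]

count_with_ann = count_aud_rows(with_ann_join=True)
count_without_ann = count_aud_rows(with_ann_join=False)
print(count_with_ann, count_without_ann)  # 5 4
```

The two ANN rows for SD3925 each pair with that user's single AUD row, so the AUD match is counted twice once ANN is part of the join.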
PS_HS_ANN:
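The same problem occurs with the PS_HS_PRE table (rows are duplicated because of the same USRID). What else can I use to prevent this from happening? There will definitely be cases where a USRID exists in multiple rows in each table. Thanks!
Update 4/16/18: Does anyone have any other ideas on how I can get this to work?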
Answer 0 (score: 0)
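The right way to solve the problem is to aggregate before the join, so that each joined table contributes at most one row per USRID and EMPL_RCD. Given the complexity of the query, this may be hard to implement.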
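As a rough illustration of aggregating before the join, the sketch below replaces each exam table with a derived table grouped by USRID and EMPL_RCD, so the join cannot multiply rows. It runs in SQLite through Python's sqlite3 module. The lowercase tables, the users U2 to U4, and the dates are assumptions made for the example, and `job` is assumed to be already reduced to one current row per user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-ins for PS_JOB, PS_HS_ANN, PS_HS_AUD.
    CREATE TABLE job (usrid TEXT, empl_rcd INTEGER);
    CREATE TABLE ann (usrid TEXT, empl_rcd INTEGER, exam_dt TEXT);
    CREATE TABLE aud (usrid TEXT, empl_rcd INTEGER, exam_dt TEXT);

    INSERT INTO job VALUES ('SD3925', 0), ('U2', 0), ('U3', 0), ('U4', 0);
    INSERT INTO aud VALUES ('SD3925', 0, '2018-01-10'), ('U2', 0, '2018-01-11'),
                           ('U3', 0, '2018-01-12'), ('U4', 0, '2018-01-13');
    INSERT INTO ann VALUES ('SD3925', 0, '2017-03-01'),
                           ('SD3925', 0, '2018-03-01');
""")

# Each derived table yields at most one row per (usrid, empl_rcd), carrying
# the latest exam date, so no joined row is duplicated.
sql = """
    SELECT COUNT(g.usrid), COUNT(j.usrid)
    FROM job f
    LEFT OUTER JOIN (SELECT usrid, empl_rcd, MAX(exam_dt) AS exam_dt
                     FROM ann GROUP BY usrid, empl_rcd) g
           ON f.usrid = g.usrid AND g.empl_rcd = f.empl_rcd
    LEFT OUTER JOIN (SELECT usrid, empl_rcd, MAX(exam_dt) AS exam_dt
                     FROM aud GROUP BY usrid, empl_rcd) j
           ON f.usrid = j.usrid AND j.empl_rcd = f.empl_rcd
"""
ann_count, aud_count = conn.execute(sql).fetchone()
print(ann_count, aud_count)  # 1 4
```

With this shape, the MAX(EXAM_DT) logic lives inside each derived table, so the outer WHERE clause would no longer need per-table exam date conditions.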
The quick-and-dirty way is to use COUNT(DISTINCT):
SELECT 'ZTOTAL', '',
COUNT(DISTINCT G.USRID), COUNT(DISTINCT H.USRID), COUNT(DISTINCT J.USRID),
COUNT(DISTINCT M.USRID), COUNT(DISTINCT P.USRID), COUNT(DISTINCT S.USRID), COUNT(DISTINCT V.USRID),
COUNT(DISTINCT Y.USRID)
. . .
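This is less than ideal because each COUNT(DISTINCT) incurs overhead, which pre-aggregating each table would eliminate. In addition, the intermediate result set can become very large, which also hurts performance.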
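To make the trade-off concrete, here is a small sketch in SQLite through Python's sqlite3 module. The lowercase tables are simplified stand-ins for the PeopleSoft tables, and the users U2 to U4, the dates, and the row counts are invented. It compares the size of the joined intermediate result with the plain and DISTINCT counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-ins for PS_JOB and three exam tables.
    CREATE TABLE job (usrid TEXT, empl_rcd INTEGER);
    CREATE TABLE ann (usrid TEXT, empl_rcd INTEGER, exam_dt TEXT);
    CREATE TABLE aud (usrid TEXT, empl_rcd INTEGER, exam_dt TEXT);
    CREATE TABLE pre (usrid TEXT, empl_rcd INTEGER, exam_dt TEXT);

    INSERT INTO job VALUES ('SD3925', 0), ('U2', 0), ('U3', 0), ('U4', 0);
    INSERT INTO aud VALUES ('SD3925', 0, '2018-01-10'), ('U2', 0, '2018-01-11'),
                           ('U3', 0, '2018-01-12'), ('U4', 0, '2018-01-13');
    -- SD3925 has 2 ANN rows and 3 PRE rows.
    INSERT INTO ann VALUES ('SD3925', 0, '2017-03-01'),
                           ('SD3925', 0, '2018-03-01');
    INSERT INTO pre VALUES ('SD3925', 0, '2016-05-01'),
                           ('SD3925', 0, '2017-05-01'),
                           ('SD3925', 0, '2018-05-01');
""")

sql = """
    SELECT COUNT(*), COUNT(j.usrid), COUNT(DISTINCT j.usrid)
    FROM job f
    LEFT OUTER JOIN ann g ON f.usrid = g.usrid AND g.empl_rcd = f.empl_rcd
    LEFT OUTER JOIN aud j ON f.usrid = j.usrid AND j.empl_rcd = f.empl_rcd
    LEFT OUTER JOIN pre s ON f.usrid = s.usrid AND s.empl_rcd = f.empl_rcd
"""
joined_rows, plain_count, distinct_count = conn.execute(sql).fetchone()
print(joined_rows, plain_count, distinct_count)  # 9 9 4
```

SD3925 alone expands to 2 x 1 x 3 = 6 joined rows, which is the intermediate growth described above, and COUNT(DISTINCT) has to deduplicate all of them. Note also that COUNT(DISTINCT USRID) counts users, so a user with several EMPL_RCD values would be counted once.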