根据两列的组合选择count Distinct

时间:2018-05-08 12:04:34

标签: sql sql-server-2014

我有以下查询,它从各个表中获取不同USERID的计数,并将它们总计为总计。我希望结果总共 35 ,但是此查询导致我只得 30 。它似乎正在做的是当它在任何表中的多个行中找到相同的USERID时,它只计算一次它(根据它是如何在一个表中出现多次,这是很好的USERID'结构化的)。

我希望基于USERID和EXAM_DT的组合获得不同的值,因为这种组合将满足我所需的唯一性。

SQL:

SELECT 'TOTAL', '', COUNT (DISTINCT G.USERID) + COUNT (DISTINCT H.USERID) + 
COUNT (DISTINCT J.USERID) + COUNT (DISTINCT M.USERID) + COUNT (DISTINCT 
P.USERID) + COUNT(DISTINCT S.USERID) + COUNT (DISTINCT V.USERID) + COUNT ( 
DISTINCT Y.USERID) 

FROM PS_JOB F INNER JOIN PS_EMPLMT_SRCH_QRY F1 ON (F.USERID = 
F1.USERID AND F.EMPL_RCD = F1.EMPL_RCD ) 
LEFT OUTER JOIN  PS_GHS_HS_ANN_EXAM G ON  F.USERID = G.USERID AND G.EMPL_RCD 
 = F.EMPL_RCD  
LEFT OUTER JOIN  PS_GHS_HS_ANTINEO H ON  F.USERID = H.USERID AND H.EMPL_RCD 
 = F.EMPL_RCD  
LEFT OUTER JOIN  PS_GHS_HS_AUDIO J ON  F.USERID = J.USERID AND J.EMPL_RCD = 
 F.EMPL_RCD  
LEFT OUTER JOIN  PS_GHS_HS_DOT M ON  F.USERID = M.USERID AND M.EMPL_RCD = 
 F.EMPL_RCD  
LEFT OUTER JOIN  PS_GHS_HS_HAZMAT P ON  F.USERID = P.USERID AND P.EMPL_RCD = 
 F.EMPL_RCD  
LEFT OUTER JOIN  PS_GHS_HS_PREPLACE S ON  F.USERID = S.USERID AND S.EMPL_RCD 
 = F.EMPL_RCD  
LEFT OUTER JOIN  PS_GH_RESP_FIT V ON  F.USERID = V.USERID AND V.EMPL_RCD = 
 F.EMPL_RCD  
LEFT OUTER JOIN  PS_GHS_HS_ASBESTOS Y ON  F.USERID = Y.USERID AND Y.USERID = 
 F.USERID

WHERE ( ( F.EFFDT = 
    (SELECT MAX(F_ED.EFFDT) FROM PS_JOB F_ED 
    WHERE F.USERID = F_ED.USERID 
      AND F.EMPL_RCD = F_ED.EMPL_RCD 
      AND F_ED.EFFDT <= SUBSTRING(CONVERT(CHAR,GETDATE(),121), 1, 10)) 
AND F.EFFSEQ = 
    (SELECT MAX(F_ES.EFFSEQ) FROM PS_JOB F_ES 
    WHERE F.USERID = F_ES.USERID 
      AND F.EMPL_RCD = F_ES.EMPL_RCD 
      AND F.EFFDT = F_ES.EFFDT) ))

我的结果:

 (No column name)      (No column name)      (No column name)
  TOTAL                                       30

以下是查询中包含USERID 816455 两次的一个表的示例,但仅计算(在上面的查询中)一个不同的事件(当我需要distinct时)基于USERID和EXAM_DT的组合)

 USERID       USER_RCD       EXAM_DT       EXAM_TYPE_CD       EXPIRE_DT
 001          0              2018-04-17    ANN                2019-04-17
 03           0              2018-04-03    ANN                2019-04-27
 816455       0              2018-03-02    ANN                2018-03-31
 816455       0              2018-03-26    ANN                2018-06-30
 410908       0              2018-03-05    ANN                2019-05-30

我希望尽可能避免使用子查询对连接进行聚合,因为我需要将sql添加到不支持该用法的工具中。任何帮助表示赞赏!

修改

正如LukStorms建议我尝试过#34;方法1&#34;从他的回答如下:

 SELECT count (distinct concat(G.USERID, G.EXAM_DT)) 
 + count (distinct concat(H.USERID, H.EXAM_DT)) + count (distinct 
concat(J.USERID, J.EXAM_DT)) + count (distinct concat(M.USERID, M.EXAM_DT))
 + count (distinct concat(P.USERID, P.EXAM_DT)) + count (distinct 
concat(S.USERID, S.EXAM_DT)) + count (distinct concat(V.USERID, V.EXAM_DT)) 
 + count (distinct concat(Y.USERID, Y.EXAM_DT))       AS 'Total_Unique'
 FROM PS_JOB F

 LEFT OUTER JOIN  PS_GHS_HS_ANN_EXAM H ON  F.USERID = H.USERID AND 
  H.EMPL_RCD = F.EMPL_RCD
 LEFT OUTER JOIN  PS_GHS_HS_ANTINEO G ON  F.USERID = G.USERID AND G.EMPL_RCD 
  = F.EMPL_RCD
 LEFT OUTER JOIN  PS_GHS_HS_AUDIO J ON  F.USERID = J.USERID AND J.EMPL_RCD = 
  F.EMPL_RCD 
 LEFT OUTER JOIN  PS_GHS_HS_DOT M ON  F.USERID = M.USERID AND M.EMPL_RCD = 
  F.EMPL_RCD 
 LEFT OUTER JOIN  PS_GHS_HS_HAZMAT P ON  F.USERID = P.USERID AND P.EMPL_RCD 
  = F.EMPL_RCD  
 LEFT OUTER JOIN  PS_GHS_HS_PREPLACE S ON  F.USERID = S.USERID AND S 
  .EMPL_RCD = F.EMPL_RCD  
 LEFT OUTER JOIN  PS_GH_RESP_FIT V ON  F.USERID = V.USERID AND V.EMPL_RCD = 
  F.EMPL_RCD  
 LEFT OUTER JOIN  PS_GHS_HS_ASBESTOS Y ON  F.USERID = Y.USERID  

WHERE ( ( F.EFFDT = 
    (SELECT MAX(F_ED.EFFDT) FROM PS_JOB F_ED 
    WHERE F.USERID = F_ED.USERID 
      AND F.EMPL_RCD = F_ED.EMPL_RCD 
      AND F_ED.EFFDT <= SUBSTRING(CONVERT(CHAR,GETDATE(),121), 1, 10)) 
AND F.EFFSEQ = 
    (SELECT MAX(F_ES.EFFSEQ) FROM PS_JOB F_ES 
    WHERE F.USERID = F_ES.USERID 
      AND F.EMPL_RCD = F_ES.EMPL_RCD 
      AND F.EFFDT = F_ES.EFFDT) ))

从上面的查询中我得到的总数是42,而不是30.我查看了没有COUNT聚合的数据,它似乎检索了表中的空白行以及连接数据。

1 个答案:

答案 0 :(得分:2)

所以你想根据USERID和EXAM_DT的组合计算不同的数量?

但是count(distinct ...)只允许一个字段。

然后结合2个字段 你可以使用concat。

或替代方案。在日期上对em进行分组,然后对总数进行求和。

简化示例代码段:

declare @T table (id int identity(1,1) primary key, userid int, exam_dt datetime);

insert into @T (userid, exam_dt) values
(100, GETDATE()),(200, GETDATE()),(100, GETDATE()-1),(200, GETDATE()+0.001),(NULL,NULL);

select * from @T;

-- Method 1.1
select count(distinct concat(userid,'_',cast(exam_dt as date))) as total_unique from @T where userid is not null;

-- Method 1.2 : Adjustment because of the left joins. When there's no match then the values of the joined table would appear as NULL
select count(distinct nullif(concat(userid,'_',cast(exam_dt as date)),'_')) as total_unique from @T;

-- Method 2
select sum(total) as total_unique 
from(
    select count(distinct t.userid) as total 
    from @T t
    group by cast(t.exam_dt as date)
) q;

返回3.

因为用户标识100有2条具有不同日期的记录,因此计为2 虽然用户标识200有2条具有相同日期的记录,但因此计为1。

带连接的简化示例代码段:

declare @T  table (id int identity(1,1) primary key, userid int, empl_rcd int default 0, exam_dt date);
declare @F1 table (id int identity(1,1) primary key, userid int, empl_rcd int default 0, exam_dt date);
declare @F2 table (id int identity(1,1) primary key, userid int, empl_rcd int default 0, exam_dt date);
insert into @T  (userid, exam_dt) values (100, GETDATE()),(200, GETDATE()),(100, GETDATE()-1),(200, GETDATE()),(300, GETDATE());
insert into @F1 (userid, exam_dt) values (100, GETDATE()),(200, GETDATE()),(200, GETDATE()+1);
insert into @F2 (userid, exam_dt) values (100, GETDATE()),(300, GETDATE()+1),(300, GETDATE()+2);

select (total0 + total1 + total2) as total, q.*
from (
    select 
    count(distinct nullif(concat(t0.userid,'_',t0.exam_dt),'_')) as total0,
    count(distinct nullif(concat(f1.userid,'_',f1.exam_dt),'_')) as total1,
    count(distinct nullif(concat(f2.userid,'_',f2.exam_dt),'_')) as total2
    from @T t0
    left join @F1 f1 on (f1.userid = t0.userid and f1.empl_rcd = t0.empl_rcd)
    left join @F2 f2 on (f2.userid = t0.userid and f2.empl_rcd = t0.empl_rcd)
) q;