SQL count with join returns duplicates

Time: 2020-04-18 17:57:25

Tags: sql sql-server

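Noob query here. I'm trying to get a unique count of parent rows where certain conditions are met in a child table, as in this example: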

select distinct wi.* 
from wf.WorkflowInstance wi 
inner join wf.WorkflowInstanceDocument wid on wid.WorkflowInstanceId = wi.Id
where wid.ReceivedDateTime between '1/1/2020' and '1/30/2020'

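This works fine, and I get unique records. What I really want is a count, though, so I wrote the following, and now each record is counted once for every "document" it has in the child table, so I get duplicates.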

select distinct count(*) 
from wf.WorkflowInstance wi 
inner join wf.WorkflowInstanceDocument wid on wid.WorkflowInstanceId = wi.Id
where wid.ReceivedDateTime between '1/1/2020' and '1/30/2020'

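I thought count(wi.*) would do it, but that syntax doesn't work. Maybe there is a different way to join/query/group this that gives me what I want. Any help would be great. Thanks!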

2 answers:

Answer 0: (score: 0)

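COUNT(DISTINCT fields) is really what you want, but you can also do this with a Common Table Expression (a temporary query result that can be used by the statement that immediately follows it).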

WITH cte AS
(
select distinct wi.* from wf.WorkflowInstance wi 
inner join wf.WorkflowInstanceDocument wid on wid.WorkflowInstanceId = wi.Id
where wid.ReceivedDateTime between '1/1/2020' and '1/30/2020'
)
SELECT COUNT(*) FROM cte

You could also do it with a derived table:

SELECT COUNT(*)
FROM (select distinct wi.* from wf.WorkflowInstance wi 
inner join wf.WorkflowInstanceDocument wid on wid.WorkflowInstanceId = wi.Id
where wid.ReceivedDateTime between '1/1/2020' and '1/30/2020') x
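
For completeness, the COUNT(DISTINCT ...) form mentioned at the top of this answer might look something like the sketch below. It assumes that wi.Id uniquely identifies a row in wf.WorkflowInstance, which the join condition suggests but the question does not state outright.

```sql
-- Sketch only: assumes wi.Id is the unique key of wf.WorkflowInstance.
-- Each workflow instance is counted once, no matter how many
-- matching documents it has in the child table.
SELECT COUNT(DISTINCT wi.Id)
FROM wf.WorkflowInstance wi
INNER JOIN wf.WorkflowInstanceDocument wid ON wid.WorkflowInstanceId = wi.Id
WHERE wid.ReceivedDateTime BETWEEN '1/1/2020' AND '1/30/2020';
```

Counting a single key column avoids having to de-duplicate every column of wi.* first.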

Answer 1: (score: 0)

Probably the most efficient method is to use exists:

select count(*)
from wf.WorkflowInstance wi 
where exists (select 1
              from wf.WorkflowInstanceDocument wid 
              where wid.WorkflowInstanceId = wi.Id and
                    wid.ReceivedDateTime >= '2020-01-01' and
                    wid.ReceivedDateTime < '2020-01-31'
             );

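The (very reasonable) assumption is that the duplicates are created by the join. Why bother generating them if you are only going to put in more effort to remove them again?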

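Also note that I made some changes to the date comparisons. First, this uses a standard format for the date constants. Second, it uses >= and < instead of between. This ensures the logic is correct even when the values have a time component.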
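
To illustrate the point about time components, here is a small sketch. The table variable @doc and its two rows are made up for the example and are not from the question's schema.

```sql
-- Hypothetical sample data, for illustration only
DECLARE @doc TABLE (ReceivedDateTime datetime);
INSERT INTO @doc VALUES ('2020-01-30T00:00:00'), ('2020-01-30T15:30:00');

-- BETWEEN stops at midnight at the start of the 30th,
-- so the 15:30 row is left out: this counts 1 row.
SELECT COUNT(*) FROM @doc
WHERE ReceivedDateTime BETWEEN '20200101' AND '20200130';

-- The half-open range covers the whole of the 30th: this counts 2 rows.
SELECT COUNT(*) FROM @doc
WHERE ReceivedDateTime >= '20200101' AND ReceivedDateTime < '20200131';
```

The upper bound is the first instant you want to exclude, so the same pattern works whether the column holds plain dates or full timestamps.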