What is the best way to get rid of an unnecessary SQL subselect?

Time: 2009-08-05 01:42:05

Tags: tsql sql-server-2008

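I have a table named Registrations with the following fields:

  • Id
  • DateStarted (not null)
  • DateCompleted (nullable)

I have a bar chart that shows the number of registrations started and completed by date. My query looks like this: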

;
WITH Initial(DateStarted, StartCount)
as (
    select Datestarted, COUNT(*)
    FROM Registrations
    GROUP BY DateStarted    
)
select I.DateStarted, I.StartCount, COUNT(DISTINCT R.RegistrationId) as CompleteCount
    from Initial I
        inner join Registrations R
            ON (I.DateStarted = R.DateCompleted)
    GROUP BY I.DateStarted, I.StartCount

It returns a table that looks like this:

DateStarted  StartCount  CompleteCount
2009-08-01   1033        903
2009-08-02   540         498

The query just has a code smell to it. What would be a better way to do this?

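4 answers:

Answer 0: (score: 1)

Edit: a few notes on why the query below works. If you want the counts to be zero rather than null, you can wrap the counts in the final select statement in coalesce(). The query also includes dates on which registrations were completed (or ended, in the example below) even if no registrations were started on that date.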


I am assuming (roughly) the following table structure.

create table temp
(
    id int,
    start_date datetime,
    end_date datetime
)

insert into temp values (1, '8/1/2009', '8/1/2009')
insert into temp values (2, '8/1/2009', '8/2/2009')
insert into temp values (3, '8/1/2009', null)
insert into temp values (4, '8/2/2009', '8/2/2009')
insert into temp values (5, '8/2/2009', '8/3/2009')
insert into temp values (6, '8/2/2009', '8/4/2009')
insert into temp values (7, '8/4/2009', null)

Then you can do the following to get what you want.

with start_helper as
(
    select start_date, count(*) as count from temp group by start_date
),

end_helper as
(
    select end_date, count(*) as count from temp group by end_date
)

select coalesce(a.start_date, b.end_date) as date, a.count as start_count, b.count as end_count
from start_helper a full outer join end_helper b on a.start_date = b.end_date
where coalesce(a.start_date, b.end_date) is not null

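I think the full outer join is necessary because a record that ended today may have started yesterday, while no new record may have started today. Without the full outer join, that day would drop out of your results.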
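As a rough sketch of the coalesce() variant this answer mentions, the batch below reuses the same seven sample rows. The table variable @temp, the column aliases start_count and end_count, the ISO date literals, and the ORDER BY are assumptions added for illustration and are not part of the original answer.

```sql
-- Same sample rows as in the answer, held in a table variable so the batch is self-contained.
DECLARE @temp TABLE (id int, start_date datetime, end_date datetime);

INSERT INTO @temp VALUES
    (1, '20090801', '20090801'),
    (2, '20090801', '20090802'),
    (3, '20090801', NULL),
    (4, '20090802', '20090802'),
    (5, '20090802', '20090803'),
    (6, '20090802', '20090804'),
    (7, '20090804', NULL);

WITH start_helper AS
(
    SELECT start_date, COUNT(*) AS start_count
    FROM @temp
    GROUP BY start_date
),
end_helper AS
(
    SELECT end_date, COUNT(*) AS end_count
    FROM @temp
    WHERE end_date IS NOT NULL      -- drop unfinished rows before grouping
    GROUP BY end_date
)
SELECT COALESCE(a.start_date, b.end_date) AS [date]
    , COALESCE(a.start_count, 0) AS start_count   -- 0 instead of NULL on days with no starts
    , COALESCE(b.end_count, 0)   AS end_count     -- 0 instead of NULL on days with no completions
FROM start_helper a
FULL OUTER JOIN end_helper b ON a.start_date = b.end_date
ORDER BY [date];
```

For these rows the start/end counts should come out as 3/1, 3/2, 0/1 and 1/1 for August 1 through August 4. The August 3 row exists only because of the full outer join.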

Answer 1: (score: 1)

Offhand, I think this does it:

SELECT
    DateStarted
    , COUNT(*) as StartCount
    , SUM(CASE 
        WHEN DateCompleted = DateStarted THEN 1
        ELSE 0 END
        ) as CompleteCount

FROM Registrations

GROUP BY DateStarted

OK, apparently I had the requirements wrong earlier. Given that the CompleteCounts are independent of the StartDate, this is how I would do it:

;WITH StartDays AS
(
    SELECT DateStarted
    , COUNT(*) AS StartCount
    FROM Registrations
    GROUP BY DateStarted
)
, CompleteDays AS
(
    SELECT DateCompleted
    , COUNT(*) AS CompleteCount
    FROM Registrations
    WHERE DateCompleted IS NOT NULL -- skip registrations not yet completed
    GROUP BY DateCompleted
)
SELECT
    COALESCE(DateStarted, DateCompleted) AS [Date] -- DateStarted is NULL on completion-only days
    , COALESCE(StartCount, 0) AS StartCount
    , COALESCE(CompleteCount, 0) AS CompleteCount

FROM StartDays
FULL OUTER JOIN CompleteDays ON DateStarted = DateCompleted

This is actually very close to what you had.
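If the goal is to drop both the CTEs and the join, one possible alternative (not taken from any of the answers here) is to stack start and completion events with UNION ALL and aggregate once. This is only a sketch against the Registrations table described in the question. The derived-table alias d and the column names [Date], Started and Completed are made up for illustration.

```sql
SELECT d.[Date]
    , SUM(d.Started)   AS StartCount
    , SUM(d.Completed) AS CompleteCount
FROM
(
    -- one row per registration start
    SELECT DateStarted AS [Date], 1 AS Started, 0 AS Completed
    FROM Registrations
    UNION ALL
    -- one row per registration completion; unfinished registrations are skipped
    SELECT DateCompleted, 0, 1
    FROM Registrations
    WHERE DateCompleted IS NOT NULL
) AS d
GROUP BY d.[Date]
ORDER BY d.[Date];
```

Like the full outer join version, this keeps days that only have starts or only have completions, and it returns 0 rather than NULL without needing COALESCE.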

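Answer 2: (score: 0)

I don't see a problem. I see a common table expression being used.


You didn't provide DDL for the tables, so I won't try to reproduce this. However, I think you could substitute the SELECT directly wherever Initial is used.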
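One reading of that suggestion is to inline the CTE body as a derived table. The sketch below assumes the Registrations table and the RegistrationId column used in the question's query. It should behave the same as the original, so it changes the shape of the query rather than its result.

```sql
SELECT I.DateStarted
    , I.StartCount
    , COUNT(DISTINCT R.RegistrationId) AS CompleteCount
FROM
(
    -- body of the Initial CTE, moved inline
    SELECT DateStarted, COUNT(*) AS StartCount
    FROM Registrations
    GROUP BY DateStarted
) AS I
INNER JOIN Registrations R
    ON I.DateStarted = R.DateCompleted
GROUP BY I.DateStarted, I.StartCount;
```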

Answer 3: (score: 0)

I believe the following is functionally equivalent to yours:

select DS.DateStarted
  , count(distinct DS.RegistrationId) as StartCount
  , count(distinct DC.RegistrationId) as CompleteCount
from Registrations DS
inner join Registrations DC on DS.DateStarted = DC.DateCompleted
group by Ds.DateStarted

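I'm a little confused by the name of the DateStarted column in the results. It looks like it is simply a date on which some things started and some things ended. The counts are the number of registrations started and completed on that day.

The inner join discards any date that has 0 starts or 0 completions. To get them all: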

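select coalesce(DS.DateStarted, DC.DateCompleted) as "Date"
  , count(distinct DS.RegistrationId) as StartCount
  , count(distinct DC.RegistrationId) as CompleteCount
from Registrations DS
full outer join Registrations DC on DS.DateStarted = DC.DateCompleted
-- unfinished registrations never match the join and would otherwise form a NULL-date row
where coalesce(DS.DateStarted, DC.DateCompleted) is not null
group by DS.DateStarted, DC.DateCompleted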

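If you want to include dates that are neither a DateStarted nor a DateCompleted, with counts of 0 and 0, then you need a source of dates. In that case I think two correlated subqueries in the select clause are clearer than joining and counting distinct: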

select DateSource."Date"
    , (select count(*)
        from Registrations
        where DateStarted = DateSource."Date") as StartCount
    , (select count(*)
        from Registrations
        where DateCompleted = DateSource."Date") as CompleteCount
from DateSource -- implementation of date source left as exercise
where DateSource."Date" between @LowDate and @HighDate
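The answer leaves the date source open. One way it could be filled in, purely as an assumption, is a recursive CTE that generates one row per day between the two bounds. The sample values for @LowDate and @HighDate are hypothetical, and the sketch assumes DateStarted and DateCompleted hold date-only values with no time portion. A permanent calendar table would be another reasonable choice.

```sql
-- Hypothetical bounds; substitute the range the chart needs.
DECLARE @LowDate  date = '20090801';
DECLARE @HighDate date = '20090831';

WITH DateSource([Date]) AS
(
    SELECT @LowDate                     -- anchor: first day of the range
    UNION ALL
    SELECT DATEADD(day, 1, [Date])      -- recursive step: next day
    FROM DateSource
    WHERE [Date] < @HighDate
)
SELECT ds.[Date]
    , (SELECT COUNT(*)
        FROM Registrations r
        WHERE r.DateStarted = ds.[Date]) AS StartCount
    , (SELECT COUNT(*)
        FROM Registrations r
        WHERE r.DateCompleted = ds.[Date]) AS CompleteCount
FROM DateSource ds
ORDER BY ds.[Date]
OPTION (MAXRECURSION 0);                -- lift the default limit of 100 recursion levels
```

Because the CTE already stops at @HighDate, the separate BETWEEN filter is not needed here. Without the MAXRECURSION hint, ranges longer than about 100 days would fail with a recursion-limit error.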