Determining complete vs. half-year data sets in SQL

Asked: 2008-10-13 01:39:45

Tags: sql sql-server-2005 tsql

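I have a table with two fields of interest: a CHAR(3) ID and a DATETIME. The ID identifies the submitter of the data and repeats across several thousand rows. The DATETIME is not necessarily unique either. (The primary key is made up of other fields in the table.)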

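Data for this table is submitted every six months. In December we receive July-December data from each submitter, and in June we receive the full July-June data. My task is to write a script that identifies submitters who sent only half of their data in June, that is, only the January-June portion.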

...Does anyone have a solution?

3 answers:

Answer 0: (score: 1)

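From your description, I wouldn't worry about the efficiency of the query, since it apparently only needs to run twice a year!

There are a few ways to do this, and which one is "best" depends on the data you have. The date check you suggested (on the max/min date values) should work. Another option is simply to count the records submitted within each date range, e.g.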

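-- Boundary dates are example values (assumed); set them to the reporting year being checked.
declare @JulStart datetime, @JanStart datetime, @NextJulStart datetime
set @JulStart     = '20070701'
set @JanStart     = '20080101'
set @NextJulStart = '20080701'

select * from (
    -- distinct: one result row per submitter rather than one per table row
    select distinct T.submitterId,
        -- no GROUP BY in the subqueries, so count(*) yields 0 (not NULL) when nothing matches
        (select count(*)
        from [TABLE] T1
        where T1.datefield >= @JulStart and T1.datefield < @JanStart
        and T1.submitterId = T.submitterId) as JDCount,
        (select count(*)
        from [TABLE] T2
        where T2.datefield >= @JanStart and T2.datefield < @NextJulStart
        and T2.submitterId = T.submitterId) as JJCount
    from [TABLE] T) X
where X.JDCount = 0 OR X.JJCount = 0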

Caveat: untested query off the top of my head; your mileage may vary.
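
The max/min alternative mentioned above could look something like the following. This is only a sketch: the table variable `@Submissions`, its sample rows, and the boundary date `@JanStart` are made up for illustration, and it assumes the table holds a single reporting year running from July to the following June.

```sql
-- Hypothetical sample data standing in for the real table.
DECLARE @Submissions TABLE (submitterId CHAR(3), datefield DATETIME)
INSERT INTO @Submissions
SELECT 'AAA', '20070715' UNION ALL   -- AAA: rows in both halves
SELECT 'AAA', '20080320' UNION ALL
SELECT 'BBB', '20070801' UNION ALL   -- BBB: July-December only
SELECT 'BBB', '20071130' UNION ALL
SELECT 'CCC', '20080210'             -- CCC: January-June only

-- Assumed boundary between the two halves of the reporting year.
DECLARE @JanStart DATETIME
SET @JanStart = '20080101'

-- A submitter covers both halves only if its earliest row falls before
-- the boundary and its latest row falls on or after it.
SELECT submitterId,
       MIN(datefield) AS FirstDate,
       MAX(datefield) AS LastDate
FROM @Submissions
GROUP BY submitterId
HAVING MIN(datefield) >= @JanStart   -- nothing in July-December
    OR MAX(datefield) <  @JanStart   -- nothing in January-June
```

With these sample rows the query returns BBB and CCC. It reads the table once per run instead of running two correlated subqueries per row, though as noted above, efficiency hardly matters for a twice-a-year check.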

Answer 1: (score: 1)

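For interest's sake, this is what I ended up using. It is based on Stephen's answer, with a few adaptations. I don't yet have the reputation to upvote him. :)

It is part of a larger script that runs every six months, but we only run this check every 12 months, hence the "IF @FullYear = 1". I'm sure there are more elegant ways to determine the boundary dates, but this seems to work.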

IF @FullYear = 1 
    BEGIN
        DECLARE @FirstDate AS DATETIME
        DECLARE @SecondYear AS INT
        DECLARE @NewYearsDay AS DATETIME
        DECLARE @LastDate AS DATETIME

        -- Earliest and latest dates present in the submitted data
        SELECT @FirstDate = MIN(dscdate), @LastDate = MAX(dscdate)
        FROM    TheTable

        -- January 1 of the second calendar year separates Jul-Dec from Jan-Jun
        SELECT  @SecondYear = DATEPART(yyyy, @FirstDate) + 1
        SELECT  @NewYearsDay = CAST(CAST(@SecondYear AS VARCHAR) 
            + '-01-01' AS DATETIME)

        -- @AuditResults is assumed to be a table variable declared earlier in the larger script
        INSERT  INTO @AuditResults
                SELECT DISTINCT
                    'Submitter missing Jan-Jun data', t.id
                FROM    TheTable t
                WHERE   
                    -- has rows in the first half (Jul-Dec) ...
                    EXISTS ( 
                        SELECT 1
                        FROM   TheTable t1
                        WHERE  t.id = t1.id 
                            AND t1.dscdate >= @FirstDate
                            AND t1.dscdate < @NewYearsDay )
                    -- ... but none in the second half (Jan-Jun)
                    AND NOT EXISTS ( 
                        SELECT 1
                        FROM   TheTable t2
                        WHERE  t2.dscdate >= @NewYearsDay
                        AND t2.dscdate <= @LastDate
                        AND t2.id = t.id )
    END
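
As for a tidier way to get the boundary date, one common T-SQL idiom is to do the arithmetic with DATEADD and DATEDIFF rather than building a string and casting it. A minimal sketch, assuming `@FirstDate` holds the earliest date in the data (the value below is a made-up example):

```sql
DECLARE @FirstDate DATETIME, @NewYearsDay DATETIME
SET @FirstDate = '20070703'   -- hypothetical earliest date in the data

-- DATEDIFF(year, 0, @FirstDate) counts the year boundaries crossed since
-- date 0 (1900-01-01); adding one more year to date 0 lands on
-- January 1 of the year after @FirstDate.
SET @NewYearsDay = DATEADD(year, DATEDIFF(year, 0, @FirstDate) + 1, 0)

SELECT @NewYearsDay AS NewYearsDay   -- 2008-01-01 00:00:00.000
```

This avoids the intermediate INT and VARCHAR conversions and does not depend on how the server parses date strings.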

Answer 2: (score: 0)

I later realized that I should check that there is data for both July-December and January-June. So this is what I wound up with in v2:

-- Declarations assumed; presumably made earlier in the larger script
DECLARE @avgmonths FLOAT
DECLARE @months INT

-- Average number of distinct calendar months present per submitter
SELECT  @avgmonths = AVG(x.[count])
FROM    ( SELECT    CAST(COUNT(DISTINCT DATEPART(month, dscdate)) AS FLOAT) AS [count]
          FROM      TheTable
          GROUP BY  id
         ) x

-- More than 7 months on average is treated as a full-year submission
IF @avgmonths > 7 
    SET @months = 12
ELSE 
    SET @months = 6


-- Flag submitters with fewer distinct months than expected
SELECT  'Submitter missing data for some months' AS [WarningType],
        t.id
FROM    TheTable t
WHERE   EXISTS ( SELECT 1
                 FROM   TheTable t1
                 WHERE  t.id = t1.id
                 HAVING COUNT(DISTINCT DATEPART(month, t1.dscdate)) < @months )
GROUP BY t.id
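
To see the distinct-month check in isolation, here is a small self-contained sketch. The table variables `@Demo` and `@MonthOffsets`, the submitter codes, and the dates are all invented for illustration. It uses a plain GROUP BY with HAVING, which should flag the same submitters as the EXISTS form above.

```sql
-- Helper: month offsets 0..11 used to generate sample rows.
DECLARE @MonthOffsets TABLE (n INT)
INSERT INTO @MonthOffsets
SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2  UNION ALL SELECT 3 UNION ALL
SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6  UNION ALL SELECT 7 UNION ALL
SELECT 8 UNION ALL SELECT 9 UNION ALL SELECT 10 UNION ALL SELECT 11

-- Hypothetical data: AAA has one row in every month from July 2007 to
-- June 2008; BBB only has July-December 2007.
DECLARE @Demo TABLE (id CHAR(3), dscdate DATETIME)
INSERT INTO @Demo
SELECT 'AAA', DATEADD(month, n, '20070701') FROM @MonthOffsets
INSERT INTO @Demo
SELECT 'BBB', DATEADD(month, n, '20070701') FROM @MonthOffsets WHERE n < 6

DECLARE @months INT
SET @months = 12   -- full-year check

-- Submitters with fewer distinct calendar months than expected.
SELECT id,
       COUNT(DISTINCT DATEPART(month, dscdate)) AS MonthsPresent
FROM @Demo
GROUP BY id
HAVING COUNT(DISTINCT DATEPART(month, dscdate)) < @months
```

With this sample data only BBB is returned, with 6 months present. Note that counting distinct month numbers only makes sense while the table holds a single reporting year; if several years were loaded together, the same month from different years would be counted once.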