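I have a table with two fields of interest: a CHAR(3) ID and a DATETIME. The ID identifies the submitter of the data; each submitter has several thousand rows. The DATETIME is not necessarily unique either. (The primary key is made up of other fields in the table.)
Data for this table is submitted every six months. In December we receive July-December data from each submitter, and in June we receive July-June data. My task is to write a script that identifies people who have submitted only half of their data, or who submitted only January-June data in June.
... Does anyone have a solution?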
Answer 0 (score: 1)
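From your description, I wouldn't worry about the efficiency of the query, since it apparently only needs to run twice a year!
There are a few ways to do this, and which one is "best" depends on the data you have. The approach you suggest (comparing the max/min date values) should work. Another option is simply to count the records submitted in each date range, e.g.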
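    -- [july], [december] and [june] are placeholders for the boundary dates.
    -- TABLE is a placeholder for the real table name.
    select * from (
        -- distinct: one row per submitter instead of one per source row
        select distinct T.submitterId,
            (select count(*)
             from TABLE T1
             where T1.datefield between [july] and [december]
               and T1.submitterId = T.submitterId) as JDCount,
            (select count(*)
             from TABLE T2
             where T2.datefield between [december] and [june]
               and T2.submitterId = T.submitterId) as DJCount
        from TABLE T) X
    -- The subqueries have no GROUP BY: a grouped count over zero rows
    -- returns no row (NULL), so a comparison with 0 would never match.
    where X.JDCount = 0 OR X.DJCount = 0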
Caveat: this is an untested query off the top of my head; your mileage may vary.
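The max/min alternative mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the table variable `@Submissions`, its sample rows, and the `@HalfBoundary` date are hypothetical, and the syntax assumes SQL Server 2008 or later (inline `DECLARE` initialisation and multi-row `VALUES`).

```sql
-- Hypothetical sample data: 'AAA' sent the full year, 'BBB' only July-December.
DECLARE @Submissions TABLE (submitterId CHAR(3), datefield DATETIME);
INSERT INTO @Submissions (submitterId, datefield) VALUES
    ('AAA', '20080715'), ('AAA', '20081220'), ('AAA', '20090301'), ('AAA', '20090610'),
    ('BBB', '20080805'), ('BBB', '20081130');

-- Assumed boundary: the first day of the second half of the reporting year.
DECLARE @HalfBoundary DATETIME = '20090101';

-- A submitter whose latest date is before the boundary has no January-June data;
-- one whose earliest date is on or after it has no July-December data.
SELECT submitterId,
       MIN(datefield) AS FirstDate,
       MAX(datefield) AS LastDate
FROM @Submissions
GROUP BY submitterId
HAVING MAX(datefield) < @HalfBoundary
    OR MIN(datefield) >= @HalfBoundary;
```

With the sample rows above, only 'BBB' is returned. This version reads the table once, but it only detects a whole missing half; it says nothing about gaps inside a half.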
Answer 1 (score: 1)
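For interest, this is what I ended up using. It is based on Stephen's answer, with a few adaptations. I don't yet have enough reputation to upvote him. :)
It is part of a larger script that runs every six months, but we only perform this check every 12 months, hence the "If FullYear = 1". I'm sure there are more elegant ways to determine the boundary dates, but this seems to work.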
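    -- @FullYear and the table variable @AuditResults come from the larger script.
    -- dscdate and date appear to be the same DATETIME column under two names
    -- (as posted); use one name consistently in real code.
    IF @FullYear = 1
    BEGIN
        DECLARE @FirstDate AS DATETIME
        DECLARE @LastDayFirstYear AS DATETIME  -- not used in this excerpt
        DECLARE @SecondYear AS INT
        DECLARE @NewYearsDay AS DATETIME
        DECLARE @LastDate AS DATETIME

        -- Earliest and latest dates in the whole submission
        SELECT @FirstDate = MIN(dscdate), @LastDate = MAX(dscdate)
        FROM TheTable

        -- 1 January of the year after the earliest date splits the two halves
        SELECT @SecondYear = DATEPART(yyyy, @FirstDate) + 1
        SELECT @NewYearsDay = CAST(CAST(@SecondYear AS VARCHAR)
                                   + '-01-01' AS DATETIME)

        -- Submitters with July-December rows but no January-June rows
        INSERT INTO @AuditResults
        SELECT DISTINCT
            'Submitter missing Jan-Jun data', t.id
        FROM TheTable t
        WHERE
            EXISTS (
                SELECT 1
                FROM TheTable t1
                WHERE t.id = t1.id
                  AND t1.date >= @FirstDate
                  AND t1.date < @NewYearsDay )
            AND NOT EXISTS (
                SELECT 1
                FROM TheTable t2
                WHERE t2.date >= @NewYearsDay
                  AND t2.date <= @LastDate
                  AND t2.id = t.id
                GROUP BY t2.id )
        GROUP BY t.id
    END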
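One possible alternative for the boundary date, offered as a sketch rather than as part of the script above, is the common `DATEADD`/`DATEDIFF` idiom, which avoids building the date from a string. The sample value of `@FirstDate` is hypothetical and stands in for `MIN(dscdate)`; the inline `DECLARE` initialisation assumes SQL Server 2008 or later.

```sql
-- Hypothetical earliest submission date, standing in for MIN(dscdate).
DECLARE @FirstDate DATETIME = '20080703';

-- DATEDIFF(year, 0, @FirstDate) counts the year boundaries crossed since the
-- base date 1900-01-01; adding that count plus one back to the base date
-- lands on 1 January of the following year.
DECLARE @NewYearsDay DATETIME =
    DATEADD(year, DATEDIFF(year, 0, @FirstDate) + 1, 0);

SELECT @NewYearsDay AS NewYearsDay;  -- 2009-01-01 00:00:00.000
```

Because no string is parsed, the result does not depend on the session's date format or language settings.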
Answer 2 (score: 0)
I later realized that I should check that there is data for both July-December and January-June. So this is what I ended up with in v2:
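    -- @avgmonths and @months are assumed to be declared earlier in the larger script.
    -- Average number of distinct calendar months per submitter; this decides
    -- whether the submission covers 6 or 12 months.
    -- (DATEADD/DATEDIFF truncates the date to the first of its month, so the
    -- expression is equivalent to DATEPART(month, dscdate).)
    SELECT @avgmonths = AVG(x.[count])
    FROM ( SELECT CAST(COUNT(DISTINCT DATEPART(month,
                           DATEADD(month,
                                   DATEDIFF(month, 0, dscdate),
                                   0))) AS FLOAT) AS [count]
           FROM HospDscDate
           GROUP BY hosp
         ) x

    IF @avgmonths > 7
        SET @months = 12
    ELSE
        SET @months = 6

    -- Flag every submitter with fewer distinct months than expected
    SELECT 'Submitter missing data for some months' AS [WarningType],
           t.id
    FROM TheTable t
    WHERE EXISTS ( SELECT 1
                   FROM TheTable t1
                   WHERE t.id = t1.id
                   HAVING COUNT(DISTINCT DATEPART(month,
                       DATEADD(month, DATEDIFF(month, 0, t1.Date), 0))) < @months )
    GROUP BY t.id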
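If it is also useful to see which half is missing, a different technique, conditional aggregation, can report both halves in a single pass. This is a minimal sketch and not the method used above: the table variable `@Submissions` and its rows are hypothetical, it assumes the table holds exactly one July-June reporting year, and it assumes SQL Server 2008 or later for the multi-row `VALUES`.

```sql
-- Hypothetical sample data: 'AAA' has both halves, 'CCC' only January-June.
DECLARE @Submissions TABLE (id CHAR(3), dscdate DATETIME);
INSERT INTO @Submissions (id, dscdate) VALUES
    ('AAA', '20080715'), ('AAA', '20090210'),
    ('CCC', '20090120'), ('CCC', '20090530');

-- Count the rows in each half per submitter and keep those with an empty half.
-- Assumes a single reporting year, so the month number alone identifies the half.
SELECT id,
       SUM(CASE WHEN MONTH(dscdate) >= 7 THEN 1 ELSE 0 END) AS JulDecRows,
       SUM(CASE WHEN MONTH(dscdate) <= 6 THEN 1 ELSE 0 END) AS JanJunRows
FROM @Submissions
GROUP BY id
HAVING SUM(CASE WHEN MONTH(dscdate) >= 7 THEN 1 ELSE 0 END) = 0
    OR SUM(CASE WHEN MONTH(dscdate) <= 6 THEN 1 ELSE 0 END) = 0;
```

With the sample rows above, only 'CCC' is returned, with JulDecRows = 0 and JanJunRows = 2.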