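How can I count distinct on multiple fields without repeating the query?

Time: 2012-03-12 08:53:22

Tags: sql count sql-server-2000 distinct

I have a query with several groupings that returns a count per month. Something like this: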

SELECT field1, field2, year(someDate), month(someDate), count(*) as myCount
FROM myTable
WHERE field5 = 'test'
GROUP BY field1, field2, year(someDate), month(someDate)

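The problem is that I want the counts to be distinct per day, based on the id field plus the date field (without the time). That is, for each month I want the number of distinct IDs per day. So I want something like this: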

SELECT field1, field2, year(someDate), month(someDate), 
       count(distinct someID, someDate) as myCount
FROM myTable
WHERE field5 = 'test'
GROUP BY field1, field2, year(someDate), month(someDate)

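There are two problems with this:

  1. You can't list two different fields in a COUNT(DISTINCT ...) aggregate.
  2. It would also include the time portion of the date, so it wouldn't filter anything out, because the times will almost always differ.

I can easily handle #2 by converting to a date-only varchar, but I'm not sure how to deal with the multiple-distinct-fields problem. I can't use this solution, because I don't want to repeat the entire WHERE clause and GROUP BY clause. This is what I came up with: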

    SELECT field1, field2, year(someDate), month(someDate), 
           count(distinct someID + CONVERT(VARCHAR, someDate, 112)) as myCount
    FROM myTable
    WHERE field5 = 'test'
    GROUP BY field1, field2, year(someDate), month(someDate)
    

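Instead of listing the distinct fields in a comma-delimited list, I simply concatenated them. Are there any downsides to this approach that I should be aware of? Can I count on it being accurate? And is there a better way to accomplish this?

Basically, I'm grouping by month, but the "distinct" count should be based on the day. So if I have an id of 31 on Jan 3rd and on Jan 5th, I want it to count as 2 for January, but if I have the id twice on Jan 3rd, I only want it counted once.

Some basic sample data and expected output (skipping field1 and field2 for this):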

    *Date*              *ID*
    1/3/12 00:00:09     22
    1/3/12 00:13:00     22
    1/4/12 12:00:00     22
    1/7/12 15:00:45     27
    1/15/12 15:00:00    22
    2/6/12 00:00:09     50
    2/8/12 00:13:00     44
    2/8/12 12:00:00     45
    2/22/12 15:00:45    33
    2/22/12 15:00:00    33
    2/22/12 15:00:00    44
    
    *Year*  *Month* *Count*
    2012    Jan     4
    2012    Feb     5
    

2 Answers:

Answer 0 (score: 1)

Updated

Based on your sample data, this gives the desired result:

Declare @Tab table ([Date] datetime,ID int)
insert into @Tab([Date],ID) values
('2012-01-03T00:00:09.000', 22),
('2012-01-03T00:13:00.000', 22),
('2012-01-04T12:00:00.000', 22),
('2012-01-07T15:00:45.000', 27),
('2012-01-15T15:00:00.000', 22),
('2012-02-06T00:00:09.000', 50),
('2012-02-08T00:13:00.000', 44),
('2012-02-08T12:00:00.000', 45),
('2012-02-22T15:00:45.000', 33),
('2012-02-22T15:00:00.000', 33),
('2012-02-22T15:00:00.000', 44)

select DATEADD(month,DATEDIFF(month,0,[Date]),0) as MonthStart,SUM(distinctDayIDs)
from
(
    SELECT DATEADD(day,DATEDIFF(day,0,[Date]),0) as [Date], 
           count(distinct ID) as distinctDayIDs
    FROM @Tab
    --WHERE field5 = 'test'
    GROUP BY DATEADD(day,DATEDIFF(day,0,[Date]),0)
) t
group by DATEADD(month,DATEDIFF(month,0,[Date]),0)

I think that, because we have to count per day, we have to do this as two separate grouping operations.
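
A variation on the same idea, offered only as a sketch: collapse the rows to distinct (someID, day) pairs in a derived table first, then count rows per month. It assumes the table and column names from the question (myTable, field1, field2, field5, someID, someDate); the aliases d, someDay, theYear and theMonth are made up for illustration. The WHERE clause appears only once, and it uses nothing beyond SELECT DISTINCT and a derived table.

```sql
-- Sketch only: names come from the question; aliases are illustrative.
SELECT d.field1,
       d.field2,
       YEAR(d.someDay)  AS theYear,
       MONTH(d.someDay) AS theMonth,
       COUNT(*)         AS myCount   -- one row per distinct (someID, day)
FROM (
    -- One row per field1 / field2 / someID / calendar day (time stripped).
    SELECT DISTINCT
           field1,
           field2,
           someID,
           DATEADD(day, DATEDIFF(day, 0, someDate), 0) AS someDay
    FROM myTable
    WHERE field5 = 'test'
) AS d
GROUP BY d.field1, d.field2, YEAR(d.someDay), MONTH(d.someDay);
```

Because duplicates are removed before the outer grouping, an ID that appears twice on the same day contributes one row, while the same ID on two different days contributes two.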


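Old answer

It sounds like the desired output is field1, field2, the date, and the number of distinct IDs on that date?

If so, I think you're overcomplicating it: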

SELECT field1, field2, DATEADD(day,DATEDIFF(day,0,someDate),0) as Date, 
       count(distinct someID) as myCount
FROM myTable
WHERE field5 = 'test'
GROUP BY field1, field2, DATEADD(day,DATEDIFF(day,0,someDate),0)

(I'm using DATEADD / DATEDIFF to strip the time portion, rather than converting to varchar.)
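
On the concatenation idea from the question, one thing to watch: if someID is an integer column (an assumption, the question doesn't say), then `someID + CONVERT(VARCHAR, someDate, 112)` is evaluated as integer addition, because int takes precedence over varchar. Different pairs can then collapse into the same value, for example 22 + 20120104 and 23 + 20120103 both give 20120126. A minimal sketch of a safer single-query form, casting the ID explicitly and adding a separator (the '|' delimiter, the VARCHAR(20) length, and the theYear/theMonth aliases are illustrative choices):

```sql
-- Sketch only: assumes someID is numeric and fits in VARCHAR(20).
SELECT field1,
       field2,
       YEAR(someDate)  AS theYear,
       MONTH(someDate) AS theMonth,
       -- Cast the ID to text so '+' concatenates instead of adding;
       -- style 112 yields yyyymmdd, which drops the time portion.
       COUNT(DISTINCT CAST(someID AS VARCHAR(20)) + '|'
                      + CONVERT(VARCHAR(8), someDate, 112)) AS myCount
FROM myTable
WHERE field5 = 'test'
GROUP BY field1, field2, YEAR(someDate), MONTH(someDate);
```

Rows where someID or someDate is NULL typically produce a NULL key and are not counted, since COUNT(DISTINCT ...) ignores NULLs.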

Answer 1 (score: 0)

You could try a count with "over partition":

SELECT 
   field1, field2, someID, someDate, 
   count(*) OVER(PARTITION BY someID, someDate) as myCount
FROM myTable
WHERE field5 = 'test'
GROUP BY field1, field2, someID, someDate

Or prepare a CTE to select from:

;with cte as (
   select someDate, count( someID) as myCount
   from myTable
   group by someDate)
 select m.field1, m.field2, m.someID, m.someDate, cte.myCount
 from myTable m inner join cte 
   on m.someDate = cte.someDate
 where ...