I created a table like this:
CREATE TABLE #TEMP(RecordDate datetime, First VARCHAR(255), Last VARCHAR(255), Value int)
INSERT INTO #TEMP VALUES('2011-03-01 00:00:00.000','john','smith','10')
INSERT INTO #TEMP VALUES('2011-03-01 00:00:00.000','john','adams','60')
INSERT INTO #TEMP VALUES('2011-03-01 00:00:00.000','john','resig','90')
INSERT INTO #TEMP VALUES('2011-03-01 00:00:00.000','john','balte','95')
INSERT INTO #TEMP VALUES('2011-03-01 01:00:00.000','john','smith','98')
INSERT INTO #TEMP VALUES('2011-03-01 01:00:00.000','john','adams','67')
INSERT INTO #TEMP VALUES('2011-03-01 01:00:00.000','john','resig','24')
INSERT INTO #TEMP VALUES('2011-03-01 01:00:00.000','john','balte','20')
SELECT * FROM #TEMP
DROP TABLE #TEMP
It now contains the following records:
RecordDate               First  Last   Value
2011-03-01 00:00:00.000  john   smith  10
2011-03-01 00:00:00.000  john   adams  60
2011-03-01 00:00:00.000  john   resig  90
2011-03-01 00:00:00.000  john   balte  95
2011-03-01 01:00:00.000  john   smith  98
2011-03-01 01:00:00.000  john   adams  67
2011-03-01 01:00:00.000  john   resig  24
2011-03-01 01:00:00.000  john   balte  20
I am trying to get a table like this:
RecordDate               First  Good  Bad
2011-03-01 00:00:00.000  john   3     1
2011-03-01 01:00:00.000  john   2     2
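The way I calculate Good and Bad is to take the MAX of Value over everyone with the first name john on a particular date, then apply it as a filter on the original data set for that date and first name. Only values greater than 0.5*MAXValue are counted as Good; the rest are Bad.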
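In the result table, the first row has 3 Good values because the maximum for the first date is 95 and only 60, 90, 95 are greater than 0.5*95, so the result is (Good,Bad) = (3,1). For the second row, by the same reasoning, it is (2,2).
My table is fairly large, with nearly 300 million records, and I can't work out where to start to do this efficiently. Any suggestions on an efficient approach?
My current (working but expensive) approach is as follows:
[query not preserved in the source]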
Answer 0 (score: 3)
Here you go:
select
t.RecordDate,
COUNT(case
when t.Value > MV.MaxValue * 0.5 then 1
else null
end) Good,
COUNT(case
when t.Value <= MV.MaxValue * 0.5 then 1
else null
end) Bad
from #Temp t inner join
(select RecordDate, MAX(Value) MaxValue
from #Temp Group By RecordDate) MV on t.RecordDate = MV.RecordDate
Group by t.RecordDate
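The trick is to create a derived table containing the maximum value for each record date, and then INNER JOIN that derived table with the table itself. Once the maximum values are resolved, you can access them directly on every row.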
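To make the effect of that join visible before any counting happens, here is a minimal sketch (not part of the original answer) that lists each sample row next to its per-date maximum and the label it would receive. It rebuilds the question's #TEMP sample data so it can run on its own; the Label alias is an illustrative name, and the multi-row VALUES syntax assumes SQL Server 2008 or later.

```sql
-- Rebuild the sample data from the question (temp table, dropped at the end).
CREATE TABLE #TEMP (RecordDate datetime, [First] VARCHAR(255), [Last] VARCHAR(255), Value int);
INSERT INTO #TEMP VALUES
    ('2011-03-01 00:00:00.000', 'john', 'smith', 10),
    ('2011-03-01 00:00:00.000', 'john', 'adams', 60),
    ('2011-03-01 00:00:00.000', 'john', 'resig', 90),
    ('2011-03-01 00:00:00.000', 'john', 'balte', 95),
    ('2011-03-01 01:00:00.000', 'john', 'smith', 98),
    ('2011-03-01 01:00:00.000', 'john', 'adams', 67),
    ('2011-03-01 01:00:00.000', 'john', 'resig', 24),
    ('2011-03-01 01:00:00.000', 'john', 'balte', 20);

-- Every row joined to the maximum of its own RecordDate, labelled but not yet counted.
SELECT t.RecordDate,
       t.[Last],
       t.Value,
       MV.MaxValue,
       CASE WHEN t.Value > MV.MaxValue * 0.5 THEN 'Good' ELSE 'Bad' END AS Label
FROM #TEMP t
INNER JOIN (SELECT RecordDate, MAX(Value) AS MaxValue
            FROM #TEMP
            GROUP BY RecordDate) MV
        ON t.RecordDate = MV.RecordDate
ORDER BY t.RecordDate, t.Value;

-- Value / MaxValue / Label on the sample data:
--   00:00 rows: 10/95 Bad, 60/95 Good, 90/95 Good, 95/95 Good
--   01:00 rows: 20/98 Bad, 24/98 Bad, 67/98 Good, 98/98 Good

DROP TABLE #TEMP;
```

Counting the labels per date gives (3,1) and (2,2), which matches the result the question asks for.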
Update
I see you updated your question to include the first name in the result. Never fear, here is the solution:
select
t.RecordDate,
t.First,
COUNT(case
when t.Value > MV.MaxValue * 0.5 then 1
else null
end) Good,
COUNT(case
when t.Value <= MV.MaxValue * 0.5 then 1
else null
end) Bad
from #Temp t inner join
(select RecordDate, First, MAX(Value) MaxValue
from #Temp Group By RecordDate, First) MV
on (t.RecordDate = MV.RecordDate and t.First = MV.First)
Group by t.RecordDate, t.First
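Answer 1 (score: 1)
Nested queries that reference the outer query can cause a lot of repeated work. This computes the MAX for every name and date just once:
SELECT RecordDate, First, MAX(Value) FROM #TEMP GROUP BY RecordDate, First
Now join that back to the original data: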
SELECT A.RecordDate, A.First,
SUM(CASE WHEN A.Value > B.MaxVal*0.5 THEN 1 ELSE 0 END) AS GOOD,
SUM(CASE WHEN A.Value > B.MaxVal*0.5 THEN 0 ELSE 1 END) AS BAD
FROM #TEMP A INNER JOIN
(SELECT RecordDate, First, MAX(Value) as MaxVal
FROM #TEMP GROUP BY RecordDate, First) B
ON (A.RecordDate = B.RecordDate AND A.First = B.First)
GROUP BY A.RecordDate, A.First
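As a possible variation on the same idea (a different technique from the join above, and not something either answer proposes), the per-group maximum can also be attached to each row with a window function, so the table is referenced only once in the query text. This is a sketch under the assumption that SQL Server 2008 or later is available; the alias W is illustrative, and whether it actually beats the join on a table of this size depends on the indexes and the execution plan, so it would need to be measured.

```sql
-- Rebuild the sample data from the question (temp table, dropped at the end).
CREATE TABLE #TEMP (RecordDate datetime, [First] VARCHAR(255), [Last] VARCHAR(255), Value int);
INSERT INTO #TEMP VALUES
    ('2011-03-01 00:00:00.000', 'john', 'smith', 10),
    ('2011-03-01 00:00:00.000', 'john', 'adams', 60),
    ('2011-03-01 00:00:00.000', 'john', 'resig', 90),
    ('2011-03-01 00:00:00.000', 'john', 'balte', 95),
    ('2011-03-01 01:00:00.000', 'john', 'smith', 98),
    ('2011-03-01 01:00:00.000', 'john', 'adams', 67),
    ('2011-03-01 01:00:00.000', 'john', 'resig', 24),
    ('2011-03-01 01:00:00.000', 'john', 'balte', 20);

-- MAX(...) OVER (PARTITION BY ...) puts the group maximum on every row,
-- then the outer query counts rows above and at-or-below half of it.
SELECT W.RecordDate,
       W.[First],
       SUM(CASE WHEN W.Value > W.MaxVal * 0.5 THEN 1 ELSE 0 END) AS Good,
       SUM(CASE WHEN W.Value > W.MaxVal * 0.5 THEN 0 ELSE 1 END) AS Bad
FROM (SELECT RecordDate,
             [First],
             Value,
             MAX(Value) OVER (PARTITION BY RecordDate, [First]) AS MaxVal
      FROM #TEMP) AS W
GROUP BY W.RecordDate, W.[First]
ORDER BY W.RecordDate;

-- On the sample data: the 00:00 row gives Good = 3, Bad = 1
--                     the 01:00 row gives Good = 2, Bad = 2

DROP TABLE #TEMP;
```

Either way, an index that leads with RecordDate and First and covers Value is the kind of thing worth checking in the execution plan before running this against the full table.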