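I'm trying to find the first and last value in a group, something like First([Open]), Max([High]), Min([Low]), Last([Close]).
Below is one of the queries (it currently lacks the logic for the Open/Close columns). The dataset is very large (over 150 million records), so query performance may become an issue.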
Select 'AUDCHF' AS CURRENCY,
Datepart(year, Datekey) AS [YEAR],
Datepart(month, Datekey) AS [MONTH],
Datepart(day, Datekey) AS [DAY],
Case When Datepart(hour, Datekey) BETWEEN 0 AND 11 Then 'AM' Else 'PM' End AS [12 Hour],
Case
When Datepart(hour, Datekey) BETWEEN 0 AND 3 Then '1st 4 Hours'
When Datepart(hour, Datekey) BETWEEN 4 AND 7 Then '2nd 4 Hours'
When Datepart(hour, Datekey) BETWEEN 8 AND 11 Then '3rd 4 Hours'
When Datepart(hour, Datekey) BETWEEN 12 AND 15 Then '4th 4 Hours'
When Datepart(hour, Datekey) BETWEEN 16 AND 19 Then '5th 4 Hours'
Else '6th 4 Hours'
End AS [4 Hours],
Datepart(hour, Datekey) AS [HOUR],
max(High) AS HIGH,
min(Low) AS LOW
From AUDCHF
Group by Datepart(year, Datekey), Datepart(month, Datekey), Datepart(day, Datekey),
Case When Datepart(hour, Datekey) BETWEEN 0 AND 11 Then 'AM' Else 'PM' End,
Case
When Datepart(hour, Datekey) BETWEEN 0 AND 3 Then '1st 4 Hours'
When Datepart(hour, Datekey) BETWEEN 4 AND 7 Then '2nd 4 Hours'
When Datepart(hour, Datekey) BETWEEN 8 AND 11 Then '3rd 4 Hours'
When Datepart(hour, Datekey) BETWEEN 12 AND 15 Then '4th 4 Hours'
When Datepart(hour, Datekey) BETWEEN 16 AND 19 Then '5th 4 Hours'
Else '6th 4 Hours'
End,
Datepart(hour, Datekey)
Order by Datepart(year, Datekey), Datepart(month, Datekey), Datepart(day, Datekey),
Case When Datepart(hour, Datekey) BETWEEN 0 AND 11 Then 'AM' Else 'PM' End,
Case
When Datepart(hour, Datekey) BETWEEN 0 AND 3 Then '1st 4 Hours'
When Datepart(hour, Datekey) BETWEEN 4 AND 7 Then '2nd 4 Hours'
When Datepart(hour, Datekey) BETWEEN 8 AND 11 Then '3rd 4 Hours'
When Datepart(hour, Datekey) BETWEEN 12 AND 15 Then '4th 4 Hours'
When Datepart(hour, Datekey) BETWEEN 16 AND 19 Then '5th 4 Hours'
Else '6th 4 Hours'
End,
Datepart(hour, Datekey)
Answer 0 (score: 1)
ORDER BY can use the aliases defined in the SELECT list, because it is evaluated after the SELECT part (this is not the case for GROUP BY).
In your query, the ORDER BY clause can be:
Order by [YEAR], [MONTH], [DAY], [4 Hours],[HOUR]
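Since you already group by year/month/day/hour, and the hour determines the 4-hour bucket, I think you can remove the 4 Hours part.
I would use window functions and wrap them in an outer select with GROUP BY to remove the duplicates.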
select [YEAR], [MONTH], [DAY], [HOUR], [12 Hour], [4 Hours],
max([HIGH]) as HIGH, min([LOW]) as LOW,
max([Open]) as [Open], max([Close]) as [Close]
from (
select
Datepart(year, Datekey) AS [YEAR],
Datepart(month, Datekey) AS [MONTH],
Datepart(day, Datekey) AS [DAY],
Datepart(hour, Datekey) AS [HOUR],
Case When Datepart(hour, Datekey) BETWEEN 0 AND 11 Then 'AM' Else 'PM' End AS [12 Hour],
Case
When Datepart(hour, Datekey) BETWEEN 0 AND 3 Then '1st 4 Hours'
When Datepart(hour, Datekey) BETWEEN 4 AND 7 Then '2nd 4 Hours'
When Datepart(hour, Datekey) BETWEEN 8 AND 11 Then '3rd 4 Hours'
When Datepart(hour, Datekey) BETWEEN 12 AND 15 Then '4th 4 Hours'
When Datepart(hour, Datekey) BETWEEN 16 AND 19 Then '5th 4 Hours'
Else '6th 4 Hours'
End AS [4 Hours],
max(High) over(
partition by
Datepart(year, Datekey) ,
Datepart(month, Datekey) ,
Datepart(day, Datekey),
Datepart(hour, Datekey)
) as [HIGH],
min(Low) over(
partition by
Datepart(year, Datekey) ,
Datepart(month, Datekey),
Datepart(day, Datekey),
Datepart(hour, Datekey)
) as [LOW],
first_value([Open]) over(
partition by
Datepart(year, Datekey) ,
Datepart(month, Datekey),
Datepart(day, Datekey),
Datepart(hour, Datekey)
-- order by time inside the hour, so the first row is the earliest one
order by Datekey
) as [Open],
last_value([Close]) over(
partition by
Datepart(year, Datekey) ,
Datepart(month, Datekey),
Datepart(day, Datekey),
Datepart(hour, Datekey)
-- order by time and extend the frame to the whole partition;
-- the default frame stops at the current row
order by Datekey
rows between unbounded preceding and unbounded following
) as [Close]
from AUDCHF ) T
group by [YEAR], [MONTH], [DAY], [HOUR], [12 Hour], [4 Hours]
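The outer max(High), min(Low), etc. are only there to keep the GROUP BY happy; the values are already computed in the inner select, so the outer aggregates do no real work. (I don't know exactly what Open and Close are, so I just took the first and last value over the same partition.)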
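One point worth checking with FIRST_VALUE and LAST_VALUE: the window ORDER BY has to order the rows inside each partition (by Datekey here), and LAST_VALUE needs a frame that reaches the end of the partition, because the default frame stops at the current row. Below is a minimal sketch on made-up rows. The temp table #Ticks, its values, and the HourStart column are assumptions for illustration, not the real AUDCHF data.

```sql
-- Hypothetical sample data: three ticks in the 10:00 hour, one in the 11:00 hour.
CREATE TABLE #Ticks (
    Datekey datetime     NOT NULL,
    [Open]  decimal(9,5) NOT NULL,
    [High]  decimal(9,5) NOT NULL,
    [Low]   decimal(9,5) NOT NULL,
    [Close] decimal(9,5) NOT NULL
);

INSERT INTO #Ticks (Datekey, [Open], [High], [Low], [Close]) VALUES
    ('2013-01-02T10:00:00', 0.9500, 0.9510, 0.9490, 0.9505),
    ('2013-01-02T10:20:00', 0.9505, 0.9530, 0.9500, 0.9520),
    ('2013-01-02T10:40:00', 0.9520, 0.9525, 0.9480, 0.9485),
    ('2013-01-02T11:05:00', 0.9485, 0.9495, 0.9470, 0.9490);

-- One row per hour. DISTINCT collapses the per-tick duplicates, which works
-- because every window below covers the whole partition.
SELECT DISTINCT
    DATEADD(hour, DATEDIFF(hour, 0, Datekey), 0) AS HourStart,
    FIRST_VALUE([Open]) OVER (
        PARTITION BY DATEDIFF(hour, 0, Datekey)
        ORDER BY Datekey
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS [Open],
    MAX([High]) OVER (PARTITION BY DATEDIFF(hour, 0, Datekey)) AS [High],
    MIN([Low])  OVER (PARTITION BY DATEDIFF(hour, 0, Datekey)) AS [Low],
    LAST_VALUE([Close]) OVER (
        PARTITION BY DATEDIFF(hour, 0, Datekey)
        ORDER BY Datekey
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS [Close]
INTO #Bars
FROM #Ticks;

SELECT * FROM #Bars ORDER BY HourStart;
```

On these sample rows the 10:00 bar should come out as Open 0.9500, High 0.9530, Low 0.9480, Close 0.9485, and the 11:00 bar simply repeats its single tick.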
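If this query has to run on a large table, and since there is no WHERE clause to reduce the number of rows selected, I would create an index on Datekey that includes the High and Low columns (plus the other columns the query needs, such as Open and Close) to avoid a full table scan. It will result in a full index scan instead, which is likely to be much faster: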
create nonclustered index IxAudchf on AUDCHF(Datekey) include( [High], [Low], [Open], [Close]) ;
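For SQL window functions, you can find presentations here and here
Note: FIRST_VALUE and LAST_VALUE are only available from SQL Server 2012 onward, not in 2008.
If you are running SQL 2005 or 2008, the following should be equivalent (and probably less efficient). I took Low and Close from the last row; I'm not sure that is what you want, so if I misunderstood, change it to follow your logic.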
; WITH
WAUDCHF1 as
( select
row_number() over(
partition by
Datepart(year, Datekey), Datepart(month, Datekey) ,
Datepart(day, Datekey), Datepart(hour, Datekey)
order by Datekey -- number rows by time inside each hour; ordering by the partition keys leaves the order undefined
) as [Rownum],
Datepart(year, Datekey) AS [YEAR],
Datepart(month, Datekey) AS [MONTH],
Datepart(day, Datekey) AS [DAY],
Datepart(hour, Datekey) AS [HOUR],
Case When Datepart(hour, Datekey) BETWEEN 0 AND 11 Then 'AM' Else 'PM' End AS [12 Hour],
Case
When Datepart(hour, Datekey) BETWEEN 0 AND 3 Then '1st 4 Hours'
When Datepart(hour, Datekey) BETWEEN 4 AND 7 Then '2nd 4 Hours'
When Datepart(hour, Datekey) BETWEEN 8 AND 11 Then '3rd 4 Hours'
When Datepart(hour, Datekey) BETWEEN 12 AND 15 Then '4th 4 Hours'
When Datepart(hour, Datekey) BETWEEN 16 AND 19 Then '5th 4 Hours'
Else '6th 4 Hours'
End AS [4 Hours],
max(High) over(
partition by
Datepart(year, Datekey) , Datepart(month, Datekey) ,
Datepart(day, Datekey), Datepart(hour, Datekey)
) as [HIGH],
min(Low) over(
partition by
Datepart(year, Datekey) , Datepart(month, Datekey),
Datepart(day, Datekey), Datepart(hour, Datekey)
) as [LOW],
[Open],
[Close]
from AUDCHF ),
LASTROWNUM as (
select [YEAR], [MONTH], [DAY], [HOUR], max(rownum) as [Rownum]
from WAUDCHF1
group by [YEAR], [MONTH], [DAY], [HOUR], [12 Hour], [4 Hours]
)
select W1.[YEAR], W1.[MONTH], W1.[DAY], W1.[HOUR],
max(W1.[High]) as [High], min(W2.[Low]) as [Low],
max(W1.[Open]) as [Open], max(w2.[Close]) as [Close]
from LASTROWNUM M
inner join WAUDCHF1 W1 on M.[YEAR] = W1.[YEAR]
and M.[MONTH]= W1.[MONTH]
and M.[DAY] = W1.[DAY]
and M.[HOUR] = W1.[HOUR]
inner join WAUDCHF1 W2 on W2.[YEAR] = M.[YEAR]
and W2.[MONTH]= M.[MONTH]
and W2.[DAY] = M.[DAY]
and W2.[HOUR] = M.[HOUR]
and W2.Rownum = M.Rownum
Where W1.Rownum = 1
group by W1.[YEAR], W1.[MONTH], W1.[DAY], W1.[HOUR], w1.[12 Hour], W1.[4 Hours]
order by W1.[YEAR], W1.[MONTH], W1.[DAY], W1.[HOUR], w1.[12 Hour], W1.[4 Hours]
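As a possible simplification for SQL 2005/2008 (a sketch of a different technique, not the query above): number the rows of each hour twice, once ascending and once descending by Datekey, then pick Open and Close with conditional aggregation. This avoids the two self-joins. The temp table #Ticks2008, its values, and the HourStart column are made up for illustration.

```sql
-- Hypothetical sample data, inserted with UNION ALL so it also runs on SQL 2005.
CREATE TABLE #Ticks2008 (
    Datekey datetime     NOT NULL,
    [Open]  decimal(9,5) NOT NULL,
    [High]  decimal(9,5) NOT NULL,
    [Low]   decimal(9,5) NOT NULL,
    [Close] decimal(9,5) NOT NULL
);

INSERT INTO #Ticks2008 (Datekey, [Open], [High], [Low], [Close])
SELECT '2013-01-02T10:00:00', 0.9500, 0.9510, 0.9490, 0.9505 UNION ALL
SELECT '2013-01-02T10:20:00', 0.9505, 0.9530, 0.9500, 0.9520 UNION ALL
SELECT '2013-01-02T10:40:00', 0.9520, 0.9525, 0.9480, 0.9485 UNION ALL
SELECT '2013-01-02T11:05:00', 0.9485, 0.9495, 0.9470, 0.9490;

WITH Numbered AS (
    SELECT
        DATEADD(hour, DATEDIFF(hour, 0, Datekey), 0) AS HourStart,
        [Open], [High], [Low], [Close],
        -- RnFirst = 1 marks the earliest tick of the hour
        ROW_NUMBER() OVER (PARTITION BY DATEDIFF(hour, 0, Datekey)
                           ORDER BY Datekey ASC)  AS RnFirst,
        -- RnLast = 1 marks the latest tick of the hour
        ROW_NUMBER() OVER (PARTITION BY DATEDIFF(hour, 0, Datekey)
                           ORDER BY Datekey DESC) AS RnLast
    FROM #Ticks2008
)
SELECT
    HourStart,
    MAX(CASE WHEN RnFirst = 1 THEN [Open]  END) AS [Open],
    MAX([High])                                 AS [High],
    MIN([Low])                                  AS [Low],
    MAX(CASE WHEN RnLast  = 1 THEN [Close] END) AS [Close]
INTO #Bars2008
FROM Numbered
GROUP BY HourStart;

SELECT * FROM #Bars2008 ORDER BY HourStart;
```

Each CASE matches exactly one row per hour, so the MAX around it just picks that row's value. Whether this beats the join version on 150 million rows depends on the indexes, so it is worth comparing the execution plans.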
Answer 1 (score: 0)
The query:
SELECT 'AUDCHF' AS CURRENCY,
Datepart(year, Datekey) AS [YEAR], Datepart(month, Datekey) AS [MONTH],
Datepart(day, Datekey) AS [DAY], [12 Hour], [4 Hours],
Datepart(hour, Datekey) AS [HOUR], High AS HIGH, Low AS LOW,
(SELECT High FROM Rate AS R WHERE R.Datekey = (SELECT MIN(Datekey)
FROM Rate WHERE DATEADD(hour, DATEDIFF(hour, 0, Rate.Datekey), 0) =
AUDCHF.Datekey AND Rate.Base = 'AUD' AND Rate.Target = 'CHF')
AND R.Base = 'AUD' AND R.Target = 'CHF') AS [Open],
(SELECT Low FROM Rate AS R WHERE R.Datekey = (SELECT MAX(Datekey)
FROM Rate WHERE DATEADD(hour, DATEDIFF(hour, 0, Rate.Datekey), 0) =
AUDCHF.Datekey AND Rate.Base = 'AUD' AND Rate.Target = 'CHF')
AND R.Base = 'AUD' AND R.Target = 'CHF') AS [Close]
FROM AUDCHF, Segment
WHERE Segment.Hour = Datekey
ORDER BY Datepart(year, Datekey), Datepart(month, Datekey),
Datepart(day, Datekey), Datepart(hour, Datekey);
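will return the result you expect. I also extracted the CASE expressions into a support table, as you can see on SQLFiddle. The fiddle also shows the query's output on some test data. To truncate the time to the hour, this uses the answer from "T-SQL datetime rounded to nearest minute and nearest hours with using functions".
Basically, the view AUDCHF truncates Datekey and performs the grouping. The query then joins it with the Segment table to pull in the constant strings, and computes the opening and closing values. Those have to be subqueries because they are not related to the aggregation.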
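The support table and the view live only on the fiddle, so the following is a guess at the idea rather than the actual definitions: a 24-row lookup table keyed by hour of day that replaces the two CASE expressions, plus the DATEADD/DATEDIFF truncation to the hour. The names #Segment, #Check and @d are hypothetical.

```sql
-- Hypothetical lookup table: one row per hour of day (0..23).
CREATE TABLE #Segment (
    [Hour]    int         NOT NULL PRIMARY KEY,
    [12 Hour] varchar(2)  NOT NULL,
    [4 Hours] varchar(11) NOT NULL
);

-- Generate 0..23 from any catalog view with at least 24 rows.
INSERT INTO #Segment ([Hour], [12 Hour], [4 Hours])
SELECT h.n,
       CASE WHEN h.n < 12 THEN 'AM' ELSE 'PM' END,
       CASE h.n / 4                 -- integer division gives buckets 0..5
           WHEN 0 THEN '1st 4 Hours'
           WHEN 1 THEN '2nd 4 Hours'
           WHEN 2 THEN '3rd 4 Hours'
           WHEN 3 THEN '4th 4 Hours'
           WHEN 4 THEN '5th 4 Hours'
           ELSE        '6th 4 Hours'
       END
FROM (SELECT TOP (24) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS n
      FROM sys.all_objects) AS h;

-- Truncate a sample timestamp to the hour and look up its labels.
DECLARE @d datetime;
SET @d = '2013-01-02T10:40:00';

SELECT DATEADD(hour, DATEDIFF(hour, 0, @d), 0) AS HourStart,
       s.[12 Hour],
       s.[4 Hours]
INTO #Check
FROM #Segment AS s
WHERE s.[Hour] = DATEPART(hour, @d);

SELECT * FROM #Check;
```

For the sample timestamp this should return 2013-01-02 10:00 with 'AM' and '3rd 4 Hours'. Joining on DATEPART(hour, ...) is my assumption about how the lookup is keyed.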
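Of course, you will need indexes on the table to keep performance acceptable. If you don't keep other data in the main table, or if you create a custom index, most of the data should end up cached.
Since the data is all historical, you could also prepare a materialized view for quick reference.
The currency pair handling is only partial and would be better done in a top-level view, to avoid the repeated constants. It does show how the rates can be merged into a single table so that adding new pairs is simpler.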