I want to create some aggregates from a table, but I can't work out how to do it.
Example table:
DECLARE @MyTable TABLE(person INT, the_date date, the_value int)
INSERT INTO @MyTable VALUES
(1,'2017-01-01', 10),
(1,'2017-02-01', 5),
(1,'2017-03-01', 5),
(1,'2017-04-01', 10),
(1,'2017-05-01', 2),
(2,'2017-04-01', 10),
(2,'2017-05-01', 10),
(2,'2017-05-01', 0),
(3,'2017-01-01', 2)
For each person present at that time, I want to average the values of the last x months (@months_back), counting back from some start date (@start_date):
DECLARE @months_back int, @start_date date
set @months_back = 3
set @start_date = '2017-05-01'
SELECT person, avg(the_value) as avg_the_value
FROM @MyTable
where the_date <= @start_date and the_date >= dateadd(month, -@months_back, @start_date)
group by person
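This works. I now want to do the same thing again, but after skipping back some number of months (@month_skip) from the start date. Then I want to union the two result sets together. Then I want to skip back another @month_skip months from that date and do the same thing again. I want to keep doing this until I have skipped past some specified date (@min_date).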
DECLARE @months_back int, @month_skip int, @start_date date, @min_date date
set @months_back = 3
set @month_skip = 2
set @start_date = '2017-05-01'
set @min_date = '2017-03-01'
Using the variables above and the table @MyTable, the result should be:
person | avg_the_value
1 | 5
2 | 6
1 | 6
3 | 2
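There is only one skip here because @min_date is 2 months back, but I want to be able to do multiple skips depending on @min_date.
This example table is simple, but the real table has many more automatically created columns, so using a table variable is not feasible: I would have to declare the schema of the result table.
I asked a related question here, but could not find any answer that solves this problem.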
Answer 0 (score: 0)
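It sounds like what you are trying to do is the following:
Start from a date (e.g. 2017-05-01), look back @months_back months, and define a range of dates. For example, if we look back 3 months, we define a range from 2017-02-01 to 2017-05-01.
After defining this range, we return to the start date and define a new start date by going back @month_skip months. For example, with an initial start date of 2017-05-01, we might skip 2 months, giving us a new start date of 2017-03-01.
We take this new start date and define a corresponding range of dates (as above). This yields a range of 2016-12-01 to 2017-03-01.
We repeat this as many times as needed, down to the specified minimum date, to produce the list of date ranges we want to calculate for:
2017-02-01 through 2017-05-01
2016-12-01 through 2017-03-01
... etc ...
For each period, look at each person and calculate their average.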
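To sanity-check the period boundaries listed above, here is a minimal sketch that prints only the periods. It uses the variable values from the question, and the small hard-coded number list `n(i)` is an assumption for illustration, not part of the answer's query.

```sql
-- Variable values taken from the question
DECLARE @months_back int = 3, @month_skip int = 2,
        @start_date date = '2017-05-01', @min_date date = '2017-03-01';

SELECT n.i AS period,
       -- each period starts @months_back months before its own end date
       DATEADD(month, -(@month_skip * n.i) - @months_back, @start_date) AS startOfPeriod,
       -- each period ends @month_skip months before the previous period's end date
       DATEADD(month, -(@month_skip * n.i), @start_date) AS endOfPeriod
FROM (VALUES (0), (1), (2), (3), (4)) AS n(i)  -- hypothetical number list, illustration only
WHERE DATEADD(month, -(@month_skip * n.i), @start_date) >= @min_date
ORDER BY n.i;
```

With these values it should return two rows: 2017-02-01 through 2017-05-01 and 2016-12-01 through 2017-03-01. The next candidate period would end on 2017-01-01, which is before @min_date.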
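The query below should do what is described above. Rather than taking a value and iterating to calculate the previous one, we use a numbers table to calculate the offset of each interval, which determines the end date and start date of each interval/period. This query was built on SQL Server 2008 R2 and should be compatible with later versions.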
/* Table, data, variable declarations */
DECLARE @MyTable TABLE(person INT, the_date date, the_value int)
INSERT INTO @MyTable VALUES
(1,'2017-01-01', 10),
(1,'2017-02-01', 5),
(1,'2017-03-01', 5),
(1,'2017-04-01', 10),
(1,'2017-05-01', 2),
(2,'2017-04-01', 10),
(2,'2017-05-01', 10),
(2,'2017-05-01', 0),
(3,'2017-01-01', 2)
DECLARE @months_back int, @month_skip int, @start_date date, @min_date date
set @months_back = 3
set @month_skip = 2
set @start_date = '2017-05-01'
set @min_date = '2017-01-01'
/* Common table expression to build list of Integers */
/* reference http://www.itprotoday.com/software-development/build-numbers-table-you-need if you want more info */
declare @end_int bigint = 50
; WITH IntegersTableFill (ints) AS
(
SELECT
CAST(0 AS BIGINT) AS 'ints'
UNION ALL
SELECT (T.ints + 1) AS 'ints'
FROM IntegersTableFill T
WHERE ints <= (
CASE
WHEN (@end_int <= 32767) THEN @end_int
ELSE 32767
END
)
)
/* What we're going to do is define a series of periods.
These periods have a start date and an end date, and will simplify grouping
(in place of the calculate-and-union approach)
*/
/* Now, we start defining the periods
@start_date defines the end of the most recent period.
@month_skip defines the amount of time we have to jump back for each period.
@months_back defines how far each period looks back from its own end date.
*/
/* Using the number table we defined above and the data in our variables, calculate start and end dates */
,periodEndDates as
(
select ints as Period
,DATEADD(month, -(@month_skip*ints), @start_date) as endOfPeriod
from IntegersTableFill itf
)
,periodStartDates as
(
select *
,DATEADD(month, -(@months_back), endOfPeriod) as startOfPeriod
from periodEndDates
)
,finalPeriodData as
(
select (period) as period, startOfPeriod, endOfPeriod from periodStartDates
)
/* Link the entries in our original data to the periods they fall into */
/* NOTE: The join criteria originally specified allows values to fall into multiple periods.
You may want to fix this?
*/
,periodTableJoin as
(
select * from finalPeriodData fpd
inner join @MyTable mt
on mt.the_date >= fpd.startOfPeriod
and mt.the_date <= fpd.endOfPeriod
/* @min_date limits which periods are kept (by their end date), not which rows are read */
where fpd.endOfPeriod >= @min_date
)
/* Calculate averages, grouping by period and person */
,periodValueAggregate as
(
select person, avg(the_value) as avg_the_value from
periodTableJoin
group by period, person
)
select * from periodValueAggregate
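One detail that affects the numbers: the_value is an int, and in SQL Server AVG over an int expression returns an int, so the fractional part is truncated. That is why the expected output in the question shows 5 for person 1 (22 / 4) rather than 5.5. If fractional averages are wanted, casting before averaging is one option. A minimal sketch, using a hypothetical table variable @t that holds person 1's four values from the first period:

```sql
DECLARE @t TABLE (the_value int);
INSERT INTO @t VALUES (5), (5), (10), (2);

SELECT AVG(the_value) AS int_avg,                          -- int in, int out: 22 / 4 truncates to 5
       AVG(CAST(the_value AS decimal(10, 2))) AS dec_avg   -- decimal in, decimal out: 5.5
FROM @t;
```

The same cast could be applied inside periodValueAggregate if the truncated result is not what you want.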
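Answer 1 (score: 0)
The approach I would suggest is set-based rather than iterative. (I don't entirely follow your question, but please follow up and we can work out any differences.) Essentially, you want to divide the calendar into periods of interest. The periods are of equal width and contiguous. To do this, I suggest building a calendar table and tagging the periods by partitioning, as shown in the code: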
DECLARE @CalStart DATE = '2017-01-01'
,@CalEnd DATE = '2018-01-01'
,@CalWindowSize INT = 2
;WITH Numbers AS
(
SELECT TOP (DATEDIFF(MONTH, @CalStart, @CalEnd)) N = CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS INT) - 1
FROM syscolumns
)
SELECT CalWindow = N / @CalWindowSize
,CalDate = DATEADD(MONTH, N, @CalStart)
FROM Numbers
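With the variables configured correctly, you should have a calendar representing the windows of interest.
Then it is just a matter of joining this calendar to your data set and grouping not only by person but also by CalWindow: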
DECLARE @MyTable TABLE(person INT, the_date date, the_value int)
INSERT INTO @MyTable VALUES
(1,'2017-01-01', 10),
(1,'2017-02-01', 5),
(1,'2017-03-01', 5),
(1,'2017-04-01', 10),
(1,'2017-05-01', 2),
(2,'2017-04-01', 10),
(2,'2017-05-01', 10),
(2,'2017-05-01', 0),
(3,'2017-01-01', 2)
----------------------------------
-- Build Calendar
----------------------------------
DECLARE @CalStart DATE = '2017-01-01'
,@CalEnd DATE = '2018-01-01'
,@CalWindowSize INT = 2
;WITH Numbers AS
(
SELECT TOP (DATEDIFF(MONTH, @CalStart, @CalEnd)) N = CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS INT) - 1
FROM syscolumns
)
,Calendar AS
(
SELECT CalWindow = N / @CalWindowSize
,CalDate = DATEADD(MONTH, N, @CalStart)
FROM Numbers
)
SELECT TB.Person
,AVG(TB.the_value) AS avg_the_value
FROM @MyTable TB
JOIN Calendar CL ON TB.the_date = CL.CalDate
GROUP BY CL.CalWindow, TB.person
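Because the query above does not return CalWindow, it can be hard to tell which window each row belongs to. The variant below is a sketch under a few assumptions: the column name window_start and the ORDER BY are additions for readability, and sys.all_columns is swapped in as the row source for the numbers list.

```sql
DECLARE @MyTable TABLE (person int, the_date date, the_value int);
INSERT INTO @MyTable VALUES
(1,'2017-01-01', 10), (1,'2017-02-01', 5), (1,'2017-03-01', 5),
(1,'2017-04-01', 10), (1,'2017-05-01', 2), (2,'2017-04-01', 10),
(2,'2017-05-01', 10), (2,'2017-05-01', 0), (3,'2017-01-01', 2);

DECLARE @CalStart date = '2017-01-01', @CalEnd date = '2018-01-01', @CalWindowSize int = 2;

WITH Numbers AS
(
    -- one row per month between @CalStart and @CalEnd, numbered from 0
    SELECT TOP (DATEDIFF(MONTH, @CalStart, @CalEnd))
           N = CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS int) - 1
    FROM sys.all_columns
)
,Calendar AS
(
    -- integer division assigns consecutive months to the same window
    SELECT CalWindow = N / @CalWindowSize
          ,CalDate = DATEADD(MONTH, N, @CalStart)
    FROM Numbers
)
SELECT CL.CalWindow
      ,DATEADD(MONTH, CL.CalWindow * @CalWindowSize, @CalStart) AS window_start  -- hypothetical label column
      ,TB.person
      ,AVG(TB.the_value) AS avg_the_value
FROM @MyTable TB
JOIN Calendar CL ON TB.the_date = CL.CalDate
GROUP BY CL.CalWindow, TB.person
ORDER BY CL.CalWindow, TB.person;
```

Note that these windows are non-overlapping and counted forward from @CalStart, whereas the windows in the question overlap (3 months back, stepping 2 months), so the averages will not match the expected output in the question.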
Hope I understood your question.