所以,我有一个问题:在有可能产品的情况下,过去12个月(计算正在处理的产品代码)获得实际的总和/平均值,通过产品代码进行分区一个月内不存在。
尝试从TL开始; DR,让我们去实际的事情:
我尝试过:
SELECT * FROM (
SELECT year, month, product,
AVG(value) OVER (
PARTITION BY product
ORDER BY year, month
ROWS 11 PRECEDING
) as average,
SUM(value) OVER (
PARTITION BY product
ORDER BY year, month
ROWS 11 PRECEDING
) as sum
FROM suchDB.muchUSER.awesomeTABLE
) q
where year = <insert year> and month = <month>
问题:
我们曾经使用'GROUP BY product'
和'WHERE ((year-1)*100)+month > queriedYear-1Month AND (year*100)+month <= queriedYearMonth'
查询此内容,直到有人指出我们处于OVER / PARTITION方向并且我们改变了所有内容......但即使这样我们也遇到了一些问题,例如{{1函数将忽略NULL月......
帮助?
忘记一件非常重要的事情
awesomeTABLE中的数据是版本化的 - 每年/每月可以有多个版本。只能使用最新版本。我通常通过加入avg()
来做到这一点,但它似乎会杀死一些可能的解决方案......
答案 0 :(得分:2)
最大的问题是,您需要获取两个列表才能正确汇总数据集中的值 - 一个日期列表和一个产品列表。如果没有这两个列表,最后一个月中缺少的产品意味着不会报告产品,或者(如您已发现的那样)可能会汇总错误的12个月(缺少7月意味着11个前置行包括开始月份)。
以下是对生成这些列表的过程的全面扩展探索。它只使用源数据表(假设每个月都出售某些。它可以更简洁(即计算上面shawnt示例中的日期),但写入显示所有我将它封装到一个存储过程中,因为它显示了显式传递的值。
CREATE PROCEDURE DoTheThing
@startDate DATE -- Should be Year-Month-01 or YearMonth01
AS
BEGIN
DECLARE @yr INT, @mth INT,
@yr2 INT, @mth2 INT,
@endDate DATE -- StartDt - 11 months
-- if the date will be passed in with a day other than 01, add code here to set the day on the passed date to 01
-- if only the high year and month are passed in, then create a @startDt value and continue.
SET @endDate = DATEADD(MONTH, -11, @startDate)
SELECT @yr = DATEPART(YEAR, @startDate),
@mth = DATEPART(MONTH, @startdate),
@yr2 = DATEPART(YEAR, @startDate),
@mth2 = DATEPART(MONTH, @startdate)
WITH mthYr AS (
SELECT DISTINCT
YEAR,
MONTH
FROM suchDB.muchUSER.awesomeTABLE -- Get the data from the source table
WHERE (
YEAR = @yr -- if in the passed-in year, then take all months less than or equal to the start month
AND MONTH <= @mth
)
OR (
YEAR = @yr2 -- if the period is Jan -- Dec in one year, this reiterates the above
AND MONTH >= @mth2 -- if not, select the months in the second year where the month is greater than or equal to the calculated month
)
),
prods AS (
SELECT DISTINCT product -- Return a list of products sold during the year.
FROM suchDB.muchUSER.awesomeTABLE smt
INNER JOIN mthYr
ON mthYr.YEAR = smt.YEAR
AND mthyr.MONTH = smt.MONTH
)
SELECT MAX(mthYr.YEAR) AS year, -- current report only shows passed in year/month value
MAX(mthYr.month) AS month,
prods.product,
AVG(ISNULL(VALUE, 0.00)) average, -- isNull value adds a zero into the list to be averaged
SUM(ISNULL(VALUE, 0.00)) SUM -- not really necessary, but no warnings about NULL values will be generated
FROM mthYr CROSS JOIN prods -- cross join of the two lists means all of the products sold will have a value for each month
LEFT JOIN suchDB.muchUSER.awesomeTABLE smt -- left join so missing productMonths will still be added in
ON smt.YEAR = mthYr.year
AND smt.MONTH = mthYr.month
AND prods.product = smt.product
ORDER BY smt.product
END
答案 1 :(得分:1)
如果你有一个产品表和一个日期表,我会把上面的查询加入,以便表示所有产品和所有日期,然后用你拥有的方法对结果求和。
您的第一颗子弹应该被解决,因为所有日期和所有产品都将被代表
我相信你的第二颗子弹是由“前面的第11行”真正获得过去的11个值而不管日期;这可以通过将0替换为当前缺失的月份/产品组合的null来解决,但是在左连接方法的情况下将提供。
答案 2 :(得分:1)
对于缺失的月份有很多可能的修复方法。这是一种不会过多改变原始查询的方法:
WITH ym as (
select
year(dateadd(month, n, cast('<year>' + '<month>' + '01' as date))) as y,
month(dateadd(month, n, cast('<year>' + '<month>' + '01' as date))) as m
from (values
(0), (-1), (-2), (-3), (-4), (-5),
(-6), (-7), (-8), (-9), (-10), (-11)
) ofs(n)
)
SELECT
y, m, p.product,
AVG(coalesce(value, 0)) OVER (
PARTITION BY p.product
ORDER BY y, m
ROWS 11 PRECEDING
) as average,
SUM(value) OVER (
PARTITION BY p.product
ORDER BY y, m
ROWS 11 PRECEDING
) as sum
FROM
ym cross join
(select distinct product from suchDB.muchUSER.awesomeTABLE) p
left outer join suchDB.muchUSER.awesomeTABLE t
on t."year" = ym.y and t."month" = ym.m and t.product = p.product
where y = <insert year> and m = <month>
而不是最后的where
,您可以order by y, m desc
和select top 1
。我通常会尽量避免使用top
,但我不确定你是如何放弃参数的,并且避免使用某些编程库两次引用它们可能不那么麻烦。即使你手工完成,你仍然需要记住在长时间的查询中查看两个地方。
由于它似乎只需要一个月的聚合,我认为你可以使用这个不使用窗口函数的简单版本:
WITH ym as (
select
dateadd(month, n, cast('<year>' + '<month>' + '01' as date) as dt,
year(dateadd(month, n, cast('<year>' + '<month>' + '01' as date))) as y,
month(dateadd(month, n, cast('<year>' + '<month>' + '01' as date))) as m
from (values
(0), (-1), (-2), (-3), (-4), (-5),
(-6), (-7), (-8), (-9), (-10), (-11)
) ofs(n)
)
SELECT
year(max(dt)) as "year", month(max(dt)) as "month", p.product,
AVG(coalesce(value, 0)) as average,
SUM(value) as "sum"
FROM
ym cross join
(select distinct product from awesome) p
left outer join awesome t
on t."year" = ym.y and t."month" = ym.m and t.product = p.product
GROUP BY p.product
我不知道你是否只想限制去年实际销售的产品,所以我没有处理那些产品。
如果你开始考虑如何使它更通用和可重复使用,你可能会更像这个。我继续将产品限制添加到前一年的活动中:
WITH dt as (
select dateadd(month, n, cast('<year>' + '<month>' + '01' as date) as dt
),
ym as (
select dt, year(dateadd(month, n, dt)) as y, month(dateadd(month, n, dt)) as m
from (values
(0), (-1), (-2), (-3), (-4), (-5),
(-6), (-7), (-8), (-9), (-10), (-11)
) ofs(n), dt
)
SELECT
year(max(dt)) as "year", month(max(dt)) as "month", p.product,
AVG(coalesce(value, 0)) as average, SUM(value) as "sum"
FROM
ym cross join
(
select distinct product from awesome
where cast("year" + "month" + '01' as date) between
(select min(dt) from ym) and (select max(dt) from ym)
) p
left outer join (
select distinct /* get the latest "version" only */
first_value("year")
over (partition by "year", "month", product order by version desc),
first_value("month")
over (partition by "year", "month", product order by version desc),
product,
first_value(value)
over (partition by "year", "month", product order by version desc)
from awesome
) t
on t."year" = ym.y and t."month" = ym.m and t.product = p.product
GROUP BY p.product
最终查询还会尝试仅针对最新版本处理您的过滤器。但是,first_value()
函数需要更高版本的SQL Server。
答案 3 :(得分:1)
使用日期时要做的第一件事是拥有一个日期时间字段,而不是年,月,日的分隔字段
SELECT year, month, product, value
, DATEFROMPARTS(year, month, 1) fullDate
FROM suchDB.muchUSER.awesomeTABLE
由于行可以进行版本控制,我们需要获取每年,每月和每月产品的最新版本,这可以通过多种方式完成,例如使用窗口函数或自动连接,例如后者是
SELECT b.[year], b.[month], b.[product], [value]
, DATEFROMPARTS(b.[year], b.month, 1) fullDate
FROM suchDB.muchUSER.awesomeTABLE b
INNER JOIN (SELECT [year], [month], [product], max([version]) lv
FROM suchDB.muchUSER.awesomeTABLE
GROUP BY [year], [month], [product]
) m ON b.[year] = m.year AND b.month = m.month
AND b.product = m.product AND b.[version] = m.lv
同时拥有产品清单会很方便
SELECT DISTINCT product
FROM suchDB.muchUSER.awesomeTABLE
获取最后一年的数据可以通过多种方式完成,我喜欢CROSS APPLY
Declare @_year int
Declare @_month int
Set @_year = 2015
Set @_month = 11
;With baseDate AS (
SELECT b.[year], b.[month], b.[product], [value]
, DATEFROMPARTS(b.[year], b.month, 1) fullDate
FROM suchDB.muchUSER.awesomeTABLE b
INNER JOIN (SELECT [year], [month], [product], max([version]) lv
FROM suchDB.muchUSER.awesomeTABLE
GROUP BY [year], [month], [product]
) m ON b.[year] = m.year AND b.month = m.month
AND b.product = m.product AND b.[version] = m.lv
), Products AS (
SELECT DISTINCT [product]
FROM suchDB.muchUSER.awesomeTABLE
)
SELECT @_year [Year], @_month [Month], p.[product]
, ly.Average
, ly.[Sum]
FROM Products p
CROSS APPLY (SELECT Sum(lastYear.Value) / 12.0 Average
, Sum(lastYear.Value) [Sum]
FROM baseDate lastYear
WHERE lastYear.fullDate > DATEFROMPARTS(@_year - 1, @_month, 1)
AND lastYear.fullDate <= DATEFROMPARTS(@_year, @_month, 1)
AND lastYear.product = p.product
) ly
WHERE ly.[Sum] IS NOT NULL
WHERE
条件会删除参数
要删除变量并获取每月的值,需要一个日历表。如果表格中的所有产品都在那里,我们可以使用DISTINCT
获取日期,使用第一个CTE
代替表格,让我们也获得完整的日期
SELECT DISTINCT [year], [month], fullDate
FROM baseDate
否则,有不同的方法来创建日历表。我们可以将日历表添加到主查询CTE
中,并在CROSS APPLY
中使用它而不是变量
;With baseDate AS (
SELECT b.[year] _y, b.[month] _m, b.[product], [value]
, DATEFROMPARTS(b.[year], b.month, 1) fullDate
FROM suchDB.muchUSER.awesomeTABLE b
INNER JOIN (SELECT [year], [month], [product], max([version]) lv
FROM suchDB.muchUSER.awesomeTABLE
GROUP BY [year], [month], [product]
) m ON b.[year] = m.year AND b.month = m.month
AND b.product = m.product AND b.[version] = m.lv
), Products AS (
SELECT DISTINCT [product]
FROM suchDB.muchUSER.awesomeTABLE
), Months As (
SELECT DISTINCT _y, _m, fullDate
FROM baseDate
)
SELECT _y [Year], _m [Month], p.[product]
, ly.Average
, ly.[Sum]
, ly.[Count]
FROM Products p
CROSS APPLY (SELECT m._y, m._m
, Sum(bd.Value) / 12.0 Average
, Sum(bd.Value) [Sum]
, Count(Value) [Count]
FROM Months m
LEFT JOIN baseDate bd
ON bd.fullDate > DATEADD(YY, -1, m.fullDate)
AND bd.fullDate <= m.fullDate
WHERE bd.product = p.product
GROUP BY m._y, m._m
) ly
WHERE ly.[Sum] IS NOT NULL