Using Over(Partition By) when calculating a median moving average unit cost

Asked: 2016-07-27 20:27:18

Tags: sql sql-server-2014 window-functions median

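Good morning, I'm trying to calculate the 12-month moving average unit cost (MAUC) for each item in a particular warehouse. I'm using the "2012_B - paging trick" (http://sqlperformance.com/2012/08/t-sql-queries/median) to calculate the median price instead of using AVG, so that outliers cannot skew the result.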

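The following code works, but it only calculates the MAUC for a single item or for all items combined, depending on whether I keep or remove "AND t_item = 'xxxxx'":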

WITH Emily AS 

(SELECT 

 t_item AS [Item Code]
,t_mauc_1 AS [MAUC]

FROM twhina113100
WHERE t_cwar = '11'

AND t_item = '         TNC-C2050NP-G'

AND t_trdt > GETDATE()-365)


(SELECT 

AVG(1.0 * [Valuation Table].[MAUC])

FROM (
    SELECT [MAUC] FROM Emily
     ORDER BY [Emily].[MAUC]

     OFFSET ((SELECT COUNT(*) FROM Emily) - 1) / 2 ROWS
     FETCH NEXT 1 + (1 - (SELECT COUNT(*) FROM Emily) % 2) ROWS ONLY

)  AS [Valuation Table] )

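I believe Over(Partition By) could help me partition by t_item, but I don't know where to put it in the code. I'm quite new to SQL, and my lack of formal training is starting to show.

If you have any other suggestions, please share them.

Any help is greatly appreciated!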

1 answer:

Answer 0 (score: 2)

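This caught my attention, so I'm posting two options:

The first is a straight cte approach, and the second uses temp tables. The cte approach works well for smaller data sets, but performance suffers as the series grows.

Both options calculate the RUNNING Min, Max, Mean, Median and Mode of a data series.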

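Just a few items before we get into it. The normalized structure is ID and Measure:
- ID can be a date or an identity
- Measure is any numeric value
- Median is the middle value of the sorted series. With an even number of observations, we return the average of the two middle records
- Mode is reported as ModeR1 and ModeR2. If no value repeats, we show the min/max range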
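To make the median rule concrete, here is a minimal sketch of the row-position arithmetic on its own, outside the running calculation. The table variable `@Sample` and the names `cteRanked`, `RowNr` and `RowCnt` are illustrative assumptions, not part of the queries in this answer.

```sql
-- Hypothetical sample data in the same ID / Measure shape
Declare @Sample table (ID Int, Measure decimal(9,2))
Insert into @Sample (ID, Measure) values
(1,25),
(2,75),
(3,50),
(4,25)

-- Number the rows in sorted order and carry the total row count on every row
;with cteRanked as (
    Select Measure
          ,RowNr  = Row_Number() over (Order By Measure)
          ,RowCnt = Count(*)     over ()
     From  @Sample
)
-- Odd count: both bounds point at the single middle row
-- Even count: the bounds span the two middle rows, which are then averaged
Select Median = Avg(Measure)
 From  cteRanked
 Where RowNr between ceiling(RowCnt/2.0) and ceiling((RowCnt+1)/2.0)
```

With four rows the bounds are 2 and 3, so the sorted values 25 and 50 are averaged to 37.5. The same pair of `ceiling` expressions appears as the median row numbers in both approaches of this answer.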

  

OK, let's look at the cte approach:

Declare @Table table (ID Int,Measure decimal(9,2))
Insert into @Table (ID,Measure) values
(1,25),
(2,75),
(3,50),
(4,25),
(5,12),
(6,66),
(7,45)

;with cteBase as (Select *,RowNr = Row_Number() over (Order By ID) From  @Table),
      cteExpd as (Select A.*,Measure2 = B.Measure,ExtRowNr = Row_Number() over (Partition By A.ID Order By B.Measure) From cteBase A Join cteBase B on (B.RowNr<=A.RowNr)),
      cteMean as (Select ID,Mean=Avg(Measure2),Rows=Count(*) From cteExpd Group By ID),
      cteMedn as (Select ID,MedRow1=ceiling(Rows/2.0),MedRow2=ceiling((Rows+1)/2.0) From cteMean),
      cteMode as (Select ID,Mode=Measure2,ModeHits=count(*),ModeRowNr=Row_Number() over (Partition By ID Order By Count(*) Desc) From cteExpd Group By ID,Measure2)
 Select A.ID
       ,A.Measure
       ,MinVal  = min(Measure2)
       ,MaxVal  = max(Measure2)
       ,Mean    = max(B.Mean)
       ,Median  = isnull(Avg(IIF(ExtRowNr between MedRow1 and MedRow2,Measure2,null)),A.Measure)
       ,ModeR1  = isnull(max(IIf(ModeHits>1,D.Mode,null)),min(Measure2))
       ,ModeR2  = isnull(max(IIf(ModeHits>1,D.Mode,null)),max(Measure2))
  From  cteExpd A
  Join  cteMean B on (A.ID=B.ID)
  Join  cteMedn C on (A.ID=C.ID)
  Join  cteMode D on (A.ID=D.ID and ModeRowNr=1)
  Group By A.ID
          ,A.Measure
  Order By A.ID

Returns

ID  Measure MinVal  MaxVal  Mean        Median      ModeR1  ModeR2
1   25.00   25.00   25.00   25.000000   25.000000   25.00   25.00
2   75.00   25.00   75.00   50.000000   50.000000   25.00   75.00
3   50.00   25.00   75.00   50.000000   50.000000   25.00   75.00
4   25.00   25.00   75.00   43.750000   37.500000   25.00   25.00
5   12.00   12.00   75.00   37.400000   25.000000   25.00   25.00
6   66.00   12.00   75.00   42.166666   37.500000   25.00   25.00
7   45.00   12.00   75.00   42.571428   45.000000   25.00   25.00

For smaller data series, this cte approach is very light and fast.

  

Now the temp table approach:

-- Generate Base Data -- Key ID and Key Measure
Select ID     =TR_Date
      ,Measure=TR_Y10,RowNr = Row_Number() over (Order By TR_Date)
 Into  #Base
 From [Chinrus-Series].[dbo].[DS_Treasury_Rates] 
 Where Year(TR_Date)>=2013

-- Extend Base Data one-to-many
Select A.*,Measure2 = B.Measure,ExtRowNr = Row_Number() over (Partition By A.ID Order By B.Measure) into #Expd From #Base A Join #Base B on (B.RowNr<=A.RowNr) 
Create Index idx on #Expd (ID)

-- Generate Mean for Series
Select ID,Mean=Avg(Measure2),Rows=Count(*) into #Mean From #Expd Group By ID
Create Index idx on #Mean (ID)

-- Calculate Median Row Number(s)  -- If even(avg of middle two rows)
Select ID,MednRow1=ceiling(Rows/2.0),MednRow2=ceiling((Rows+1)/2.0) into #Medn From #Mean
Create Index idx on #Medn (ID)

-- Calculate Mode
Select * into #Mode from (Select ID,Mode=Measure2,ModeHits=count(*),ModeRowNr=Row_Number() over (Partition By ID Order By Count(*) Desc,Measure2 Desc) From #Expd Group By ID,Measure2) A where ModeRowNr=1
Create Index idx on #Mode (ID)

-- Generate Final Results
 Select A.ID
       ,A.Measure
       ,MinVal  = min(Measure2)
       ,MaxVal  = max(Measure2)
       ,Mean    = max(B.Mean)
       ,Median  = isnull(Avg(IIF(ExtRowNr between MednRow1 and MednRow2,Measure2,null)),A.Measure)
       ,ModeR1  = isnull(max(IIf(ModeHits>1,D.Mode,null)),min(Measure2))
       ,ModeR2  = isnull(max(IIf(ModeHits>1,D.Mode,null)),max(Measure2))
  From  #Expd A
  Join  #Mean B on (A.ID=B.ID)
  Join  #Medn C on (A.ID=C.ID)
  Join  #Mode D on (A.ID=D.ID and ModeRowNr=1)
  Group By A.ID
          ,A.Measure
  Order By A.ID

Returns

ID          Measure MinVal  MaxVal  Mean    Median  ModeR1  ModeR2
2013-01-02  1.86    1.86    1.86    1.86    1.86    1.86    1.86
2013-01-03  1.92    1.86    1.92    1.89    1.89    1.86    1.92
2013-01-04  1.93    1.86    1.93    1.9033  1.92    1.86    1.93
2013-01-07  1.92    1.86    1.93    1.9075  1.92    1.92    1.92
2013-01-08  1.89    1.86    1.93    1.904   1.92    1.92    1.92
...
2016-07-20  1.59    1.37    3.04    2.2578  2.24    2.20    2.20
2016-07-21  1.57    1.37    3.04    2.257   2.235   2.61    2.61
2016-07-22  1.57    1.37    3.04    2.2562  2.23    2.20    2.20
  

Both methods were validated in Excel.


I should add that in the final query you can of course add or remove items such as STD, Total, etc.
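For the per-item question itself, one possible way to bring in Over(Partition By) is sketched below. This is not the method used in either option above: it is an assumption-laden sketch that reuses the table and column names from the question (`twhina113100`, `t_item`, `t_mauc_1`, `t_cwar`, `t_trdt`) and has not been run against that table. The names `cteRanked`, `RowNr` and `RowCnt` are illustrative.

```sql
-- Rank each item's MAUC values separately and carry the per-item row count
;with cteRanked as (
    Select t_item
          ,t_mauc_1
          ,RowNr  = Row_Number() over (Partition By t_item Order By t_mauc_1)
          ,RowCnt = Count(*)     over (Partition By t_item)
     From  twhina113100
     Where t_cwar = '11'
       and t_trdt > GETDATE()-365
)
-- Keep only the middle row(s) of each item and average them
Select [Item Code] = t_item
      ,[MAUC]      = Avg(1.0 * t_mauc_1)
 From  cteRanked
 Where RowNr between ceiling(RowCnt/2.0) and ceiling((RowCnt+1)/2.0)
 Group By t_item
 Order By t_item
```

The partition takes the place of the `AND t_item = '...'` filter: every item gets its own row numbering and count, so the median rule is applied once per item rather than once for the whole warehouse.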