Average amount from start date to end date

Time: 2018-07-22 08:16:54

Tags: sql teradata

I am using Teradata SQL and have a table like this:

cust_id start_dt    end_dt      amount  is_current_y_n
12345   1/8/2018    7/8/2018    7044    N
12345   7/9/2018    7/10/2018   8142    N
12345   7/11/2018   7/13/2018   7643    N
12345   7/14/2018   7/14/2018   8630    N
12345   7/14/2018   7/19/2018   5597    N
12345   7/20/2018   12/31/9999  5680    Y

Another case I have seen:

cust_id start_dt    end_dt      amount  is_current_y_n
54321   1/1/2015    12/31/9999  8650    Y

I need to use SQL to calculate the average amount over the past:

7 days
30 days
90 days
180 days

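By "average" I mean a day-weighted average: if the amount was 1000 for the first 3 of the past 7 days and then changed to 2000 for the remaining 4, the average should be:

(1000 x 3 + 2000 x 4) / 7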
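
The arithmetic above can be sanity-checked by running it as a query. This is only an illustrative sketch: the CAST to DECIMAL(18,2) is an assumption added here so that the division is not carried out in integer arithmetic.

```sql
-- 3 days at 1000 and 4 days at 2000, divided by the 7 days in the window.
SELECT CAST(1000 * 3 + 2000 * 4 AS DECIMAL(18,2)) / 7 AS weighted_avg;
```

By hand, (1000 x 3 + 2000 x 4) / 7 = 11000 / 7, which is roughly 1571.43.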

I tried joining the table to a date (calendar) table, but that is not efficient.

Is there an efficient way to achieve this?

3 Answers:

Answer 0: (score: 2)

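This can be done with a recursive common table expression (CTE) that unfolds each date range into one row per day.
Once there is an amount for every date, the CTE can be joined back to the table to get those averages.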

I cannot test this SQL on Teradata (I don't have one),
but it should (probably) run on that RDBMS almost as written.

WITH RECURSIVE CTE (cust_id, dt, amount, start_dt, end_dt) AS
(
  -- Seed: one row per range, starting at its first day.
  -- Ranges longer than 4200 days (e.g. the open-ended 9999-12-31 row) are
  -- capped at start_dt so the recursion does not run on to the year 9999.
  SELECT cust_id, start_dt as dt, amount, start_dt,
  case when end_dt - start_dt > 4200 then start_dt else end_dt end
  FROM table_as_such
  UNION ALL
  -- Recursive step: add one day until the (capped) end date is reached.
  SELECT cust_id, dt+1, amount, start_dt, end_dt
  FROM CTE
  WHERE dt < end_dt
)
-- For each range start, average the daily amounts over the preceding days.
-- Note that BETWEEN is inclusive at both ends.
SELECT t.cust_id, t.start_dt
, ROUND(AVG(case when CTE.dt between t.start_dt - 7 and t.start_dt then CTE.amount end),2) as avg7
, ROUND(AVG(case when CTE.dt between t.start_dt - 30 and t.start_dt then CTE.amount end),2) as avg30
, ROUND(AVG(case when CTE.dt between t.start_dt - 90 and t.start_dt then CTE.amount end),2) as avg90
, ROUND(AVG(case when CTE.dt between t.start_dt - 180 and t.start_dt then CTE.amount end),2) as avg180
FROM table_as_such t
JOIN CTE ON (CTE.cust_id = t.cust_id AND CTE.dt between t.start_dt - 180 and t.start_dt)
GROUP BY t.cust_id, t.start_dt
ORDER BY t.cust_id, t.start_dt;

Sample data used:

create table table_as_such (id int not null primary key, cust_id int, start_dt date, end_dt date, amount int, is_current_y_n char(1));
insert into table_as_such values (1,12345,'2018-01-08','2018-07-08',7044,'N');
insert into table_as_such values (2,12345,'2018-07-09','2018-07-10',8142,'N');
insert into table_as_such values (3,12345,'2018-07-11','2018-07-13',7643,'N');
insert into table_as_such values (4,12345,'2018-07-14','2018-07-14',8630,'N');
insert into table_as_such values (5,12345,'2018-07-14','2018-07-19',5597,'N');
insert into table_as_such values (6,12345,'2018-07-20','9999-12-31',5680,'Y');
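
One way to avoid generating a row per day at all is to weight each range by the number of days it overlaps the window. The sketch below is not part of the query above and has not been tested on Teradata. It assumes a hypothetical fixed reporting date of 2018-07-22, a 7-day window running from 2018-07-16 through 2018-07-22, and the table_as_such sample table; it also relies on the difference of two DATE values being a number of days.

```sql
-- Day-weighted average over a hypothetical 7-day window
-- (2018-07-16 through 2018-07-22), without expanding to daily rows.
SELECT o.cust_id
     , CAST(SUM(o.amount * o.overlap_days) AS DECIMAL(18,2))
       / SUM(o.overlap_days) AS avg_7
FROM (
    SELECT t.cust_id
         , t.amount
           -- Days shared by the row's range and the window:
           -- (earlier of the two end dates) - (later of the two start dates) + 1
         , (CASE WHEN t.end_dt   < DATE '2018-07-22' THEN t.end_dt   ELSE DATE '2018-07-22' END)
         - (CASE WHEN t.start_dt > DATE '2018-07-16' THEN t.start_dt ELSE DATE '2018-07-16' END)
         + 1 AS overlap_days
    FROM table_as_such t
    -- Keep only rows that overlap the window at all.
    WHERE t.start_dt <= DATE '2018-07-22'
      AND t.end_dt   >= DATE '2018-07-16'
) AS o
GROUP BY o.cust_id;
```

With the sample rows, the window overlaps the 5597 row for 4 days and the 5680 row for 3 days, so the result should be (5597 x 4 + 5680 x 3) / 7, roughly 5632.57. Dividing by SUM(overlap_days) instead of a constant 7 keeps the result sensible for customers whose history starts inside the window. Rows whose ranges overlap each other (such as the two that start on 2018-07-14) would be counted twice on the shared days.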

Answer 1: (score: 1)

In this case Teradata's temporal features may help you, specifically the PERIOD data type and the EXPAND ON clause.

Have a look at this example to understand the feature and how it fits what you want to do:

database demo;

create table demoDateExpand (
  myID integer
 ,myUser VARCHAR(100)
 ,myAmount DECIMAL(10,2)
 ,startDT DATE
 ,endDT   DATE
) no primary index;

insert into demoDateExpand values (1, 'User01', 2.5, '2018-01-01', '2018-01-05');
insert into demoDateExpand values (2, 'User01', 3.0, '2018-01-08', '2018-01-15');
insert into demoDateExpand values (3, 'User01', 1.5, '2018-01-11', '2018-01-25');
insert into demoDateExpand values (4, 'User02', 2.0, '2018-01-01', '2018-01-15');
insert into demoDateExpand values (5, 'User02', 2.5, '2018-01-05', '2018-01-25');
insert into demoDateExpand values (6, 'User02', 4.5, '2018-01-26', '2018-01-27');
insert into demoDateExpand values (7, 'User03', 1.0, '2018-01-10', '2018-01-15');
insert into demoDateExpand values (8, 'User03', 3.5, '2018-01-16', '2018-01-25');

select myID
      ,myUser
      ,myAmount
      ,startDT
      ,endDT
      ,period(startDT, endDT)
  from demoDateExpand
;


select myID
      ,myUser
      ,myAmount
      ,BEGIN(myDate)
  from demoDateExpand
  expand on period(startDT, endDT) AS myDate BY ANCHOR DAY
  order by myID, myDate
;
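
To go from the expanded rows to an average, the expansion can be wrapped in a derived table and aggregated. This is a hedged sketch that has not been tested here. The window dates are hypothetical, and it assumes that the ending bound of a Teradata PERIOD is exclusive, which is why endDT + 1 is used so that the last day of each range is included (an open-ended 9999-12-31 end date would overflow and would need to be capped first).

```sql
-- Average daily amount per user over a hypothetical 7-day window
-- (2018-01-09 through 2018-01-15) in the demoDateExpand sample table.
select e.myUser
      ,avg(e.myAmount) as avgAmount
  from (
        select myUser
              ,myAmount
              ,BEGIN(myDate) as calDate   -- one row per day in the range
          from demoDateExpand
          expand on period(startDT, endDT + 1) AS myDate BY INTERVAL '1' DAY
       ) as e
 where e.calDate between DATE '2018-01-09' and DATE '2018-01-15'
 group by e.myUser
;
```

Where a user's ranges overlap, as several do in this demo data, each overlapping range contributes its own row for the shared days, so the average is taken over all of those rows.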

Answer 2: (score: 0)

I managed to build my own query with the help of a table of dates:

2017-07-11
2017-07-12
...

My query is:

sel 
    c.cust_id
    ,avg(case when c.cal_dt between '2017-07-01' and '2018-01-01' then c.amount end) as avg_180
    ,avg(case when c.cal_dt between '2017-10-01' and '2018-01-01' then c.amount end) as avg_90
    ,avg(case when c.cal_dt between '2017-12-01' and '2018-01-01' then c.amount end) as avg_30
    ,avg(case when c.cal_dt between '2017-12-24' and '2018-01-01' then c.amount end) as avg_7
from
(
sel b.cust_id
    ,a.cal_dt
    ,b.amount
from
(
sel *
from CALENDAR_DAILY_TABLE
where cal_dt between '2017-07-01' and '2018-01-01'
) as a

join

(
sel *
from MY_TABLE
where end_dt >= '2017-07-01' and start_dt <= '2018-01-01'  -- keep every range that overlaps the window, including ones that started before it
) as b

on  b.start_dt<=a.cal_dt and a.cal_dt<=b.end_dt

) as c
where c.cust_id ='12345'
group by c.cust_id

The result is:

cust_id     avg_180     avg_90      avg_30      avg_7
12345       1.34        1.34        1.34        1.34

Thanks!