Joining specific days of data to a daily time series (Teradata SQL)

Time: 2018-04-18 18:07:58

Tags: sql teradata

I had a hard time summarizing what I am looking to do in the title, but the example below should make it clear. I am trying to write a more efficient query in Teradata that will be used in Tableau. I can get it done with the brute-strength-and-ignorance approach, but after a while I run out of spool space, so I need to make it more efficient.

Let's say I have two tables: a customer table with customer attributes and a daily balance table (it is more complex than that, but this is the important part). I want to write a query that returns the daily balance per customer along with other columns that hold a specific day's balance for that customer, regardless of the date on the row in the final result.

Example:

Customer Table

CustID | CustState | CustType | ...
001    | NY        | A        | ...
002    | CA        | B        | ...    
003    | NC        | C        | ...

Balance Table

CustID | Date      | Balance
001    |04/01/2018 | 100
001    |04/02/2018 | 105
001    |04/03/2018 | 110
002    |04/01/2018 | 5000
002    |04/02/2018 | 15000
002    |04/03/2018 | 25

Final Query Results

CustID | CustState | Date      | Balance | Balance42 | Balance43
001    | NY        |04/01/2018 | 100     | 105       | 110
001    | NY        |04/02/2018 | 105     | 105       | 110
001    | NY        |04/03/2018 | 110     | 105       | 110
002    | CA        |04/01/2018 | 5000    | 15000     | 25
002    | CA        |04/02/2018 | 15000   | 15000     | 25
002    | CA        |04/03/2018 | 25      | 15000     | 25

As you can see, the first four columns are straightforward; the last two hold the balances from 4/2/2018 and 4/3/2018, respectively. I am currently doing this as follows, using multiple joins/subqueries to get the specific balances:

select a.CustID
  , a.CustState
  , b.Date
  , sum(b.Balance) as Balance
  , c.Balance as Balance42
  , d.Balance as Balance43

from Customer a

inner join Balance b on a.CustID=b.CustID

inner join (
  select aa.CustID
    , sum(bb.Balance) as Balance
  from Customer aa
  inner join Balance bb on aa.CustID=bb.CustID
  where aa.CustType in ('A','B')
    and bb.Date=DATE '2018-04-02'
  group by aa.CustID
) c on a.CustID=c.CustID 

inner join (
  select aa.CustID
    , sum(bb.Balance) as Balance
  from Customer aa
  inner join Balance bb on aa.CustID=bb.CustID
  where aa.CustType in ('A','B')
    and bb.Date=DATE '2018-04-03'
  group by aa.CustID
) d on a.CustID=d.CustID

where a.CustType in ('A','B')

group by a.CustID
  , a.CustState
  , b.Date
  , c.Balance
  , d.Balance

Is there a way to do this with only one join/subquery so that it is more efficient? I start to run out of spool space when I add too many joins/subqueries, but I have a specific business reason for needing this result structure.

3 Answers:

Answer 0: (score: 3)

You need conditional aggregation, but in your case it has to be based on a windowed aggregate:

select a.CustID
  , a.CustState
  , b.Date
  , sum(b.Balance) as Balance 

  , max(case when b.Date=DATE '2018-04-02' then sum(b.Balance) end)
    over (partition by a.CustID) as Balance42

  , max(case when b.Date=DATE '2018-04-03' then sum(b.Balance) end)
    over (partition by a.CustID) as Balance43

from Customer a

inner join Balance b on a.CustID=b.CustID

where a.CustType in ('A','B')

group by a.CustID
  , a.CustState
  , b.Date

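Answer 1: (score: 0)

I'm not sure I fully understand what you are trying to do, but it seems you should be able to do the last two calculations in a single statement using CASE expressions: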

select a.CustID
  , a.CustState
  , b.Date
  , sum(b.Balance) as Balance
  , sum(case when b.Date = DATE '2018-04-02' then b.Balance else null end) as Balance42
  , sum(case when b.Date = DATE '2018-04-03' then b.Balance else null end) as Balance43
from Customer a

inner join Balance b on a.CustID=b.CustID

-- grouping added so the non-aggregated columns are valid
group by a.CustID
  , a.CustState
  , b.Date
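
One caveat with plain conditional aggregation, assuming the query is grouped by customer and date as in the question: each group contains a single date, so the CASE expression matches only in the group for that date and the column is NULL on every other row. The sketch below illustrates this with Python's sqlite3 module instead of Teradata; the column name BalDate and the ISO text dates are assumptions made for SQLite.

```python
import sqlite3

# One customer from the question's sample data is enough to show the effect.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Balance (CustID TEXT, BalDate TEXT, Balance INTEGER);
    INSERT INTO Balance VALUES
        ('001','2018-04-01',100), ('001','2018-04-02',105), ('001','2018-04-03',110);
""")

rows = conn.execute("""
    SELECT CustID, BalDate, SUM(Balance) AS Balance,
           -- non-NULL only in the group whose date is 2018-04-02
           SUM(CASE WHEN BalDate = '2018-04-02' THEN Balance END) AS Balance42
    FROM Balance
    GROUP BY CustID, BalDate
    ORDER BY BalDate
""").fetchall()

for row in rows:
    print(row)
```

Only the 2018-04-02 row gets a Balance42 value; the other two rows come back with None. Repeating the value on every row of the customer therefore needs a window over the customer, or a join back to the aggregated rows.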

Answer 2: (score: 0)

An alternative query without OLAP functions (only works if Customer.CustID is the PK):

with x as (
  select a.CustID
    , a.CustState
    , b.Date
    , sum(b.Balance) as Balance 
  from Customer a
  inner join Balance b on a.CustID=b.CustID
  where a.CustType in ('A','B')
  group by a.CustID
    , a.CustState
    , b.Date
)
select x.CustID
    , x.CustState
    , x.Date
    , x.Balance
    , d1.Balance as Balance42
    , d2.Balance as Balance43
from x
inner join x d1 on d1.CustID = x.CustID and d1.Date=DATE '2018-04-02'
inner join x d2 on d2.CustID = x.CustID and d2.Date=DATE '2018-04-03'