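SQL - Select a column value based on the max value in another column and a combination of values in yet another column - Teradata

Time: 2017-07-26 17:34:46

Tags: sql select teradata

My input Teradata table accnt_pln_info has the following sample data.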

Account_number   Plan_code   Plan_Date    Base_Amount     Biz_Date
ACCT1            R           2017-JAN-01         100      2017-MAY-31
ACCT1            R           2017-JAN-11          30      2017-MAY-31
ACCT1            K           2017-JAN-22          80      2017-MAY-31
ACCT1            B           2017-JAN-13          50      2017-MAY-31
ACCT1            C           2017-JAN-18         180      2017-MAY-31
ACCT2            R           2017-JAN-12          70      2017-MAY-31
ACCT2            C           2017-JAN-02          90      2017-MAY-31
ACCT2            R           2017-JAN-08          10      2017-MAY-31
ACCT2            D           2017-JAN-02          40      2017-MAY-31
ACCT2            B           2017-FEB-24          14      2017-MAY-31
ACCT2            K           2017-FEB-12          79      2017-MAY-31

Expected output (for the filter condition Biz_Date = 2017-MAY-31):

Account_number   RK_Plan_Date    RK_Base_Amount   RC_Plan_Date   RC_Base_Amount
ACCT1            2017-JAN-22          80          2017-JAN-18         180
ACCT2            2017-FEB-12          79          2017-JAN-12          70    

Logic:

Apply the filter condition Biz_Date = 2017-MAY-31, as the table has multiple distinct Biz_Dates.
Group by Account_number. For Plan_Code in (R,K),
find the max Plan_Date and then get that row's Base_Amount.
For Plan_Code in (R,C), find the max Plan_Date and
then get that row's Base_Amount.

For example: for ACCT1 and Plan_Code in ('R','K'), the max Plan_Date is 2017-JAN-22, so the Base_Amount of that row, 80, is what needs to be returned.

Assumptions:

There can be duplicates on Account_number and Plan_Code.
There will not be duplicates on Account_number, Plan_Code in (R,K) and Plan_Date.
There will not be duplicates on Account_number, Plan_Code in (R,C) and Plan_Date.
The order of the rows in the table is not guaranteed.

What I tried, which failed:

SELECT ACCOUNT_NUMBER, 
MAX(CASE WHEN PLAN_DATE IN ('R','K') THEN PLAN_DATE END) MAX_RK_PLAN_DATE,
MAX(CASE WHEN PLAN_DATE IN ('R','K') AND MAX_PLAN_DATE=PLAN_DATE THEN BASE_AMOUNT END) REQUIRED_RK_AMOUNT,
MAX(CASE WHEN PLAN_DATE IN ('R','C') THEN PLAN_DATE END) MAX_RC_PLAN_DATE,
MAX(CASE WHEN PLAN_DATE IN ('R','C') AND MAX_PLAN_DATE=PLAN_DATE THEN BASE_AMOUNT END) REQUIRED_RC_AMOUNT 
FROM ACCNT_PLN_INFO;

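As expected, it failed, because I am nesting an aggregate function inside a regular CASE expression. I thought of splitting it into blocks of data like this:
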
SELECT ....
(SELECT ACCOUNT_NUMBER, 'RK', 
MAX(PLAN_DATE) MAX_RK_PLAN_DATE FROM ACCNT_PLN_INFO WHERE 
PLAN_DATE IN ('R','K') 
UNION 
SELECT ACCOUNT_NUMBER, 'RC', 
MAX(PLAN_DATE) MAX_RC_PLAN_DATE FROM ACCNT_PLN_INFO WHERE 
PLAN_DATE IN ('R','C') )

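and then joining back to the same table in an outer select. But because of the different possible combinations (R,K) and (R,C), I could not make that work. I know how to achieve it when no combinations are involved.

For simplicity I have specified only 2 combinations of 2 values each: PLAN_CODE IN ('R','K') and PLAN_CODE IN ('R','C'). In reality there are 6 combinations, each with 4 values.

I have tried everything I could think of, but unfortunately could not get it to work. How can I select a column value when multiple value combinations and a max column value are involved? Thank you for your valuable time.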

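2 Answers:

Answer 0: (score: 0)

Edit: rewritten using QUALIFY.

You need to get the max plan date for each plan_code pairing. You can do that in two separate derived tables, using QUALIFY to keep only the row with the max plan date. Then you can join the two results together on account_number.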

select
rk.account_number,
rk_plan_date,
rk.base_amount as rk_base_amount,
rc.rc_plan_date,
rc.base_amount as rc_base_amount
from
(
select
    ACCNT_PLN_INFO.account_number,
    ACCNT_PLN_INFO.plan_date as rk_plan_date,
    base_amount
from 
    ACCNT_PLN_INFO
where
    plan_code in ('R','K')
qualify row_number() over (partition by ACCNT_PLN_INFO.account_number order by plan_date desc) = 1
) rk
inner join 
(select
    ACCNT_PLN_INFO.account_number,
    ACCNT_PLN_INFO.plan_date as rc_plan_date,
    base_amount
from 
    ACCNT_PLN_INFO
where
    plan_code in ('R','C')
qualify row_number() over (partition by ACCNT_PLN_INFO.account_number order by plan_date desc) = 1
)RC
on RK.account_number = rc.account_number
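
The question also restricts the rows to a single Biz_Date, which the query above does not do. A minimal sketch of one of the two derived tables with that filter added, assuming Biz_Date is a DATE column (the question does not state its type); the same predicate would go into the R/C derived table:

```sql
-- Latest R/K row per account for one business date.
-- Assumption: biz_date is a DATE column; adjust the literal if it is stored differently.
SELECT
    account_number,
    plan_date   AS rk_plan_date,
    base_amount AS rk_base_amount
FROM ACCNT_PLN_INFO
WHERE plan_code IN ('R','K')
  AND biz_date = DATE '2017-05-31'
QUALIFY ROW_NUMBER() OVER (PARTITION BY account_number ORDER BY plan_date DESC) = 1;
```

On the sample rows from the question this should keep the 2017-JAN-22 row (80) for ACCT1 and the 2017-FEB-12 row (79) for ACCT2, matching the RK columns of the expected output.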

Original (non-Teradata-specific syntax):

select
rk.account_number,
rk_plan_date,
rk.base_amount as rk_base_amount,
rc.rc_plan_date,
rc.base_amount as rc_base_amount
from (
    select
    ACCNT_PLN_INFO.account_number,
    ACCNT_PLN_INFO.plan_date as rk_plan_date,
    base_amount
    from 
    ACCNT_PLN_INFO
    inner join (
    select
    account_number,
    max(plan_date) as plan_date
    from
    ACCNT_PLN_INFO
    where
    plan_code in ('R','K')
    group by 1) rk
        on ACCNT_PLN_INFO.account_number = rk.account_number
        and ACCNT_PLN_INFO.plan_date = rk.plan_date
        and ACCNT_PLN_INFO.plan_code in ('R','K')
) RK
inner join (    
select
ACCNT_PLN_INFO.account_number,
ACCNT_PLN_INFO.plan_date as rc_plan_date,
base_amount
from 
ACCNT_PLN_INFO
inner join (
select
account_number,
max(plan_date) as plan_date
from
ACCNT_PLN_INFO
where
plan_code in ('R','C')
group by 1) rc
    on ACCNT_PLN_INFO.account_number = rc.account_number
    and ACCNT_PLN_INFO.plan_date = rc.plan_date
    and ACCNT_PLN_INFO.plan_code in ('C','R')
) RC
on RK.account_number = rc.account_number
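
For engines without QUALIFY, the ROW_NUMBER filter can also be written with an extra derived table instead of the MAX-and-join-back pattern. A minimal sketch for the R/K pairing only; the alias t and the column name rn are illustrative and not part of the original answer:

```sql
-- Number the R/K rows of each account from newest to oldest plan_date,
-- then keep only the newest one in the outer query.
SELECT
    t.account_number,
    t.plan_date   AS rk_plan_date,
    t.base_amount AS rk_base_amount
FROM (
    SELECT
        account_number,
        plan_date,
        base_amount,
        ROW_NUMBER() OVER (PARTITION BY account_number ORDER BY plan_date DESC) AS rn
    FROM ACCNT_PLN_INFO
    WHERE plan_code IN ('R','K')
) t
WHERE t.rn = 1;
```

This reads the table once per pairing rather than twice, and it relies on the stated assumption that Account_number and Plan_Date are unique within a pairing, so exactly one row per account gets rn = 1.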

Answer 1: (score: 0)

You can use an approach similar to the aggregation you tried by applying a dirty ol' trick: piggybacking.

You combine both columns into a single string, apply MAX, and then split the parts out again. For ACCT1, for example, combining PLAN_DATE and BASE_AMOUNT into a single string results in:

'20170101 100'
'20170111 30'
'20170113 50'
'20170118 180'
'20170122 80' -- this will be returned by MAX

After applying MAX, you extract both columns again using SUBSTRING:

   CAST(SUBSTR('2017-01-22         80', 1, 10) AS DATE)
   CAST(SUBSTR('2017-01-22         80', 11) AS INT)

Of course, you must create a string that still sorts the right way, e.g. yyyymmdd for a date and a fixed width, including leading spaces, for a number.

Now it's some Cut & Paste & Modify:

SELECT
   ACCOUNT_NUMBER,
   To_Date(Substr(RK, 1,8), 'yyyymmdd') AS MAX_RK_PLAN_DATE,
   Cast(Substring(RK From 9) AS INT) AS REQUIRED_RK_AMOUNT,
   To_Date(Substr(RC, 1,8), 'yyyymmdd') AS MAX_RC_PLAN_DATE,
   Cast(Substring(RC From 9) AS INT) AS REQUIRED_RC_AMOUNT
FROM
 (
   SELECT
      ACCOUNT_NUMBER,
      Max(CASE WHEN PLAN_code IN ('R','K') THEN To_Char(PLAN_DATE, 'yyyymmdd') || BASE_AMOUNT END) AS RK,
      Max(CASE WHEN PLAN_code IN ('R','C') THEN To_Char(PLAN_DATE, 'yyyymmdd') || BASE_AMOUNT END) AS RC
   FROM ACCNT_PLN_INFO
   WHERE biz_date = DATE '2017-05-31'
   GROUP BY 1
 ) AS dt
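
The question mentions six combinations in practice. With the piggyback approach from this answer, each further combination is one more MAX(CASE ...) column over the same single pass of the table. A minimal sketch for one extra pairing; ('K','C') and the alias KC are hypothetical and chosen only for illustration, and BASE_AMOUNT is assumed to be an INTEGER so that its implicit conversion to a string has a fixed width:

```sql
SELECT
   ACCOUNT_NUMBER,
   To_Date(Substr(KC, 1, 8), 'yyyymmdd') AS MAX_KC_PLAN_DATE,
   Cast(Substring(KC From 9) AS INT)     AS REQUIRED_KC_AMOUNT
FROM
 (
   SELECT
      ACCOUNT_NUMBER,
      -- Hypothetical combination ('K','C'): the date goes first so that MAX
      -- orders by date, and the amount rides along behind it.
      Max(CASE WHEN PLAN_code IN ('K','C') THEN To_Char(PLAN_DATE, 'yyyymmdd') || BASE_AMOUNT END) AS KC
   FROM ACCNT_PLN_INFO
   WHERE biz_date = DATE '2017-05-31'
   GROUP BY 1
 ) AS dt;
```

In a real query the new column would sit next to the existing RK and RC columns in the same derived table, so the table is still scanned only once however many combinations are added.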