Join

Time: 2015-11-05 00:07:26

Tags: mysql sql sql-server oracle hive

I need help solving this particular SQL problem. I can't write a stored procedure, because I need to port this to Hive.

There are two tables, Contr and Lvl. I need to join them and fill the null LVL values in the joined table using the value from the previous row.

Sample tables:
Contr
|      id |     EFF_DT | M_NBR | ACTY_SEQ_NBR | L_CD |
|---------|------------|-------|--------------|------|
| QQFAE46 | 2000-12-24 |    11 |            1 |  POT |
| QQFAE46 | 2000-12-24 |    11 |            2 |  POT |
| QQFAE46 | 2000-12-24 |    11 |            3 |  POT |
| QCC5433 | 2013-04-21 |    00 |            1 |  MIC |
| QCC5433 | 2013-04-21 |    00 |            2 |  MIC |
| QCC614E | 2015-07-18 |    00 |            1 |  MIC |
| QCC614E | 2015-07-18 |    00 |            4 |  MIC |
| QC56DDF | 1999-10-01 |    14 |            2 |  POT |
| QC56DDF | 1999-10-01 |    14 |            3 |  POT |
| QC56DDF | 1999-10-01 |    14 |            4 |  POT |
| ACB3DC2 | 1999-10-01 |    14 |            1 |  POT |

LVL
|      id |     EFF_DT | M_NBR | ACTY_SEQ_NBR | OCCR |
|---------|------------|-------|--------------|------|
| QQFAE46 | 2000-12-24 |    11 |            1 |  100 |
| QQFAE46 | 2000-12-24 |    11 |            3 |  100 |
| QCC5433 | 2013-04-21 |    00 |            2 |  200 |
| QCC614E | 2015-07-18 |    00 |            3 |  200 |
| QC56DDF | 1999-10-01 |    14 |            1 |    0 |

LEFT JOIN of Contr and Lvl

|      id |     EFF_DT | M_NBR | ACTY_SEQ_NBR | L_CD |      id |     EFF_DT |  M_NBR | ACTY_SEQ_NBR |   OCCR |
|---------|------------|-------|--------------|------|---------|------------|--------|--------------|--------|
| QQFAE46 | 2000-12-24 |    11 |            1 |  POT | QQFAE46 | 2000-12-24 |     11 |            1 |    100 |
| QQFAE46 | 2000-12-24 |    11 |            2 |  POT |  (null) |     (null) | (null) |       (null) | (null) |
| QQFAE46 | 2000-12-24 |    11 |            3 |  POT | QQFAE46 | 2000-12-24 |     11 |            3 |    100 |
| QCC5433 | 2013-04-21 |    00 |            1 |  MIC |  (null) |     (null) | (null) |       (null) | (null) |
| QCC5433 | 2013-04-21 |    00 |            2 |  MIC | QCC5433 | 2013-04-21 |     00 |            2 |    200 |
| QCC614E | 2015-07-18 |    00 |            1 |  MIC |  (null) |     (null) | (null) |       (null) | (null) |
| QCC614E | 2015-07-18 |    00 |            4 |  MIC |  (null) |     (null) | (null) |       (null) | (null) |
| QC56DDF | 1999-10-01 |    14 |            2 |  POT |  (null) |     (null) | (null) |       (null) | (null) |
| QC56DDF | 1999-10-01 |    14 |            3 |  POT |  (null) |     (null) | (null) |       (null) | (null) |
| QC56DDF | 1999-10-01 |    14 |            4 |  POT |  (null) |     (null) | (null) |       (null) | (null) |
| ACB3DC2 | 1999-10-01 |    14 |            1 |  POT |  (null) |     (null) | (null) |       (null) | (null) |

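Now I need to fill in the null ACTY_SEQ_NBR values from the LVL table. The criterion is: take the corresponding ACTY_SEQ_NBR from CONTR (i.e. the 4th column of the joined table) and find the ACTY_SEQ_NBR from LVL that is less than or equal to the CONTR ACTY_SEQ_NBR, for the same id, eff_dt and m_nbr.

For example, row #2 has a null ACTY_SEQ_NBR. The corresponding Contr ACTY_SEQ_NBR is 2, and the LVL ACTY_SEQ_NBR value less than 2 is 1.

So my ideal output for that row should look like this: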

|      id |     EFF_DT | M_NBR | ACTY_SEQ_NBR | L_CD |      id |     EFF_DT |  M_NBR | ACTY_SEQ_NBR |   OCCR |
|---------|------------|-------|--------------|------|---------|------------|--------|--------------|--------|
| QQFAE46 | 2000-12-24 |    11 |            1 |  POT | QQFAE46 | 2000-12-24 |     11 |            1 |    100 |
| QQFAE46 | 2000-12-24 |    11 |            2 |  POT |  (null) |     (null) | (null) |            1 | (null) |
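
The fill rule described above can be sketched outside SQL to make the intent concrete. This is only an illustration, not a Hive solution. It assumes that "the LVL value less than or equal to the Contr value" means the greatest such value, which is an inference from the expected output further down (for example, QCC614E with Contr ACTY_SEQ_NBR 4 maps to 3). The function name `fill_lvl_seq` is hypothetical.

```python
from typing import Optional

# LVL rows from the question: (id, eff_dt, m_nbr, acty_seq_nbr)
LVL = [
    ("QQFAE46", "2000-12-24", "11", 1),
    ("QQFAE46", "2000-12-24", "11", 3),
    ("QCC5433", "2013-04-21", "00", 2),
    ("QCC614E", "2015-07-18", "00", 3),
    ("QC56DDF", "1999-10-01", "14", 1),
]


def fill_lvl_seq(cid: str, eff_dt: str, m_nbr: str, contr_seq: int) -> Optional[int]:
    """Return the greatest LVL ACTY_SEQ_NBR <= contr_seq for the same
    (id, eff_dt, m_nbr), or None when LVL has no such row.

    "Greatest" is an assumption inferred from the expected output.
    """
    candidates = [
        seq
        for (lid, l_eff_dt, l_m_nbr, seq) in LVL
        if (lid, l_eff_dt, l_m_nbr) == (cid, eff_dt, m_nbr) and seq <= contr_seq
    ]
    return max(candidates) if candidates else None


# Row #2 of the joined table: Contr ACTY_SEQ_NBR 2 falls back to LVL value 1.
print(fill_lvl_seq("QQFAE46", "2000-12-24", "11", 2))
```

Rows with no qualifying LVL value come back as `None` here; the expected output below uses -99 for those.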

I tried a LAG query, but it's not giving the correct output for all values.

I ran Amnider's query and edited the result to show what I expect. These are my expected values:

|      ID |     EFF_DT | M_NBR | ACTY_SEQ_NBR | L_CD |  LVL_ID | LVL_EFF_DT | LVL_M_NBR | LVL_ACTY_SEQ_NBR |   OCCR | CALC_LVL_ACTY_SEQ_NBR |
|---------|------------|-------|--------------|------|---------|------------|-----------|------------------|--------|-----------------------|
| QQFAE46 | 2000-12-24 |    11 |            1 |  POT | QQFAE46 | 2000-12-24 |        11 |                1 |    100 |                     1 |
| QQFAE46 | 2000-12-24 |    11 |            2 |  POT |  (null) |     (null) |    (null) |           (null) | (null) |                     1 |
| QQFAE46 | 2000-12-24 |    11 |            3 |  POT | QQFAE46 | 2000-12-24 |        11 |                3 |    100 |                     3 |
| QC56DDF | 1999-10-01 |    14 |            2 |  POT |  (null) |     (null) |    (null) |           (null) | (null) |                     1 |
| QC56DDF | 1999-10-01 |    14 |            3 |  POT |  (null) |     (null) |    (null) |           (null) | (null) |                     1 |
| QC56DDF | 1999-10-01 |    14 |            4 |  POT |  (null) |     (null) |    (null) |           (null) | (null) |                     1 |
| QCC5433 | 2013-04-21 |    00 |            1 |  MIC |  (null) |     (null) |    (null) |           (null) | (null) |                   -99 |
| QCC5433 | 2013-04-21 |    00 |            2 |  MIC | QCC5433 | 2013-04-21 |        00 |                2 |    200 |                     2 |
| QCC614E | 2015-07-18 |    00 |            1 |  MIC |  (null) |     (null) |    (null) |           (null) | (null) |                   -99 |
| QCC614E | 2015-07-18 |    00 |            4 |  MIC |  (null) |     (null) |    (null) |           (null) | (null) |                     3 |
| ACB3DC2 | 1999-10-01 |    14 |            1 |  POT |  (null) |     (null) |    (null) |           (null) | (null) |                   -99 |

Any help is appreciated.

4 Answers:

Answer 0 (score: 1)

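Thanks for the more extensive example! This might be doable with a single full outer join, but I think the aggregation and filtering would get messy. The simplest option is to join to lvl twice: first to find the "previous acty_seq_nbr", and then again as you already did (but using coalesce to fall back on the "previous acty_seq_nbr" when no matching acty_seq_nbr is found):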

SELECT  c.id,c.eff_dt,c.m_nbr,c.acty_seq_nbr, 
        l.id,l.eff_dt,l.m_nbr,
        coalesce(l.acty_seq_nbr, prev_acty_seq_nbr, -99) l_acty_seq_nbr
from 
    (       
    select c.id,c.eff_dt,c.m_nbr,c.acty_seq_nbr, 
           MAX(L.acty_seq_nbr) prev_acty_seq_nbr
    from   contr c 
    left join lvl l 
        on 
            c.id=l.id 
            and c.eff_dt=l.eff_dt 
            and c.m_nbr=l.m_nbr 
            and c.acty_seq_nbr>l.acty_seq_nbr
    GROUP BY   
        c.id,c.eff_dt,c.m_nbr,c.acty_seq_nbr
    ) c
left join lvl l 
    on 
        c.id=l.id 
        and c.eff_dt=l.eff_dt 
        and c.m_nbr=l.m_nbr     
        and c.acty_seq_nbr=l.acty_seq_nbr;

Fiddle: http://www.sqlfiddle.com/#!6/1270f/74/0

Result:

|      id |     eff_dt | m_nbr | acty_seq_nbr |      id |     eff_dt |  m_nbr | l_acty_seq_nbr |
|---------|------------|-------|--------------|---------|------------|--------|----------------|
| AAFAE46 | 2000-12-24 |    11 |            1 | AAFAE46 | 2000-12-24 |     11 |              1 |
| AAFAE46 | 2000-12-24 |    11 |            2 |  (null) |     (null) | (null) |              1 |
| AAFAE46 | 2000-12-24 |    11 |            3 | AAFAE46 | 2000-12-24 |     11 |              3 |
| AB56DDF | 1999-10-01 |    14 |            2 |  (null) |     (null) | (null) |              1 |
| AB56DDF | 1999-10-01 |    14 |            3 |  (null) |     (null) | (null) |              1 |
| AB56DDF | 1999-10-01 |    14 |            4 |  (null) |     (null) | (null) |              1 |
| ABC5433 | 2013-04-21 |    00 |            1 |  (null) |     (null) | (null) |            -99 |
| ABC5433 | 2013-04-21 |    00 |            2 | ABC5433 | 2013-04-21 |     00 |              2 |
| ABC614E | 2015-07-18 |    00 |            1 |  (null) |     (null) | (null) |            -99 |
| ABC614E | 2015-07-18 |    00 |            4 |  (null) |     (null) | (null) |              3 |
| ACB3DC2 | 1999-10-01 |    14 |            1 |  (null) |     (null) | (null) |            -99 |            
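
To check the double-join approach without the fiddle, here is a minimal sketch that runs essentially the same query against the sample data in an in-memory SQLite database, driven from Python. SQLite is only a stand-in here, not one of the tagged engines. The explicit `AS` aliases in the subquery and the `ORDER BY` are additions for portability and deterministic output, and only three columns are selected to keep the result short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contr (id TEXT, eff_dt TEXT, m_nbr TEXT, acty_seq_nbr INTEGER, l_cd TEXT);
CREATE TABLE lvl   (id TEXT, eff_dt TEXT, m_nbr TEXT, acty_seq_nbr INTEGER, occr INTEGER);
INSERT INTO contr VALUES
  ('AAFAE46','2000-12-24','11',1,'POT'), ('AAFAE46','2000-12-24','11',2,'POT'),
  ('AAFAE46','2000-12-24','11',3,'POT'), ('ABC5433','2013-04-21','00',1,'MIC'),
  ('ABC5433','2013-04-21','00',2,'MIC'), ('ABC614E','2015-07-18','00',1,'MIC'),
  ('ABC614E','2015-07-18','00',4,'MIC'), ('AB56DDF','1999-10-01','14',2,'POT'),
  ('AB56DDF','1999-10-01','14',3,'POT'), ('AB56DDF','1999-10-01','14',4,'POT'),
  ('ACB3DC2','1999-10-01','14',1,'POT');
INSERT INTO lvl VALUES
  ('AAFAE46','2000-12-24','11',1,100), ('AAFAE46','2000-12-24','11',3,100),
  ('ABC5433','2013-04-21','00',2,200), ('ABC614E','2015-07-18','00',3,200),
  ('AB56DDF','1999-10-01','14',1,0);
""")

rows = conn.execute("""
SELECT c.id, c.acty_seq_nbr,
       COALESCE(l.acty_seq_nbr, c.prev_acty_seq_nbr, -99) AS l_acty_seq_nbr
FROM (
    -- first join: greatest lvl value strictly below the contr value
    SELECT c.id AS id, c.eff_dt AS eff_dt, c.m_nbr AS m_nbr,
           c.acty_seq_nbr AS acty_seq_nbr,
           MAX(l.acty_seq_nbr) AS prev_acty_seq_nbr
    FROM contr c
    LEFT JOIN lvl l
      ON c.id = l.id AND c.eff_dt = l.eff_dt AND c.m_nbr = l.m_nbr
     AND c.acty_seq_nbr > l.acty_seq_nbr
    GROUP BY c.id, c.eff_dt, c.m_nbr, c.acty_seq_nbr
) c
-- second join: exact match, as in the original left join
LEFT JOIN lvl l
  ON c.id = l.id AND c.eff_dt = l.eff_dt AND c.m_nbr = l.m_nbr
 AND c.acty_seq_nbr = l.acty_seq_nbr
ORDER BY c.id, c.acty_seq_nbr
""").fetchall()

for row in rows:
    print(row)
```

The printed `l_acty_seq_nbr` values should line up with the last column of the result table above.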

Answer 1 (score: 1)

Try this:

SELECT
     ID, EFF_DT, M_NBR, ACTY_SEQ_NBR, L_CD, LVL_ID, LVL_EFF_DT, LVL_M_NBR, LVL_ACTY_SEQ_NBR, OCCR
   ,COALESCE(CASE WHEN LVL_ACTY_SEQ_NBR IS NULL THEN COALESCE(LAG(ACTY_SEQ_NBR) OVER (PARTITION BY ID, EFF_DT, M_NBR ORDER BY ACTY_SEQ_NBR),ACTY_SEQ_NBR) ELSE LVL_ACTY_SEQ_NBR END,'NA') LVL_NMBR
    FROM(
    SELECT A.ID, A.EFF_DT, A.M_NBR, A.ACTY_SEQ_NBR, A.L_CD
    , B.ID LVL_ID, B.EFF_DT LVL_EFF_DT, B.M_NBR LVL_M_NBR, B.ACTY_SEQ_NBR LVL_ACTY_SEQ_NBR , B.OCCR
    FROM EDWT.CONTR A
    LEFT JOIN EDWT.LVL B
    ON A.ID = B.ID AND A.ACTY_SEQ_NBR = B.ACTY_SEQ_NBR and a.eff_dt=b.eff_dt and a.m_nbr=b.m_nbr ) A;

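Also, please help with the valid value for the following row:

ID EFF_DT M_NBR ACTY_SEQ_NBR L_CD LVL_ID LVL_EFF_DT LVL_M_NBR LVL_ACTY_SEQ_NBR OCCR LVL_NMBR

ABC5433 2013-04-21 00 1 IMC 1

What do you expect lvl_acty_seq_nbr to be for ABC614E & ABC5433? ABC614E has no acty_seq_nbr in the lvl table, and has 1 & 4 in contr. Please provide the expected output for the above IDs. Do you want the lowest value for the same ID, or the lowest value across all IDs?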

Answer 2 (score: 0)

I hope I understood your question. Try this query:

 SELECT
 ID, EFF_DT, M_NBR, ACTY_SEQ_NBR, L_CD, LVL_ID, LVL_EFF_DT, LVL_M_NBR, LVL_ACTY_SEQ_NBR, OCCR
 --,LAG(ACTY_SEQ_NMBR) OVER (PARTITION BY ID, EFF_DT, M_NBR ORDER BY ACTY_SEQ_NMBR)
 ,COALESCE(CASE WHEN LVL_ACTY_SEQ_NBR IS NULL THEN LAG(ACTY_SEQ_NBR) OVER (PARTITION BY ID, EFF_DT, M_NBR ORDER BY ACTY_SEQ_NBR) ELSE LVL_ACTY_SEQ_NBR END,'NA') CALC_LVL_ACTY_SEQ_NBR    FROM(
SELECT A.ID, A.EFF_DT, A.M_NBR, A.ACTY_SEQ_NBR, A.L_CD
, B.ID LVL_ID, B.EFF_DT LVL_EFF_DT, B.M_NBR LVL_M_NBR, B.ACTY_SEQ_NBR LVL_ACTY_SEQ_NBR , B.OCCR
FROM EDWT.CONTR A
LEFT JOIN EDWT.LVL B
ON A.ID = B.ID AND A.ACTY_SEQ_NBR = B.ACTY_SEQ_NBR and a.eff_dt=b.eff_dt and a.m_nbr=b.m_nbr ) A;
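
One thing worth checking with the LAG-based queries: LAG(ACTY_SEQ_NBR) is taken over Contr's own sequence numbers, so it returns the previous Contr row's value rather than a value from LVL. The sketch below compares the two on a cut-down version of the sample data, using SQLite from Python as a stand-in engine. Assumptions: SQLite 3.25 or later (for window functions), and only `id` is kept as the key, with eff_dt and m_nbr dropped for brevity. The correlated subquery is an illustrative reference column, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contr (id TEXT, acty_seq_nbr INTEGER);
CREATE TABLE lvl   (id TEXT, acty_seq_nbr INTEGER);
INSERT INTO contr VALUES
  ('ABC614E', 1), ('ABC614E', 4),
  ('AB56DDF', 2), ('AB56DDF', 3), ('AB56DDF', 4);
INSERT INTO lvl VALUES ('ABC614E', 3), ('AB56DDF', 1);
""")

lag_rows = conn.execute("""
SELECT a.id,
       a.acty_seq_nbr,
       -- previous Contr value within the same id
       LAG(a.acty_seq_nbr) OVER (PARTITION BY a.id ORDER BY a.acty_seq_nbr) AS lag_seq,
       -- greatest LVL value not exceeding the Contr value (reference)
       (SELECT MAX(b.acty_seq_nbr)
          FROM lvl b
         WHERE b.id = a.id
           AND b.acty_seq_nbr <= a.acty_seq_nbr) AS lvl_seq
FROM contr a
ORDER BY a.id, a.acty_seq_nbr
""").fetchall()

for row in lag_rows:
    print(row)
```

For ABC614E with ACTY_SEQ_NBR 4, the LAG column yields 1 (the previous Contr row) while the LVL-based column yields 3, which is the value the question expects. That may be why the asker saw incorrect output for some rows.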

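Answer 3 (score: 0)

It's not clear to me how you want to handle the examples beyond the two rows you provided... If you could fill in the blanks for the rest of the data, that would help.

In the meantime, here's a crack at it: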

with foo as (
  select
    c.id as cid, c.eff_dt as c_eff_dt,
    c.m_nbr as c_m_nbr, c.acty_seq_nbr as c_acty_seq_nbr,
    l.id as lid, l.eff_dt as l_eff_dt, l.m_nbr as l_m_nbr,
    l.acty_seq_nbr as l_acty_seq_nbr,
    sum (case when l.id is null then 0 else 1 end)
      over (partition by c.id order by c.acty_seq_nbr) as idx
  from
    contr c
    left join lvl l on
      c.id=l.id and 
      c.eff_dt=l.eff_dt and 
      c.m_nbr=l.m_nbr and 
      c.acty_seq_nbr=l.acty_seq_nbr
)
select
  cid, c_eff_dt, c_m_nbr, c_acty_seq_nbr,
  lid, l_eff_dt, l_m_nbr, l_acty_seq_nbr,
  min (c_acty_seq_nbr) over (partition by cid, idx) as acty_seq_nbr
from foo