GROUP BY involving CLOB data

Date: 2017-02-23 17:17:33

Tags: sql oracle group-by clob

I have a join across three tables: test_3, test_2, and test_1.

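test_1 and test_3 are the main tables and have no column in common; test_2 links them. The relevant columns are test_1 (sr_id, last_updated_date), test_2 (sr_id, sm_id), and test_3 (sm_id, sql_statement). test_3 holds the CLOB data that is causing all the trouble.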

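I have to find the latest sm_id associated with an sr_id. My idea was to use the aggregate function max(last_updated_date) and group by it. That did not work, for several reasons.

  1. The result includes CLOB data in the sql_statement column.

  2. I used a join that I am not familiar with.

  3. Any ideas would help.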

    WITH xx as (
        (select  ANSWER ,sr_id AS ID from test 
        WHERE Q_ID in (SELECT Q_ID FROM test_2 WHERE field_id='LM_LRE_Q6')
        ) 
    )
    -- end of source data
    
    
    SELECT t.ID,
           t1.n,
           t1.SM_ID,
           seg_dtls.SEGMENTation_NAME,
           to_char(mst.LAST_UPDATED_DATE,'dd-mon-yyyy hh24:mi:ss'),
           seg_dtls.sql_statement
    FROM xx t
    CROSS JOIN LATERAL (
            select LEVEL AS n, regexp_substr( t.answer, '\d+',  1, level) as SM_ID
            from dual
            connect by regexp_substr( t.answer, '\d+',  1, level) IS NOT NULL
    ) t1
    left join test_1 mst 
    on mst.sr_id=t.id
    right join test_3 seg_dtls
    on seg_dtls.sm_id=t1.sm_id;
    

The sample data looks like this:

    sr_id   sm_id SEGMENTATION_NAME  LAST_UPDATED_DATE  
    1108197 958   test_not_in          05-feb-2017 23:56:59    
    1108217 958   test_not_in          14-feb-2017 00:37:39  
    1108218 958   test_not_in          14-feb-2017 01:39:50  
    1108220 958   test_not_in          14-feb-2017 03:39:07  
    

and the expected output is:

    1108220 958   test_not_in          14-feb-2017 03:39:07  
    

I am not posting the CLOB data because it is large. Every row contains CLOB data.
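
One possible way around the restriction, sketched here under assumptions: Oracle rejects a CLOB column in GROUP BY, so the aggregation can be done on the non-CLOB columns first and the CLOB joined back afterwards. The CTE names updates, segments, and latest are hypothetical, the rows are the sample rows shown above, and the CLOB is replaced by a short placeholder.

```sql
-- Hypothetical inline data standing in for the real tables.
WITH updates (sr_id, sm_id, last_updated_date) AS (
    SELECT 1108197, 958, TO_DATE('2017-02-05 23:56:59', 'yyyy-mm-dd hh24:mi:ss') FROM dual UNION ALL
    SELECT 1108217, 958, TO_DATE('2017-02-14 00:37:39', 'yyyy-mm-dd hh24:mi:ss') FROM dual UNION ALL
    SELECT 1108218, 958, TO_DATE('2017-02-14 01:39:50', 'yyyy-mm-dd hh24:mi:ss') FROM dual UNION ALL
    SELECT 1108220, 958, TO_DATE('2017-02-14 03:39:07', 'yyyy-mm-dd hh24:mi:ss') FROM dual
),
segments (sm_id, segmentation_name, sql_statement) AS (
    -- The placeholder text stands in for the large CLOB.
    SELECT 958, 'test_not_in', TO_CLOB('placeholder CLOB text') FROM dual
),
latest AS (
    -- Aggregate only the non-CLOB columns: one row per sm_id,
    -- carrying the sr_id that belongs to the newest update.
    SELECT sm_id,
           MAX(sr_id) KEEP (DENSE_RANK LAST ORDER BY last_updated_date) AS sr_id,
           MAX(last_updated_date) AS last_updated_date
    FROM updates
    GROUP BY sm_id
)
-- Join the CLOB back in after the grouping is done.
SELECT l.sr_id,
       l.sm_id,
       s.segmentation_name,
       TO_CHAR(l.last_updated_date, 'dd-mon-yyyy hh24:mi:ss') AS last_updated_date,
       s.sql_statement
FROM latest l
JOIN segments s ON s.sm_id = l.sm_id;
```

With the sample rows, this should leave only the 1108220 row for sm_id 958. If two updates share the same newest date, MAX(sr_id) picks the larger sr_id among them.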

Table test_3 contains:
    q_id     sr_id  answer   
    1009330 1108246 976~feb_24^941~Test_regionwithcountry  
    1009330 1108247 941~Test_regionwithcountry_2016^787~Test_Request_28^976~feb_24  
    1009330 1108239 972~test_emea  
    1009330 1108240 972~test_emea^827~test_with_region_country  
    1009330 1108251 981~MSE100579729 testing.
    

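The sample data for test_3 looks like the above. The answer column contains the sm_id values, and I have to extract them from there.
For example: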

    941~Test_regionwithcountry_2016^787~Test_Request_28^976~feb_24  
    the sm_id is 941,787,976 
    

That is how I arrived at the query posted above. As for the left and right joins: I need all the sm_id values from test_3, so I used a right join here.
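
The extraction step can be sketched in isolation. This assumes each answer is a '^'-separated list of sm_id~name pairs, as in the example; the CTE name sample_answer is hypothetical. Note that scanning the whole answer with a plain '\d+' would also pick up digits inside the names (for instance the 2016 in Test_regionwithcountry_2016), so this sketch splits on '^' first and then takes only the leading digits of each piece.

```sql
-- Hypothetical single-row input taken from the example above.
WITH sample_answer (answer) AS (
    SELECT '941~Test_regionwithcountry_2016^787~Test_Request_28^976~feb_24' FROM dual
)
SELECT LEVEL AS n,
       -- Inner call: the LEVEL-th piece between '^' separators.
       -- Outer call: the digits at the start of that piece.
       REGEXP_SUBSTR(REGEXP_SUBSTR(answer, '[^^]+', 1, LEVEL), '^\d+') AS sm_id
FROM sample_answer
CONNECT BY REGEXP_SUBSTR(answer, '[^^]+', 1, LEVEL) IS NOT NULL;
```

For the example string this is meant to produce one row each for 941, 787, and 976. With more than one input row, the CONNECT BY needs extra conditions to keep the rows from multiplying against each other.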

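edit1: The accepted answer gives the SR_ID of each segment that has max(last_updated_date).
I need all the SR_IDs. So I used the MINUS operator to get the ones that do not have max(last_updated_date). I need to append that result set to the output of the accepted answer.

This is what I did to get the other SR_IDs.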

    select sr_id,segmentation_name,request_status from (with test_31 (q_id, sr_id, answer) as (
     (SELECT Q_ID,SR_ID,ANSWER FROM test_3 WHERE Q_ID=(SELECT Q_ID FROM test_4 WHERE FIELD_ID='LM_LRE_Q6'))
    ),
    answer_extraction as (
      select q_id, sr_id,
        regexp_substr(regexp_substr(answer, '[^^]+', 1, level),'\d+') as sm_id
      from test_31
      connect by q_id = prior q_id
      and sr_id = prior sr_id
      and prior dbms_random.value is not null
      and regexp_substr(answer, '[^^]+', 1, level) is not null
    )
    select sr_id,
      sm_id,
      segmentation_name,
      LAST_UPDATED_DATE,
      sql_statement,request_status
    from (
      select t1.sr_id,
        t2.sm_id,
        t2.segmentation_name,
        t1.last_updated_date,
        t2.sql_statement,
        t1.request_status
    
      from test_4 t4
      join answer_extraction t3 on t3.q_id = t4.q_id
      join test_2 t2 on t2.sm_id = t3.sm_id
      join test_1 t1 on t1.sr_id = t3.sr_id
    )
    )
    minus
    
    (select  sr_id,segmentation_name , request_status from (with test_31 (q_id, sr_id, answer) as (
     (SELECT Q_ID,SR_ID,ANSWER FROM test_3 WHERE Q_ID=(SELECT Q_ID FROM test_4 WHERE FIELD_ID='LM_LRE_Q6'))
    ),
    answer_extraction as (
      select q_id, sr_id,
        regexp_substr(regexp_substr(answer, '[^^]+', 1, level), '\d+') as sm_id
      from test_31
      connect by q_id = prior q_id
      and sr_id = prior sr_id
      and prior dbms_random.value is not null
      and regexp_substr(answer, '[^^]+', 1, level) is not null
    )
    select sr_id,
      segmentation_name,
      sql_statement,
       request_status
    from (
      select t1.sr_id,
        t2.sm_id,
        t2.segmentation_name,
        t1.last_updated_date,
        t2.sql_statement,
         t1.request_status,
        max(t1.last_updated_date) over (partition by t2.sm_id) as max_updated_date
      from test_4 t4
      join answer_extraction t3 on t3.q_id = t4.q_id
      join test_2 t2 on t2.sm_id = t3.sm_id
      join test_1 t1 on t1.sr_id = t3.sr_id
    )
    where last_updated_date = max_updated_date));
    


Sample data:
The accepted answer gives the following output, for the segment's max(last_updated_date).

    1097661 Submitted   o2k lad 30-NOV-15   01-DEC-16   62  CLOB DATA  
    

The query posted above gives the output below: the sr_ids of the same segment that have other update dates.

    1097621 o2k lad Submitted
    1097625 o2k lad Submitted
    1097627 o2k lad Submitted
    1097632 o2k lad Submitted
    1097633 o2k lad Submitted
    1097658 o2k lad Pending
    1097640 o2k lad Submitted
    1097644 o2k lad Submitted
    1097646 o2k lad Submitted
    

Expected output:

      sr_id status     segment_name updated_date sql_statement other_sr_id
    1097661 Submitted   o2k lad     30-NOV-15     CLOB DATA 1097618,1097621,1097625,1097627,1097632,1097633,1097658,1097640,1097644,1097646
    

I want to combine the two queries so that the last column contains all the older sr_ids.
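
One direction that might avoid MINUS entirely, sketched under assumptions: rank the rows per segment with ROW_NUMBER, collect the sr_ids of the non-latest rows with the analytic form of LISTAGG, and keep only the latest row. The CTE names joined, ranked, and collected are hypothetical; joined stands in for the result of the real joins, filled here with the earlier sample rows for sm_id 958 and a placeholder instead of the real CLOB.

```sql
-- Hypothetical stand-in for the joined rows of the real query.
WITH joined (sr_id, sm_id, segmentation_name, last_updated_date) AS (
    SELECT 1108197, 958, 'test_not_in', TO_DATE('2017-02-05 23:56:59', 'yyyy-mm-dd hh24:mi:ss') FROM dual UNION ALL
    SELECT 1108217, 958, 'test_not_in', TO_DATE('2017-02-14 00:37:39', 'yyyy-mm-dd hh24:mi:ss') FROM dual UNION ALL
    SELECT 1108218, 958, 'test_not_in', TO_DATE('2017-02-14 01:39:50', 'yyyy-mm-dd hh24:mi:ss') FROM dual UNION ALL
    SELECT 1108220, 958, 'test_not_in', TO_DATE('2017-02-14 03:39:07', 'yyyy-mm-dd hh24:mi:ss') FROM dual
),
ranked AS (
    -- rn = 1 marks the most recently updated row of each segment.
    SELECT j.*,
           TO_CLOB('placeholder CLOB text') AS sql_statement,
           ROW_NUMBER() OVER (PARTITION BY j.sm_id
                              ORDER BY j.last_updated_date DESC) AS rn
    FROM joined j
),
collected AS (
    -- LISTAGG skips NULLs, so the CASE leaves out the latest sr_id.
    SELECT r.*,
           LISTAGG(CASE WHEN r.rn > 1 THEN r.sr_id END, ',')
               WITHIN GROUP (ORDER BY r.sr_id)
               OVER (PARTITION BY r.sm_id) AS other_sr_id
    FROM ranked r
)
SELECT sr_id, segmentation_name, last_updated_date, sql_statement, other_sr_id
FROM collected
WHERE rn = 1;
```

No CLOB is grouped or compared here, which is the point of using analytic functions. Two caveats: LISTAGG output is limited by the VARCHAR2 size (4000 bytes by default), and ROW_NUMBER picks one row arbitrarily when several rows share the newest date.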

0 answers:

No answers yet