SQL syntax to aggregate records based on field criteria and output a total record count

Time: 2016-09-20 19:22:04

Tags: sql syntax teradata

My SQL query returns a table of results like this:

[image: query result table]

The SQL I am running to generate this table is as follows:

SELECT DISTINCT 
    ld.LOAD_ID, lf.ACTUAL_DELIVERY_TS, ld.ORIG_LOC_ID, 
    fsd.PO_TYPE_CODE, fsf.CASE_QTY, 
    CASE 
       WHEN ORIG_LOC_ID IN (6903, 6909, 6912, 7100, 7101, 7183, 7184, 7837, 7840, 7976) 
          THEN 'Centerpoint'
       WHEN ORIG_LOC_ID IN (6061, 6088, 6060, 7042, 7078, 7085, 7086, 7084, 7089, 7093, 7094, 4892, 7628) 
          THEN 'Imports'
       WHEN ORIG_LOC_ID IN (8092, 8098, 9153, 9193, 9195, 9196) 
          THEN 'Returns'
       WHEN ORIG_LOC_ID IN (6005, 6007, 6008, 6014, 6022, 6029, 6041) 
          THEN 'Fashion'
       WHEN ORIG_LOC_ID IN (7006, 7356, 6280, 8240, 8103, 7853, 7005) 
          THEN 'eCom'
       WHEN ORIG_LOC_ID IN (6006, 6009, 6010, 6011, 6012, 6016, 6017, 6018, 6019, 6020, 6021, 6023, 6024, 6025, 6026, 6027, 6030, 6031, 6035, 6036, 6037, 6038, 6039, 6040, 6043, 6048, 6054, 6066, 6068, 6069, 6070, 6080, 6092, 6094, 7026, 7033, 7034, 7035, 7036, 7038, 7039, 7045) 
          THEN 'Regional'
       WHEN ORIG_LOC_ID IN (6042, 6047, 6055, 6056, 6057, 6059, 6062, 6064, 6065, 6071, 6072, 6073, 6074, 6077, 6082, 6083, 6084, 6085, 6090, 6091, 6095, 6096, 6097, 6099, 7010, 7012, 7013, 7014, 7015, 7016, 7017, 7018, 7019, 7021, 7023, 7024, 7025, 7030, 7047, 7048, 7053, 7055, 7068, 7070, 7077, 7079) 
          THEN 'Grocery'
       WHEN ORIG_LOC_TYPE_CODE IN ('VNDR') 
          THEN 'VNDR'
       WHEN ORIG_LOC_TYPE_CODE IN ('STORE') 
          THEN 'STORE'
    END AS LOADORIG, 
    CASE
       WHEN PO_TYPE_CODE IN (23, 33, 3, 53, 45, 73, 93) THEN 'DA'
       WHEN PO_TYPE_CODE IN (20, 22, 28, 29, 40, 42, 50, 83) THEN 'SS'
       WHEN PO_TYPE_CODE IN (10, 11, 12, 13, 14, 15, 16, 17, 18, 19) THEN 'Fashion'
       WHEN PO_TYPE_CODE IN (43) THEN 'XDOCK'
    END AS CHANNEL, 
    ld.DEST_LOC_ID
FROM 
    us_trans_dm_vm.LOAD_DIM ld, 
    us_trans_dm_vm.LOAD_FACT lf, 
    us_trans_dm_vm.FREIGHT_SHIPMENT_FACT fsf, 
    us_trans_dm_vm.FREIGHT_SHIPMENT_DIM fsd
WHERE 
    ld.LOAD_SK_ID = lf.LOAD_SK_ID
    AND lf.LOAD_SK_ID = fsf.LOAD_SK_ID
    AND fsf.SHIPMENT_SK_ID = fsd.SHIPMENT_SK_ID
    AND fsd.CURRENT_IND = 'Y'
    AND ld.CURRENT_IND = 'Y'
    AND ld.DEST_LOC_ID IN (6006, 6009, 6010, 6011, 6012, 6016, 6017, 6018, 6019, 6020, 6021, 6023, 6024, 6025, 6026, 6027, 6030, 6031, 6035, 6036, 6037, 6038, 6039, 6040, 6043, 6048, 6054, 6066, 6068, 6069, 6070, 6080, 6092, 6094, 7026, 7033, 7034, 7035, 7036, 7038, 7039, 7045)
    AND lf.ACTUAL_DELIVERY_TS BETWEEN '2016-02-01 00:00:00' AND '2016-07-31 23:59:59'

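LOAD_ID is the unique record identifier, and each LOAD_ID has different PO types and case quantities associated with it. I am trying to find a way to aggregate the LOAD_ID rows based on criteria on PO type and case quantity.

For example: if the CASE_QTY for PO_TYPE_CODE 20 is greater than 50% of the total CASE_QTY associated with that LOAD_ID, count LOAD_ID 30604179 as "TEST".

If the CASE_QTY for PO_TYPE_CODE 33 is greater than 25% of the total CASE_QTY associated with that LOAD_ID, count LOAD_ID 30604179 as "TEST 2".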

Desired output:

[image: desired output table]

Any help or guidance would be appreciated!

3 Answers:

Answer 0: (score: 1)

You can use a window function to pull this off. Something like:

Sum(Case_QTY) OVER (PARTITION BY LOAD_ID)

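That window function sums CASE_QTY across all rows for that LOAD_ID in the intermediate result set and returns the total. Dividing the current record's CASE_QTY by that total gives you a percentage that you can test against.

I would also cast CASE_QTY to a decimal to make sure the result of the division is a decimal. Otherwise it might come out as an integer. I can't remember how Teradata handles this, so the cast might not be necessary.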
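As a rough sketch of how that percentage could drive the labels from the question: the query below assumes the joined result of the original query is available as a hypothetical table or view named load_cases. That name, the case_pct and load_label aliases, and the DECIMAL(18,4) precision are illustrative and not part of the original schema.

```sql
-- load_cases is a hypothetical stand-in for the joined result of the
-- question's query: one row per load line with LOAD_ID, PO_TYPE_CODE, CASE_QTY.
SELECT
    t.LOAD_ID,
    t.PO_TYPE_CODE,
    t.CASE_QTY,
    t.case_pct,
    CASE
        WHEN t.PO_TYPE_CODE = 20 AND t.case_pct > 0.50 THEN 'TEST'
        WHEN t.PO_TYPE_CODE = 33 AND t.case_pct > 0.25 THEN 'TEST 2'
    END AS load_label
FROM (
    SELECT
        LOAD_ID,
        PO_TYPE_CODE,
        CASE_QTY,
        -- Row quantity divided by the load total. The CAST avoids integer
        -- division; NULLIF guards against a load whose total is zero.
        CAST(CASE_QTY AS DECIMAL(18,4))
            / NULLIF(SUM(CAST(CASE_QTY AS DECIMAL(18,4)))
                         OVER (PARTITION BY LOAD_ID), 0) AS case_pct
    FROM load_cases
) t;
```

Each row is tested on its own here, so if a load can carry several rows with the same PO_TYPE_CODE, the quantities would need to be summed per PO type before comparing.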

Answer 1: (score: 0)

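If your criteria are always based on the total CASE_QTY associated with the LOAD_ID, I think what you need is to join to the aggregated total so that you can compare each value against it, for example: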

SELECT DISTINCT 
    ld.LOAD_ID, ... fsf.CASE_QTY, 100.00 * CASE_QTY/sum_caseQty AS loadCaseQtyPct,
    ...
FROM 
    us_trans_dm_vm.LOAD_DIM ld INNER JOIN
    us_trans_dm_vm.LOAD_FACT lf ON
        ld.LOAD_SK_ID = lf.LOAD_SK_ID INNER JOIN
    us_trans_dm_vm.FREIGHT_SHIPMENT_FACT fsf ON
     lf.LOAD_SK_ID = fsf.LOAD_SK_ID INNER JOIN  
    us_trans_dm_vm.FREIGHT_SHIPMENT_DIM fsd ON
     fsf.SHIPMENT_SK_ID = fsd.SHIPMENT_SK_ID INNER JOIN
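     -- CASE_QTY comes from FREIGHT_SHIPMENT_FACT in the question's query,
     -- so total it there per load (assumes LOAD_SK_ID identifies the load)
     (SELECT
          LOAD_SK_ID, SUM(CASE_QTY) AS sum_caseQty
      FROM
          us_trans_dm_vm.FREIGHT_SHIPMENT_FACT
      GROUP BY 1) cq ON
     lf.LOAD_SK_ID = cq.LOAD_SK_ID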
WHERE 
     fsd.CURRENT_IND = 'Y'
    AND ld.CURRENT_IND = 'Y'
    AND ld.DEST_LOC_ID IN (6006, 6009, 6010, 6011, 6012, 6016, 6017, 6018, 6019, 6020, 6021, 6023, 6024, 6025, 6026, 6027, 6030, 6031, 6035, 6036, 6037, 6038, 6039, 6040, 6043, 6048, 6054, 6066, 6068, 6069, 6070, 6080, 6092, 6094, 7026, 7033, 7034, 7035, 7036, 7038, 7039, 7045)
    AND lf.ACTUAL_DELIVERY_TS BETWEEN '2016-02-01 00:00:00' AND '2016-07-31 23:59:59'
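If only one labelled row per LOAD_ID is needed, the same comparison can also be written with conditional aggregation instead of a join. This is a different technique from the query above, shown only as a sketch: load_cases is a hypothetical table or view standing in for the joined result of the question's query, and load_label is an illustrative alias.

```sql
-- load_cases is hypothetical: one row per load line with
-- LOAD_ID, PO_TYPE_CODE and CASE_QTY.
SELECT
    LOAD_ID,
    SUM(CASE_QTY) AS total_case_qty,
    CASE
        -- cases on PO type 20 exceed 50% of the load's total
        WHEN SUM(CASE WHEN PO_TYPE_CODE = 20 THEN CASE_QTY ELSE 0 END)
             > 0.50 * SUM(CASE_QTY) THEN 'TEST'
        -- cases on PO type 33 exceed 25% of the load's total
        WHEN SUM(CASE WHEN PO_TYPE_CODE = 33 THEN CASE_QTY ELSE 0 END)
             > 0.25 * SUM(CASE_QTY) THEN 'TEST 2'
    END AS load_label
FROM load_cases
GROUP BY LOAD_ID;
```

Multiplying the total by the threshold rather than dividing sidesteps the integer-division question. When both conditions hold, the first matching WHEN branch wins, so the order of the branches decides which label a load receives.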

Answer 2: (score: 0)

@JNevill

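I adapted the window function to the desired output. I no longer need to solve the problem with the 50%/25% criteria mentioned above. Instead, I use the window function to calculate each row's Case_QTY as a percentage of the Case_Qty for the whole Load_ID. I now get the results shown in the image below.

[image: query result with percentage column]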

Using this SQL:

SELECT 
DISTINCT ld.LOAD_ID,lf.ACTUAL_DELIVERY_TS,ld.ORIG_LOC_ID,fsd.PO_TYPE_CODE,fsf.CASE_QTY,

CASE 
WHEN ORIG_LOC_ID IN (6903,6909,6912,7100,7101,7183,7184,7837,7840,7976) THEN 'Centerpoint'
WHEN ORIG_LOC_ID IN (6061,6088,6060,7042,7078,7085,7086,7084,7089,7093,7094,4892,7628) THEN 'Imports'
WHEN ORIG_LOC_ID IN (8092,8098,9153,9193,9195,9196) THEN 'Returns'
WHEN ORIG_LOC_ID IN (6005,6007,6008,6014,6022,6029,6041) THEN 'Fashion'
WHEN ORIG_LOC_ID IN (7006,7356,6280,8240,8103,7853,7005) THEN 'eCom'
WHEN ORIG_LOC_ID IN (6006, 6009, 6010, 6011, 6012, 6016, 6017, 6018, 6019, 6020, 6021, 6023, 6024, 6025, 6026, 6027, 6030, 6031, 6035, 6036, 6037, 6038, 6039, 6040, 6043, 6048, 6054, 6066, 6068, 6069, 6070, 6080, 6092, 6094, 7026, 7033, 7034, 7035, 7036, 7038, 7039, 7045) THEN 'Regional'
WHEN ORIG_LOC_ID IN (6042, 6047, 6055, 6056, 6057, 6059, 6062, 6064, 6065, 6071, 6072, 6073, 6074, 6077, 6082, 6083, 6084, 6085, 6090, 6091, 6095, 6096, 6097, 6099, 7010, 7012, 7013, 7014, 7015, 7016, 7017, 7018, 7019, 7021, 7023, 7024, 7025, 7030, 7047, 7048, 7053, 7055, 7068, 7070, 7077, 7079) THEN 'Grocery'
WHEN ORIG_LOC_TYPE_CODE IN ('VNDR') THEN 'VNDR'
WHEN ORIG_LOC_TYPE_CODE IN ('STORE') THEN 'STORE'
END AS LOADORIG, 

CASE
WHEN PO_TYPE_CODE IN (23,33,3,53,45,73,93) THEN 'DA'
WHEN PO_TYPE_CODE IN (20,22,28,29,40,42,50,83) THEN 'SS'
WHEN PO_TYPE_CODE IN (10,11,12,13,14,15,16,17,18,19) THEN 'Fashion'
WHEN PO_TYPE_CODE IN (43) THEN 'XDOCK'
END AS CHANNEL,


CASE 
WHEN Channel = 'DA' THEN  CAST (CASE_QTY AS DECIMAL(38,0))/SUM(CAST(CASE_QTY AS DECIMAL(38,2))) OVER (PARTITION BY ld.LOAD_ID)
WHEN Channel = 'SS' THEN  CAST (CASE_QTY AS DECIMAL(38,0))/SUM(CAST(CASE_QTY AS DECIMAL(38,2))) OVER (PARTITION BY ld.LOAD_ID)
END AS New_Column,

ld.DEST_LOC_ID

FROM us_trans_dm_vm.LOAD_DIM ld,
             us_trans_dm_vm.LOAD_FACT lf,
             us_trans_dm_vm.FREIGHT_SHIPMENT_FACT fsf,
             us_trans_dm_vm.FREIGHT_SHIPMENT_DIM fsd

WHERE ld.LOAD_SK_ID = lf.LOAD_SK_ID
AND lf.LOAD_SK_ID = fsf.LOAD_SK_ID
AND fsf.SHIPMENT_SK_ID = fsd.SHIPMENT_SK_ID
AND fsd.CURRENT_IND = 'Y'
AND ld.CURRENT_IND = 'Y'
AND ld.DEST_LOC_ID IN (6006)
AND lf.ACTUAL_DELIVERY_TS BETWEEN '2016-07-20 00:00:00' AND '2016-07-31 23:59:59'

GROUP BY ld.LOAD_ID,lf.ACTUAL_DELIVERY_TS,ld.ORIG_L

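As you can see, "New_Column" shows the correct percentage for each Load_ID, and the percentages for each load sum to 100%. Now I want to summarize all unique Load_IDs by their channel and percentages.

Example:

[image: example of the desired summary]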

I have been trying to use something like:

SUM(CASE WHEN Channel = 'DA' THEN Percentage ELSE 0 END) TotalDACases,
SUM(CASE WHEN Channel = 'SS' THEN Percentage ELSE 0 END) TotalStapleCases,

Or:

SUM(CASE WHEN Channel = 'DA' THEN  CAST (CASE_QTY AS DECIMAL(38,0))/SUM(CAST(CASE_QTY AS DECIMAL(38,2))) OVER (PARTITION BY ld.LOAD_ID) END) Test,
SUM (CASE WHEN Channel = 'SS' THEN  CAST (CASE_QTY AS DECIMAL(38,0))/SUM(CAST(CASE_QTY AS DECIMAL(38,2))) OVER (PARTITION BY ld.LOAD_ID) END) Test2,

But I get an error ("Ordered Analytical Functions can not be nested"), and I think it is because I have nested sum functions.

Any ideas would be great!
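One common way around this kind of error is to compute the windowed percentage in a derived table and aggregate it in the outer query, so the aggregate SUM never wraps the SUM ... OVER expression directly. A minimal sketch, assuming the joined rows with their CHANNEL value are available as a hypothetical table or view named load_channel_cases (that name, the case_pct alias, and the DECIMAL(18,4) precision are illustrative):

```sql
-- load_channel_cases is hypothetical: one row per load line with
-- LOAD_ID, CHANNEL ('DA', 'SS', ...) and CASE_QTY.
SELECT
    p.LOAD_ID,
    SUM(CASE WHEN p.CHANNEL = 'DA' THEN p.case_pct ELSE 0 END) AS TotalDACases,
    SUM(CASE WHEN p.CHANNEL = 'SS' THEN p.case_pct ELSE 0 END) AS TotalStapleCases
FROM (
    -- Inner query: the window function runs here, once per row.
    SELECT
        LOAD_ID,
        CHANNEL,
        CAST(CASE_QTY AS DECIMAL(18,4))
            / NULLIF(SUM(CAST(CASE_QTY AS DECIMAL(18,4)))
                         OVER (PARTITION BY LOAD_ID), 0) AS case_pct
    FROM load_channel_cases
) p
-- Outer query: a plain aggregate over the already-computed percentages.
GROUP BY p.LOAD_ID;
```

This returns one row per LOAD_ID with the share of cases in each channel. If the inner query keeps the original SELECT DISTINCT, identical rows would be collapsed before the percentages are summed, which may or may not be what is wanted.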