如何使用CTE将两个表分组

时间:2018-06-03 02:43:40

标签: sql oracle group-by common-table-expression

我有三个表(Oracle):

sales_order
-------------
int so_key (pk)
int part_key (fk)
int condition_key (fk)
number unit_price
int qty_ordered
number unit_cost
date entry_date

quote
-------------
int q_key (pk) 
int part_key (fk)
int condition_key (fk)
number unit_price
int qty_quoted
date entry_date

stock
-------------
int stock_key (pk)
int part_key (fk)
int condition (fk)
int qty_available
number unit_cost

这三个都有这两个表的外键引用:

part
-------------
int part_key (pk) 

condition
-------------
int condition_key (pk)

我正在撰写一个查询,将数据汇总到按partcondition分组的行中。但是,我无法弄清楚如何按partcondition进行分组。以下是我仅按part分组的(功能)查询:

WITH
    ctePart_Quotes AS
    (
        SELECT q.part_key
            , COUNT(*) AS quotes_count
            , SUM(q.unit_price * q.qty_quoted) AS quotes_amt_total
            , SUM(q.qty_quoted) AS quotes_qty_total
        FROM quote q
        WHERE q.entry_date BETWEEN TO_DATE('01-Jan-2011', 'dd-mm-yyyy') AND TO_DATE('31-Dec-2011', 'dd-mm-yyyy')
        GROUP BY q.part_key
    )
    , ctePart_Sales AS
    (
        SELECT so.part_key
            , COUNT(*) AS sales_count
            , SUM(so.unit_price * so.qty_ordered) AS sales_amt_total
            , SUM(so.qty_ordered) AS sales_qty_total
            , SUM(so.qty_ordered * so.unit_cost) AS cost_total
        FROM sales_order so
        WHERE so.entry_date BETWEEN TO_DATE('01-Jan-2011', 'dd-mm-yyyy') AND TO_DATE('31-Dec-2011', 'dd-mm-yyyy')
        GROUP BY so.part_key
    )
    , ctePart_Stock AS
    (
        SELECT stm.part_key
            , SUM(stm.qty_available) AS total_available
            , SUM(stm.qty_available * stm.unit_cost) AS inv_cost
        FROM stock stm
        GROUP BY stm.part_key
    )
SELECT p.part_key,
        part_stock.total_available,
        part_stock.inv_cost,
        sales.sales_amt_total,
        sales.sales_qty_total,
        sales.sales_count,
        sales.cost_total,
        quotes.quotes_amt_total,
        quotes.quotes_qty_total,
        quotes.quotes_count
  FROM parts p
  LEFT OUTER JOIN ctePart_Quotes quotes
    ON quotes.part_key = p.part_key
  LEFT OUTER JOIN ctePart_Sales sales
    ON sales.part_key = p.part_key
  LEFT OUTER JOIN ctePart_Stock part_stock
    ON part_stock.part_key = p.part_key

WHERE NOT(sales_amt_total IS NULL
        AND sales_qty_total IS NULL
        AND sales_count IS NULL
        AND cost_total IS NULL
        AND quotes_amt_total IS NULL
        AND quotes_qty_total IS NULL
        AND quotes_count IS NULL)
    AND SALES_AMT_TOTAL > 10000

此查询生成此输出(按part_key分组的总计):

part_key | total_available | inv_cost | sales_amt_total | ...
---------|-----------------|----------|-----------------| ...
     234 |              59 |  4923.90 |        29403.48 | ...
     185 |              21 |   192.64 |         9034.95 | ...
     102 |             102 |  8738.34 |        50382.20 | ...
...

但我正在尝试修改查询以生成此项(总计按part_key和condition_key分组):

part_key | condition_key | total_available | inv_cost | sales_amt_total | ...
---------|---------------|-----------------|----------|-----------------| ...
     234 |             3 |              24 |  2360.50 |        16947.18 | ...
     234 |             7 |              35 |  2563.40 |        12456.30 | ...
...

你是怎么做到的?

编辑:澄清: 复杂性在于:你如何join最终选择中的条件?因为您选择FROM part,但condition关系是通过其他表(sales_order等)。因此,您必须join通过每个表(LEFT OUTER JOIN condition cond ON quotes.condition_key = cond.condition_key等),但这些联接都是单独的列。

编辑#2:有人提供了数据模型的良好形象,说明了部分/条件之间的(正确的,合法的)关系,但也解决了这个问题所面临的微妙复杂性:

enter image description here

3 个答案:

答案 0 :(得分:2)

这里的主要问题似乎是您的数据模型。将您的问题的表“描述”转换为DDL代码,并将其转换为关系模型(使用Oracle Datamodeler),我们发现如下所示:

DDL代码

create table part ( part_key number primary key ) ;
create table condition ( condition_key number primary key ) ;

create table sales_order (
  so_key number generated always as identity start with 3000 primary key 
, part_key number references part 
, condition_key number references condition
, unit_price number
, qty_ordered number
, unit_cost number
, entry_date date ) ;

create table quote (
  q_key number generated always as identity start with 4000  primary key
, part_key number references part
, condition_key number references condition
, qty_quoted number
, unit_price number
, entry_date date );

create table stock (
  stock_key number generated always as identity start with 5000  primary key
, part_key number references part
, condition_key number references condition
, qty_available number 
, unit_cost number ) ;

关系模型(Oracle SQL Developer Data Modeler)

enter image description here

观察模型,很明显每个PART都有几个条件。因此,可能有必要(由您)决定您所指的是哪种情况。这可能并不容易。假设我们有一个部分(带有part_key)1000。现在,我们可以记录3个不同的条件,并为查询中提到的3个表中的每个表使用特定条件。

-- one part, 3 conditions
begin
  insert into part ( part_key ) values ( 1000 ) ;
  insert into condition( condition_key ) values ( 2001 ) ;
  insert into condition( condition_key ) values ( 2002 ) ;
  insert into condition( condition_key ) values ( 2003 ) ;
  insert into sales_order ( part_key, condition_key ) values ( 1000, 2001 ) ;
  insert into quote ( part_key, condition_key ) values ( 1000, 2002 ) ;
  insert into stock ( part_key, condition_key ) values ( 1000, 2003 ) ;
end ;
/

查询中应该使用3个条件中的哪一个?很难说。

-- not using WITH (subquery factoring) here - for clarity 
select
  P.part_key
, SO.condition_key
, Q.condition_key
, ST.condition_key
from part P
  join sales_order SO on SO.part_key = P.part_key 
  join quote Q        on Q.part_key  = P.part_key 
  join stock ST       on ST.part_key = P.part_key 
;

-- output
PART_KEY  CONDITION_KEY  CONDITION_KEY  CONDITION_KEY  
1000      2001           2002           2003   

嗯 - 我们可以选择其中一个条件,我们不能。但是,同一部分可能存在更多条件......

begin
  insert into condition( condition_key ) values ( 2004 ) ;
  insert into condition( condition_key ) values ( 2005 ) ;
  insert into condition( condition_key ) values ( 2006 ) ;
  insert into sales_order ( part_key, condition_key ) values ( 1000, 2004 ) ;
  insert into quote ( part_key, condition_key ) values ( 1000, 2005 ) ;
  insert into stock ( part_key, condition_key ) values ( 1000, 2006 ) ;
end ;
/

-- Same query as above now gives us:
PART_KEY  CONDITION_KEY  CONDITION_KEY  CONDITION_KEY  
1000      2001           2005           2006           
1000      2001           2005           2003           
1000      2001           2002           2006           
1000      2001           2002           2003           
1000      2004           2005           2006           
1000      2004           2005           2003           
1000      2004           2002           2006           
1000      2004           2002           2003 

结论:修复您的数据模型。 (我们知道这有时说起来容易做起来难......)然后,对你的查询做更多的工作是有意义的。

<强> __更新__

既然我们知道无法对表和约束做任何事情,那么以下查询可能会给你一个起点。我们没有适当的测试数据,所以我们只需在表格中添加一些随机值...

-- PART and CONDITION -> 1000 integers each
begin
  for i in 1 .. 1000
  loop
    insert into part ( part_key ) values ( i ) ;
    insert into condition( condition_key ) values ( i ) ;
  end loop;
end ;
/

表格引用

-- 2 12s, 2 18s
SQL> select * from quote ;
Q_KEY  PART_KEY  CONDITION_KEY  QTY_QUOTED  UNIT_PRICE  ENTRY_DATE  
4000   10        100            55          500         01-MAY-11   
4001   12        120            55          500         01-MAY-11   
4002   12        37             56          501         01-MAY-11   
4003   14        140            55          500         01-MAY-11   
4004   15        46             56          501         01-MAY-11   
4005   16        160            55          500         01-MAY-11   
4006   18        180            55          500         01-MAY-11   
4007   18        55             56          501         01-MAY-11   
4008   20        200            55          500         01-MAY-11 

表SALES_ORDER

SQL> select * from sales_order ;

SO_KEY  PART_KEY  CONDITION_KEY  UNIT_PRICE  QTY_ORDERED  UNIT_COST  ENTRY_DATE  
3000    10        100            500         55           400        05-MAY-11   
3001    12        120            500         55           400        05-MAY-11   
3002    14        140            500         55           400        05-MAY-11   
3003    16        160            500         55           400        05-MAY-11   
3004    18        180            500         55           400        05-MAY-11   
3005    20        200            500         55           400        05-MAY-11

表格

SQL> select * from stock ;
STOCK_KEY  PART_KEY  CONDITION_KEY  QTY_AVAILABLE  UNIT_COST  
5000       10        100            10             400        
5001       12        120            10             400        
5002       14        140            10             400        
5003       14        100            12             402        
5004       16        160            10             400        
5005       18        180            10             400        
5006       20        200            10             400     

假设只记录有效的声部/条件组合,我们可以使用FULL OUTER JOIN来获得第一张照片。

SQL> select 
  2    Q.part_key  q_part , Q.condition_key  q_cond
  3  , SO.part_key so_part, SO.condition_key so_cond
  4  , ST.part_key st_part, ST.condition_key st_cond
  5  from quote Q 
  6    full join sales_order SO
  7      on SO.part_key = Q.part_key and SO.condition_key = Q.condition_key
  8    full join stock ST  
  9      on ST.part_key = SO.part_key and ST.condition_key = SO.condition_key
 10  ;

-- result
Q_PART  Q_COND  SO_PART  SO_COND  ST_PART  ST_COND  
10      100     10       100      10       100      
12      120     12       120      12       120      
12      37      NULL     NULL     NULL     NULL     
14      140     14       140      14       140      
15      46      NULL     NULL     NULL     NULL     
16      160     16       160      16       160      
18      180     18       180      18       180      
18      55      NULL     NULL     NULL     NULL     
20      200     20       200      20       200      
NULL    NULL    NULL     NULL     14       100

然后,我们可以使用分析函数进行各种计算。请注意,我们这里不使用GROUP BY,分组是通过Q.part_key,Q.condition_key ...分区完成的(更多关于分析函数:Oracle documentation和示例here )。

-- Skeleton query ...
-- Note that you will have need to write over(...) several times.
-- Add a WHERE clause and conditions as required.
select
  Q.part_key as q_part, Q.condition_key as q_cond, 
  count( Q.part_key ) over ( partition by Q.part_key, Q.condition_key ) as q_count
-- 
-- Q example sums
-- , sum( Q.unit_price * Q.qty_quoted ) 
--     over ( partition by Q.part_key, Q.condition_key ) as qat -- quotes_amt_total
-- , sum( Q.qty_quoted ) 
--    over ( partition by Q.part_key, Q.condition_key ) as qqt -- quotes_qty_total
-- 
, SO.part_key as so_part, SO.condition_key as so_cond
, count( SO.part_key ) over ( partition by SO.part_key, SO.condition_key ) as so_count
-- 
-- SO sums here
--
, ST.part_key as st_part, ST.condition_key as st_cond
, count( ST.part_key ) over ( partition by ST.part_key, ST.condition_key ) as st_count
from sales_order SO 
  full join quote Q  
    on SO.part_key = Q.part_key and SO.condition_key = Q.condition_key
  full join stock ST  
    on ST.part_key = SO.part_key and ST.condition_key = SO.condition_key
-- where ...
; 

<强>结果

-- output
Q_PART  Q_COND  Q_COUNT  SO_PART  SO_COND  SO_COUNT  ST_PART  ST_COND  ST_COUNT  
10      100     1        10       100      1         10       100      1         
12      37      1        NULL     NULL     0         NULL     NULL     0         
12      120     1        12       120      1         12       120      1         
14      140     1        14       140      1         14       140      1         
15      46      1        NULL     NULL     0         NULL     NULL     0         
16      160     1        16       160      1         16       160      1         
18      55      1        NULL     NULL     0         NULL     NULL     0         
18      180     1        18       180      1         18       180      1         
20      200     1        20       200      1         20       200      1         
NULL    NULL    0        NULL     NULL     0         14       100      1 

答案 1 :(得分:1)

诀窍是首先创建一个Carthesian产品(condition表只有~30行),可能以后抑制不需要的结果行:

这可能看起来不太理想,但它会避免COALESCE() d关键字段上的连接,这可能会表现不佳。

WITH
    ctePart_Quotes AS
    (
        SELECT q.part_key, q.condition_key
            , COUNT(*) AS quotes_count
            , SUM(q.unit_price * q.qty_quoted) AS quotes_amt_total
            , SUM(q.qty_quoted) AS quotes_qty_total
        FROM quote q
        WHERE q.entry_date BETWEEN TO_DATE('01-Jan-2011', 'dd-mm-yyyy') AND TO_DATE('31-Dec-2011', 'dd-mm-yyyy')
        GROUP BY q.part_key, q.condition_key
    )
    , ctePart_Sales AS
    (
        SELECT so.part_key, so.condition_key
            , COUNT(*) AS sales_count
            , SUM(so.unit_price * so.qty_ordered) AS sales_amt_total
            , SUM(so.qty_ordered) AS sales_qty_total
            , SUM(so.qty_ordered * so.unit_cost) AS cost_total
        FROM sales_order so
        WHERE so.entry_date BETWEEN TO_DATE('01-Jan-2011', 'dd-mm-yyyy') AND TO_DATE('31-Dec-2011', 'dd-mm-yyyy')
        GROUP BY so.part_key, so.condition_key
    )
    , ctePart_Stock AS
    (
        SELECT stm.part_key, stm.condition_key
            , SUM(stm.qty_available) AS total_available
            , SUM(stm.qty_available * stm.unit_cost) AS inv_cost
        FROM stock stm
        GROUP BY stm.part_key, stm.condition_key
    )

SELECT p.part_key,
       c.condition_key,
        part_stock.total_available,
        part_stock.inv_cost,
        sales.sales_amt_total,
        sales.sales_qty_total,
        sales.sales_count,
        sales.cost_total,
        quotes.quotes_amt_total,
        quotes.quotes_qty_total,
        quotes.quotes_count
  FROM parts p
  CROSS JOIN condition c -- <<-- Here
  LEFT OUTER JOIN ctePart_Quotes quotes
    ON quotes.part_key = p.part_key
    AND quotes.condition_key = c.condition_key -- <<-- Here
  LEFT OUTER JOIN ctePart_Sales sales
    ON sales.part_key = p.part_key
    AND sales.condition_key = c.condition_key -- <<-- Here
  LEFT OUTER JOIN ctePart_Stock part_stock
    ON part_stock.part_key = p.part_key
    AND part_stock.condition_key = c.condition_key -- <<-- Here

WHERE NOT(sales_amt_total IS NULL
        AND sales_qty_total IS NULL
        AND sales_count IS NULL
        AND cost_total IS NULL
        AND quotes_amt_total IS NULL
        AND quotes_qty_total IS NULL
        AND quotes_count IS NULL) -- <<-- And maybe Here, too
    AND SALES_AMT_TOTAL > 10000
    ;

答案 2 :(得分:0)

也许这适合你:

WITH
    ctePart_Quotes AS
    (
        SELECT q.part_key,
               q.condition_key
            , COUNT(*) AS quotes_count
            , SUM(q.unit_price * q.qty_quoted) AS quotes_amt_total
            , SUM(q.qty_quoted) AS quotes_qty_total
        FROM quote q
        WHERE q.entry_date BETWEEN TO_DATE('01-Jan-2011', 'dd-mm-yyyy') AND TO_DATE('31-Dec-2011', 'dd-mm-yyyy')
        GROUP BY q.part_key,
                 q.condition_key
    )
    , ctePart_Sales AS
    (
        SELECT so.part_key,
               so.condition_key
            , COUNT(*) AS sales_count
            , SUM(so.unit_price * so.qty_ordered) AS sales_amt_total
            , SUM(so.qty_ordered) AS sales_qty_total
            , SUM(so.qty_ordered * so.unit_cost) AS cost_total
        FROM sales_order so
        WHERE so.entry_date BETWEEN TO_DATE('01-Jan-2011', 'dd-mm-yyyy') AND TO_DATE('31-Dec-2011', 'dd-mm-yyyy')
        GROUP BY so.part_key,
                 so.condition_key
    )
    , ctePart_Stock AS
    (
        SELECT stm.part_key,
               stm.condition_key
            , SUM(stm.qty_available) AS total_available
            , SUM(stm.qty_available * stm.unit_cost) AS inv_cost
        FROM stock stm
        GROUP BY stm.part_key,
                 stm.condition_key
    )
SELECT p.part_key,
        cte.condition_key,
        cte.total_available,
        cte.inv_cost,
        cte.sales_amt_total,
        cte.sales_qty_total,
        cte.sales_count,
        cte.cost_total,
        cte.quotes_amt_total,
        cte.quotes_qty_total,
        cte.quotes_count
  FROM parts p
  LEFT OUTER JOIN (SELECT coalesce(quotes.part_key, sales.part_key, part_stock.part_key) part_key,
                          coalesce(quotes.condition_key, sales.condition_key, part_stock.condition_key) condition_key,
                          quotes.quotes_count,
                          quotes.quotes_amt_total,
                          quotes.quotes_qty_total,
                          sales.sales_count,
                          sales.sales_amt_total,
                          sales.sales_qty_total,
                          sales.cost_total,
                          part_stock.total_available,
                          part_stock.inv_cost
                          FROM ctePart_Quotes quotes
                               FULL JOIN ctePart_Sales sales
                                         ON sales.part_key = quotes.part_key
                                            AND sales.condition_key = quotes.condition_key
                               FULL JOIN ctePart_Stock part_stock
                                         ON part_stock.part_key = sales.part_key
                                            AND part_stock.condition_key = sales.condition_key) cte
    ON cte.part_key = p.part_key
WHERE NOT(sales_amt_total IS NULL
        AND cte.sales_qty_total IS NULL
        AND cte.sales_count IS NULL
        AND cte.cost_total IS NULL
        AND cte.quotes_amt_total IS NULL
        AND cte.quotes_qty_total IS NULL
        AND cte.quotes_count IS NULL)
    AND SALES_AMT_TOTAL > 10000;

它还在CTE中按condition_key进行分组。然后使用FULL JOIN将CTE coalesce放在一起,以补偿第一个表格中nullpart_key的{​​{1}}值(但也许没有,如果其中一个表中存在的condition_keypart_key的每个组合也存在于相应的两个表中。)。然后使用condition_key将结果LEFT JOIN添加到part