In a subquery

Asked: 2018-05-21 14:57:25

Tags: sql sql-server join

I have two tables:

INVOICES
ID | DISCOUNT_PRC
1  |   NULL
2  |   0.10
3  |   0.70
...

INVOICE_ITEMS
ID | INVOICE_ID | PRICE | ALT_PRICE
1  |     1      |  100  |  0
2  |     1      |  200  |  150
3  |     2      |  400  |  300
4  |     2      |  200  |  0
5  |     2      |  100  |  NULL
6  |     3      |  200  |  40
7  |     3      |  100  |  NULL
...

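Note: another application uses this database, and I am not allowed to change zeros to NULL values or vice versa.

I need to output, for each invoice, the sum of its items, where each item's PRICE is multiplied by the discount factor (1 - DISCOUNT_PRC), unless ALT_PRICE is not NULL and greater than zero. In that case, ALT_PRICE is used as-is. So the desired output would be: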

INVOICES
ID | OVERALL_PRICE
1  |    250     (100*1) + (150)
2  |    570       (300) + (200*0.9) + (100*0.9)
3  |     70        (40) + (100*0.3)
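
To try the queries in this thread against the sample data, a setup script along these lines could be used. This is only a sketch: the question does not state the column types or constraints, so the DECIMAL precisions, keys, and NOT NULL choices below are assumptions.

```sql
-- Assumed schema: types and constraints are NOT given in the question.
CREATE TABLE INVOICES (
    ID           INT PRIMARY KEY,
    DISCOUNT_PRC DECIMAL(5, 2) NULL
);

CREATE TABLE INVOICE_ITEMS (
    ID         INT PRIMARY KEY,
    INVOICE_ID INT NOT NULL REFERENCES INVOICES (ID),
    PRICE      DECIMAL(10, 2) NOT NULL,
    ALT_PRICE  DECIMAL(10, 2) NULL
);

-- Sample rows copied from the tables above.
INSERT INTO INVOICES (ID, DISCOUNT_PRC) VALUES
    (1, NULL),
    (2, 0.10),
    (3, 0.70);

INSERT INTO INVOICE_ITEMS (ID, INVOICE_ID, PRICE, ALT_PRICE) VALUES
    (1, 1, 100, 0),
    (2, 1, 200, 150),
    (3, 2, 400, 300),
    (4, 2, 200, 0),
    (5, 2, 100, NULL),
    (6, 3, 200, 40),
    (7, 3, 100, NULL);
```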

What I have so far:

select I.ID,
     case when ISNULL( IT.ALT_PRICE, 0) > 0
          then IT.ALT_PRICE
          else IT.PRICE * (1 - ISNULL( I.DISCOUNT_PRC, 0))
          end AS OVERALL_PRICE
from INVOICES I
join INVOICE_ITEMS IT on IT.INVOICE_ID = I.ID

Result:

ID  OVERALL_PRICE
1   100
1   150
2   300
2   180
2   90
3   40
3   30

The result is correct for each item; now I need a single SUM row per invoice. I tried a LEFT JOIN and then OUTER APPLY:

select I.ID, items.OVERALL_PRICE FROM INVOICES I
OUTER APPLY ( select sum (
     case when ISNULL( IT.ALT_PRICE, 0) > 0
          then IT.ALT_PRICE
          else IT.PRICE * (1 - ISNULL( I.DISCOUNT_PRC, 0))
          end) AS OVERALL_PRICE
from INVOICE_ITEMS IT
where IT.INVOICE_ID = I.ID
group by IT.INVOICE_ID ) as items

But I get the same error either way:

Multiple columns are specified in an aggregated expression containing an outer reference. If an expression being aggregated contains an outer reference, then that outer reference must be the only column referenced in the expression.

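Edit: I would also like to avoid using SUM (or any other aggregate expression) in the columns of my main query, because that would require grouping by all the other columns. The actual table has more than 50 columns and several other subqueries, so I want to avoid that.

Edit 2: OK, I ran some benchmarks on the solutions from B3S (join within subquery), iSR5 (OVER), and Gordon Linoff (APPLY). I inserted 50k invoices and 500k invoice items and used MSSQL STATISTICS, which should give adequate results for a database of this size. The results: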

Join within subquery:
- CPU time = 1170 ms,  elapsed time = 335 ms.
- CPU time = 1202 ms,  elapsed time = 344 ms.
- CPU time = 1153 ms,  elapsed time = 348 ms.

OVER:
- CPU time = 3089 ms,  elapsed time = 1361 ms.
- CPU time = 3010 ms,  elapsed time = 1075 ms.
- CPU time = 3010 ms,  elapsed time = 1070 ms.

APPLY:
- CPU time = 2496 ms,  elapsed time = 2320 ms.
- CPU time = 2433 ms,  elapsed time = 2171 ms.
- CPU time = 2496 ms,  elapsed time = 2179 ms.

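Conclusion: I did expect the join within the subquery to be optimized by SQL Server, but I still expected better results from the other two suggestions. It came as a surprise to me, but I have to give B3S the credit (and the accepted answer). I was so sure the inner join would hurt performance that I never bothered to try it. In any case, don't hesitate to join the outer table from within a subquery, if needed of course.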
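
The question does not show how the timings were collected. The "CPU time / elapsed time" lines look like the output of SET STATISTICS TIME, so a measurement might be reproduced roughly as below. This is a sketch under that assumption, and the query shown is a simplified form of the join-within-subquery approach (without the extra test fields), not necessarily the exact statement that was benchmarked.

```sql
-- Assumption: timings come from SET STATISTICS TIME; results appear in the Messages tab.
SET STATISTICS TIME ON;

-- Query under test: the correlated subquery joins the outer table again (I2),
-- so the aggregate only references columns of the inner query.
SELECT I.ID,
       (SELECT SUM(CASE WHEN ISNULL(IT2.ALT_PRICE, 0) > 0
                        THEN IT2.ALT_PRICE
                        ELSE IT2.PRICE * (1 - ISNULL(I2.DISCOUNT_PRC, 0))
                   END)
        FROM INVOICES I2
        JOIN INVOICE_ITEMS IT2 ON IT2.INVOICE_ID = I2.ID
        WHERE I2.ID = I.ID) AS OVERALL_PRICE
FROM INVOICES I;

SET STATISTICS TIME OFF;
```

Running each candidate several times, as in the figures above, helps smooth out caching effects.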

4 answers:

Answer 0 (score: 2)

You are very close:

select I.ID,
     sum(case when ISNULL( IT.ALT_PRICE, 0) > 0
          then IT.ALT_PRICE
          else IT.PRICE * (1 - ISNULL( I.DISCOUNT_PRC, 0))
          end) AS OVERALL_PRICE
from INVOICES I
join INVOICE_ITEMS IT on IT.INVOICE_ID = I.ID
GROUP BY I.ID

Edit based on your comment:

select DISTINCT
     I.TESTFIELD1,
     I.TESTFIELD2,
     I.ID,
     (SELECT SUM(case when ISNULL( IT2.ALT_PRICE, 0) > 0
          then IT2.ALT_PRICE
          else IT2.PRICE * (1 - ISNULL( I2.DISCOUNT_PRC, 0))
          end)
     FROM INVOICES I2
     LEFT JOIN INVOICE_ITEMS IT2 ON IT2.INVOICE_ID = I2.ID
     WHERE I2.ID = I.ID
     GROUP BY I2.ID) AS OVERALL_PRICE
from INVOICES I
join INVOICE_ITEMS IT on IT.INVOICE_ID = I.ID

I added some test fields to show you how to use sum while avoiding a group by on every table column.

SQL Fiddle Here

Answer 1 (score: 1)

In my opinion, the apply method should work without the group by:

select I.ID, items.OVERALL_PRICE
FROM INVOICES I OUTER APPLY
     (select sum(case when IT.ALT_PRICE > 0
                      then IT.ALT_PRICE
                      else IT.PRICE * (1 - coalesce( I.DISCOUNT_PRC, 0))
                 end) AS OVERALL_PRICE
      from INVOICE_ITEMS IT
      where IT.INVOICE_ID = I.ID
     ) items;

Note that the initial condition does not need a NULL check, because NULL fails almost all comparisons. If ALT_PRICE is never negative or zero, you can simply use COALESCE():

select I.ID, items.OVERALL_PRICE
FROM INVOICES I OUTER APPLY
     (select sum(coalesce(IT.ALT_PRICE,
                          IT.PRICE * (1 - coalesce( I.DISCOUNT_PRC, 0))
                         )
                ) AS OVERALL_PRICE
      from INVOICE_ITEMS IT
      where IT.INVOICE_ID = I.ID
     ) as items;

But it doesn't work in SQL Server. I'm not sure why SQL Server has this restriction on outer references. It seems strange.

In that case, you can rewrite the logic as:

select I.ID, items.OVERALL_PRICE
from INVOICES I outer apply
     (select (sum(case when IT.ALT_PRICE > 0 then IT.ALT_PRICE else 0 END) +
              sum(case when IT.ALT_PRICE = 0 OR IT.ALT_PRICE IS NULL
                       then IT.PRICE else 0
                  end) * (1 - coalesce( I.DISCOUNT_PRC, 0))
             ) AS OVERALL_PRICE
      from INVOICE_ITEMS IT
      where IT.INVOICE_ID = I.ID
     ) items;

This is less satisfying, but it lets you use apply.
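
The rewrite above works because the discount factor is the same for every item of one invoice, so it can be moved outside the sum; the outer reference I.DISCOUNT_PRC then no longer sits inside an aggregated expression. As a sketch, write a_k for ALT_PRICE, p_k for PRICE, d for DISCOUNT_PRC (NULL treated as 0), and A for the set of items whose ALT_PRICE is greater than zero. These symbols are introduced here for illustration only.

```latex
\[
\sum_{k \in A} a_k \;+\; \sum_{k \notin A} p_k \,(1 - d)
\;=\;
\sum_{k \in A} a_k \;+\; (1 - d) \sum_{k \notin A} p_k
\]
% Invoice 2 from the question: A = \{3\}, d = 0.10
\[
300 + (1 - 0.10)\,(200 + 100) = 300 + 270 = 570
\]
```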

Answer 2 (score: 1)

Another approach is to use OVER(), which lets you avoid GROUP BY.

You can do it like this:

SELECT DISTINCT
    Inv.ID,
    SUM(CASE
            WHEN items.ALT_Price IS NOT NULL AND items.ALT_Price > 0 
            THEN items.ALT_Price
            ELSE items.Price * (1 - ISNULL(DISCOUNT_PRC, 0 ))
    END) OVER(PARTITION BY Inv.ID ORDER BY Inv.ID) AS OVERALL_PRICE
FROM #Invoices Inv
JOIN #Invoices_Items items ON items.Invoice_ID = Inv.ID

You can then add columns without listing them in a GROUP BY.

You can also use it as a subquery:

SELECT *
FROM(
SELECT DISTINCT
    Inv.ID,
    SUM(CASE
            WHEN items.ALT_Price IS NOT NULL AND items.ALT_Price > 0 
            THEN items.ALT_Price
            ELSE items.Price * (1 - ISNULL(DISCOUNT_PRC, 0 ))
    END) OVER(PARTITION BY Inv.ID ORDER BY Inv.ID) AS OVERALL_PRICE
FROM #Invoices Inv
JOIN #Invoices_Items items ON items.Invoice_ID = Inv.ID
) D 
-- extend it with more filters, JOINs ..etc

Answer 3 (score: 0)

Here is the approach I would choose (pending a performance comparison with the other approaches):

select i.ID
     , sum(coalesce(alt_price,price*(1-isnull(discount_prc,0)))) OVERALL_PRICE
  from invoices i
  join (select id
             , invoice_id
             , price
             , case alt_price when 0 then null else alt_price end alt_price
          from invoice_items) ii
    on i.id = ii.invoice_id
 group by i.id

In this query, zeros in alt_price are first converted to NULL, and then coalesce is used inside sum to pick either alt_price or the discounted price.
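
The zero-to-NULL conversion could probably be written more compactly with NULLIF, which removes the need for the derived table. This is a sketch of that variation, not the answerer's query; like the original, it assumes ALT_PRICE is never negative, since a negative value would still be picked by COALESCE.

```sql
-- NULLIF(x, 0) returns NULL when x = 0 and x otherwise,
-- so COALESCE falls through to the discounted PRICE for 0 and NULL alike.
SELECT i.ID,
       SUM(COALESCE(NULLIF(ii.ALT_PRICE, 0),
                    ii.PRICE * (1 - ISNULL(i.DISCOUNT_PRC, 0)))) AS OVERALL_PRICE
FROM INVOICES i
JOIN INVOICE_ITEMS ii ON ii.INVOICE_ID = i.ID
GROUP BY i.ID;
```

Because this still groups in the main query, it does not address the question's wish to avoid GROUP BY on a wide table; it only simplifies the expression.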