I'm currently working with MSSQL and the "AdventureWorks 2014" sample database, and I've run into a problem with JOIN and SUM. Here are the two tables I'm using:
CREATE TABLE PurchaseOrderHeader(
PurchaseOrderID INTEGER NOT NULL PRIMARY KEY
,VendorID INTEGER NOT NULL
,OrderDate VARCHAR(23) NOT NULL
,TotalDue NUMERIC(10,4) NOT NULL
);
INSERT INTO PurchaseOrderHeader(PurchaseOrderID,VendorID,OrderDate,TotalDue) VALUES (1,1580,'2011-04-16 00:00:00.000',222.1492);
INSERT INTO PurchaseOrderHeader(PurchaseOrderID,VendorID,OrderDate,TotalDue) VALUES (2,1496,'2011-04-16 00:00:00.000',300.6721);
INSERT INTO PurchaseOrderHeader(PurchaseOrderID,VendorID,OrderDate,TotalDue) VALUES (3,1494,'2011-04-16 00:00:00.000',9776.2665);
INSERT INTO PurchaseOrderHeader(PurchaseOrderID,VendorID,OrderDate,TotalDue) VALUES (4,1650,'2011-04-16 00:00:00.000',189.0395);
INSERT INTO PurchaseOrderHeader(PurchaseOrderID,VendorID,OrderDate,TotalDue) VALUES (5,1654,'2011-04-30 00:00:00.000',22539.0165);
INSERT INTO PurchaseOrderHeader(PurchaseOrderID,VendorID,OrderDate,TotalDue) VALUES (6,1664,'2011-04-30 00:00:00.000',16164.0229);
INSERT INTO PurchaseOrderHeader(PurchaseOrderID,VendorID,OrderDate,TotalDue) VALUES (7,1678,'2011-04-30 00:00:00.000',64847.5328);
CREATE TABLE PurchaseOrderDetail(
PurchaseOrderID INTEGER NOT NULL
,PurchaseOrderDetailID INTEGER NOT NULL PRIMARY KEY
,OrderQty INTEGER NOT NULL
,ProductID INTEGER NOT NULL
);
INSERT INTO PurchaseOrderDetail(PurchaseOrderID,PurchaseOrderDetailID,OrderQty,ProductID) VALUES (1,1,4,1);
INSERT INTO PurchaseOrderDetail(PurchaseOrderID,PurchaseOrderDetailID,OrderQty,ProductID) VALUES (2,2,3,359);
INSERT INTO PurchaseOrderDetail(PurchaseOrderID,PurchaseOrderDetailID,OrderQty,ProductID) VALUES (2,3,3,360);
INSERT INTO PurchaseOrderDetail(PurchaseOrderID,PurchaseOrderDetailID,OrderQty,ProductID) VALUES (3,4,550,530);
INSERT INTO PurchaseOrderDetail(PurchaseOrderID,PurchaseOrderDetailID,OrderQty,ProductID) VALUES (4,5,3,4);
INSERT INTO PurchaseOrderDetail(PurchaseOrderID,PurchaseOrderDetailID,OrderQty,ProductID) VALUES (5,6,550,512);
INSERT INTO PurchaseOrderDetail(PurchaseOrderID,PurchaseOrderDetailID,OrderQty,ProductID) VALUES (6,7,550,513);
INSERT INTO PurchaseOrderDetail(PurchaseOrderID,PurchaseOrderDetailID,OrderQty,ProductID) VALUES (7,8,550,317);
INSERT INTO PurchaseOrderDetail(PurchaseOrderID,PurchaseOrderDetailID,OrderQty,ProductID) VALUES (7,9,550,318);
INSERT INTO PurchaseOrderDetail(PurchaseOrderID,PurchaseOrderDetailID,OrderQty,ProductID) VALUES (7,10,550,319);
Here is the SQL script:
select PurchaseOrderHeader.VendorID,
SUM(CASE WHEN Datename(year,PurchaseOrderHeader.OrderDate) = 2011 THEN PurchaseOrderHeader.TotalDue else 0 END) as "TotalPay IN 2011",
SUM(CASE WHEN Datename(year,PurchaseOrderHeader.OrderDate) = 2011 THEN PurchaseOrderDetail.OrderQty else 0 END) as "TotalOrder IN 2011"
from PurchaseOrderHeader
left join PurchaseOrderDetail on PurchaseOrderHeader.PurchaseOrderID = PurchaseOrderDetail.PurchaseOrderID
group by PurchaseOrderHeader.VendorID
order by VendorID
This is what I get:
VendorID  TotalPay IN 2011  TotalOrder IN 2011
1494      9776.2665         550
1496      601.3442          6
1580      222.1492          4
1650      189.0395          3
1654      22539.0165        550
1664      16164.0229        550
1678      194542.5984       1650
While this is what I should expect:
VendorID  TotalPay IN 2011  TotalOrder IN 2011
1494      9776.2665         550
1496      300.6721          6
1580      222.1492          4
1650      189.0395          3
1654      22539.0165        550
1664      16164.0229        550
1678      64847.5328        1650
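This code joins the two tables on PurchaseOrderID and calculates TotalDue grouped by VendorID. The problem is that when I use the join, multiple rows in PurchaseOrderDetail refer to a single row in PurchaseOrderHeader. In this example, for vendors 1496 and 1678, two or three detail rows refer to one row in PurchaseOrderHeader, so that row's TotalDue is added two or three times. How can I avoid adding it multiple times? Thanks!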
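The duplication described in the question can be seen by looking at the join result before it is grouped. This is a minimal sketch, assuming Python's sqlite3 module as a stand-in for SQL Server and using only orders 2 and 7 from the sample data; the trimmed column lists are a simplification for brevity, not the real AdventureWorks schema.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PurchaseOrderHeader(
  PurchaseOrderID INTEGER PRIMARY KEY, VendorID INTEGER, TotalDue REAL);
CREATE TABLE PurchaseOrderDetail(
  PurchaseOrderDetailID INTEGER PRIMARY KEY, PurchaseOrderID INTEGER, OrderQty INTEGER);
INSERT INTO PurchaseOrderHeader VALUES (2, 1496, 300.6721), (7, 1678, 64847.5328);
INSERT INTO PurchaseOrderDetail VALUES
  (2, 2, 3), (3, 2, 3), (8, 7, 550), (9, 7, 550), (10, 7, 550);
""")

# The raw join, before any GROUP BY: one output row per detail row,
# so each header's TotalDue is repeated once per matching detail.
joined = conn.execute("""
    SELECT h.PurchaseOrderID, h.VendorID, h.TotalDue, d.OrderQty
    FROM PurchaseOrderHeader h
    LEFT JOIN PurchaseOrderDetail d ON h.PurchaseOrderID = d.PurchaseOrderID
    ORDER BY d.PurchaseOrderDetailID
""").fetchall()

for row in joined:
    print(row)

# Count how many times each header row appears in the join result.
copies = {}
for order_id, _vendor, _due, _qty in joined:
    copies[order_id] = copies.get(order_id, 0) + 1
print(copies)
```

Order 2 appears twice and order 7 three times, which matches the doubled and tripled TotalPay values for vendors 1496 and 1678.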
Answer 0 (score: 1)
You can divide your TotalDue SUM by COUNT, something like this. Only the TotalDue sum is divided: OrderQty is not duplicated by the join, so dividing it as well would turn vendor 1496's total of 6 into 3.
select PurchaseOrderHeader.VendorID,
SUM(CASE WHEN Datename(year,PurchaseOrderHeader.OrderDate) = 2011 THEN PurchaseOrderHeader.TotalDue else 0 END) / COUNT(*) as "TotalPay IN 2011",
SUM(CASE WHEN Datename(year,PurchaseOrderHeader.OrderDate) = 2011 THEN PurchaseOrderDetail.OrderQty else 0 END) as "TotalOrder IN 2011"
from Purchasing.PurchaseOrderHeader
left join Purchasing.PurchaseOrderDetail on PurchaseOrderHeader.PurchaseOrderID = PurchaseOrderDetail.PurchaseOrderID
group by PurchaseOrderHeader.VendorID
order by VendorID
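Dividing by COUNT(*) only undoes the duplication when every joined row for a vendor comes from the same order. The sketch below illustrates that limit under stated assumptions: SQLite replaces SQL Server, and the data is hypothetical (a vendor 1496 with two orders, one of which has two detail rows).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PurchaseOrderHeader(
  PurchaseOrderID INTEGER PRIMARY KEY, VendorID INTEGER, TotalDue REAL);
CREATE TABLE PurchaseOrderDetail(
  PurchaseOrderDetailID INTEGER PRIMARY KEY, PurchaseOrderID INTEGER, OrderQty INTEGER);
-- Hypothetical data: vendor 1496 has two orders; order 2 has two detail rows.
INSERT INTO PurchaseOrderHeader VALUES (2, 1496, 300.0), (20, 1496, 100.0);
INSERT INTO PurchaseOrderDetail VALUES (1, 2, 3), (2, 2, 3), (3, 20, 5);
""")

# SUM over the joined rows is 300 + 300 + 100, and COUNT(*) is 3.
divided = conn.execute("""
    SELECT SUM(h.TotalDue) / COUNT(*)
    FROM PurchaseOrderHeader h
    LEFT JOIN PurchaseOrderDetail d ON h.PurchaseOrderID = d.PurchaseOrderID
    WHERE h.VendorID = 1496
""").fetchone()[0]

# The real total, taken from the header table alone.
actual = conn.execute(
    "SELECT SUM(TotalDue) FROM PurchaseOrderHeader WHERE VendorID = 1496"
).fetchone()[0]

print("SUM / COUNT(*):", divided)
print("actual total:  ", actual)
```

With this data the divided figure is an average over the joined rows rather than the vendor's total, so the approach fits the sample data (one order per vendor) but not the general case.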
Answer 1 (score: 0)
select h.VendorID,
SUM(CASE WHEN Datename(year,h.OrderDate) = 2011 THEN h.TotalDue else 0 END) as "TotalPay IN 2011",
SUM(CASE WHEN Datename(year,h.OrderDate) = 2011 THEN d.OrderQty else 0 END) as "TotalOrder IN 2011"
from PurchaseOrderHeader h
left join (
select t.PurchaseOrderID,
sum(t.OrderQty) as OrderQty
from PurchaseOrderDetail t
group by t.PurchaseOrderID
) d on d.PurchaseOrderID = h.PurchaseOrderID
group by h.VendorID
order by VendorID
Answer 2 (score: 0)
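The default way to avoid double counting is to use SUM(DISTINCT expr).
This doesn't always work well, because you don't want to sum distinct values; you want to sum distinct rows, even when those rows share the same value.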
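To make the difference between distinct values and distinct rows concrete, here is a small sketch. It assumes SQLite in place of SQL Server and uses hypothetical data: two separate orders for one vendor that happen to share the same TotalDue.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PurchaseOrderHeader(
  PurchaseOrderID INTEGER PRIMARY KEY, VendorID INTEGER, TotalDue REAL);
CREATE TABLE PurchaseOrderDetail(
  PurchaseOrderDetailID INTEGER PRIMARY KEY, PurchaseOrderID INTEGER, OrderQty INTEGER);
-- Hypothetical data: orders 1 and 2 both belong to vendor 1 and both total 100.
INSERT INTO PurchaseOrderHeader VALUES (1, 1, 100.0), (2, 1, 100.0);
INSERT INTO PurchaseOrderDetail VALUES (1, 1, 5), (2, 1, 5), (3, 2, 5);
""")

plain, distinct = conn.execute("""
    SELECT SUM(h.TotalDue), SUM(DISTINCT h.TotalDue)
    FROM PurchaseOrderHeader h
    LEFT JOIN PurchaseOrderDetail d ON h.PurchaseOrderID = d.PurchaseOrderID
    WHERE h.VendorID = 1
""").fetchone()

# The vendor really owes 200 (two orders of 100 each).
print("SUM:         ", plain)     # over-counts: order 1 is joined twice
print("SUM DISTINCT:", distinct)  # under-counts: the two equal totals collapse
```

Neither figure equals the true 200: the plain SUM counts order 1 twice, while SUM(DISTINCT ...) merges the two orders into one value.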
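The solution is to use a subquery that sums the details per order ID, and then join to that result. Each order ID then has exactly one row to join to the order header row: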
SELECT PurchaseOrderHeader.VendorID,
SUM(PurchaseOrderHeader.TotalDue) AS "TotalPay IN 2011",
SUM(POD.Qty) AS "TotalOrder IN 2011"
FROM PurchaseOrderHeader
LEFT JOIN (
SELECT PurchaseOrderDetail.PurchaseOrderID, SUM(OrderQty) AS Qty
FROM PurchaseOrderDetail
GROUP BY PurchaseOrderDetail.PurchaseOrderID
) AS POD on PurchaseOrderHeader.PurchaseOrderID = POD.PurchaseOrderID
WHERE Datename(year,PurchaseOrderHeader.OrderDate) = 2011
GROUP BY PurchaseOrderHeader.VendorID
ORDER BY VendorID
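I also took the liberty of moving the year condition out of the CASE WHEN inside SUM() and into a WHERE clause. In this case that should give the same result with shorter code, although vendors with no orders in 2011 would be left out rather than listed with zeros.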
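As a rough check of the pre-aggregation approach, the sketch below loads the question's sample rows and runs an equivalent query. It assumes SQLite as a stand-in for SQL Server, so `strftime('%Y', ...)` replaces `Datename(year, ...)`; the short aliases `h` and `d` are also just illustrative.

```python
import sqlite3

headers = [  # (PurchaseOrderID, VendorID, OrderDate, TotalDue)
    (1, 1580, "2011-04-16 00:00:00.000", 222.1492),
    (2, 1496, "2011-04-16 00:00:00.000", 300.6721),
    (3, 1494, "2011-04-16 00:00:00.000", 9776.2665),
    (4, 1650, "2011-04-16 00:00:00.000", 189.0395),
    (5, 1654, "2011-04-30 00:00:00.000", 22539.0165),
    (6, 1664, "2011-04-30 00:00:00.000", 16164.0229),
    (7, 1678, "2011-04-30 00:00:00.000", 64847.5328),
]
details = [  # (PurchaseOrderID, PurchaseOrderDetailID, OrderQty, ProductID)
    (1, 1, 4, 1), (2, 2, 3, 359), (2, 3, 3, 360), (3, 4, 550, 530),
    (4, 5, 3, 4), (5, 6, 550, 512), (6, 7, 550, 513),
    (7, 8, 550, 317), (7, 9, 550, 318), (7, 10, 550, 319),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PurchaseOrderHeader(PurchaseOrderID INTEGER PRIMARY KEY,"
    " VendorID INTEGER, OrderDate TEXT, TotalDue REAL)"
)
conn.execute(
    "CREATE TABLE PurchaseOrderDetail(PurchaseOrderID INTEGER,"
    " PurchaseOrderDetailID INTEGER PRIMARY KEY, OrderQty INTEGER, ProductID INTEGER)"
)
conn.executemany("INSERT INTO PurchaseOrderHeader VALUES (?, ?, ?, ?)", headers)
conn.executemany("INSERT INTO PurchaseOrderDetail VALUES (?, ?, ?, ?)", details)

# Collapse the details to one row per order first, then join:
# each header row now matches at most one subquery row.
rows = conn.execute("""
    SELECT h.VendorID, SUM(h.TotalDue), SUM(d.Qty)
    FROM PurchaseOrderHeader h
    LEFT JOIN (
        SELECT PurchaseOrderID, SUM(OrderQty) AS Qty
        FROM PurchaseOrderDetail
        GROUP BY PurchaseOrderID
    ) d ON d.PurchaseOrderID = h.PurchaseOrderID
    WHERE strftime('%Y', h.OrderDate) = '2011'
    GROUP BY h.VendorID
    ORDER BY h.VendorID
""").fetchall()

for vendor_id, total_pay, total_qty in rows:
    print(vendor_id, total_pay, total_qty)
```

Because the join no longer multiplies header rows, the totals for vendors 1496 and 1678 come out as 300.6721 and 64847.5328, as the question expects.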
Answer 3 (score: 0)
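Lots of good answers, but I think they miss the case where a vendor has multiple purchase orders, which throws off how TotalOrder is calculated. (Try an example with multiple vendors that each have multiple orders, each with multiple details.) Don't forget to check for possible NULL values!
Here I use a subquery to calculate each vendor's TotalOrder for the year in question, and then join it to the list of all vendors. (Table aliases are also introduced for readability.)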
-- As a subquery
select
    hd.VendorID
    ,sum(case
            when year(hd.OrderDate) = 2011 then hd.TotalDue
            else 0
        end) as "TotalPay IN 2011"
    -- isnull covers vendors that have no detail rows in 2011
    ,isnull(subQuery.TotalOrderIn2011, 0) as "TotalOrder IN 2011"
from PurchaseOrderHeader hd
left join ( -- Calculate volume by vendor for 2011
    select
        hd.VendorID
        ,sum(dt.OrderQty) as TotalOrderIn2011
    from PurchaseOrderHeader hd
    inner join PurchaseOrderDetail dt
        on hd.PurchaseOrderID = dt.PurchaseOrderID
    where year(hd.OrderDate) = 2011
    group by
        hd.VendorID
) subQuery
    on subQuery.VendorID = hd.VendorID
-- the subquery column is selected without an aggregate, so it must be grouped too
group by hd.VendorID, subQuery.TotalOrderIn2011
order by hd.VendorID