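Get the combined difference (uncommon / "non-intersecting" records) from two queries

Time: 2013-09-18 21:51:55

Tags: sql

Database - Microsoft AdventureWorks

Table - Sales.SalesOrderHeader

Question - Which customers (i.e. CustomerID values) ordered something in March 2003 or in April 2003, but not in both?

Concept -

Get the blue parts, i.e. the elements/rows unique to A and the ones unique to B. Image credits - Jeff Atwood of CodingHorror.com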
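In set terms, the blue parts are the symmetric difference of A and B. A minimal sketch with Python sets, using made-up customer IDs (they are not taken from AdventureWorks):

```python
# Hypothetical customer IDs, purely for illustration.
march = {1, 2, 3}      # set A: customers with an order in March
april = {2, 3, 4, 5}   # set B: customers with an order in April

# Symmetric difference: in A or in B, but not in both.
only_one_month = march ^ april

# The same thing spelled out as (A union B) minus (A intersect B).
assert only_one_month == (march | april) - (march & april)

print(sorted(only_one_month))  # [1, 4, 5]
```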

My query

select Soh.CustomerID, Soh.OrderDate 
from Sales.SalesOrderHeader as Soh
where Soh.OrderDate >= '2003-03-01' AND Soh.OrderDate < '2003-04-01' -- march only

UNION

select Soh.CustomerID, Soh.OrderDate 
from Sales.SalesOrderHeader as Soh
where Soh.OrderDate >= '2003-04-01' AND Soh.OrderDate < '2003-05-01' -- april only    
order by Soh.OrderDate asc;

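My questions -

Did I answer the question correctly? Is there another way to solve this, preferably with less code?

Edit - Oops. This only returns the orders placed in those two months and does not answer the question. So I was wrong. Trying to fix it.

Thanks.

4 answers:

Answer 0: (score: 3)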

SELECT  Cust.CustomerID
FROM    Sales.Customer AS Cust
        INNER JOIN Sales.SalesOrderHeader AS Soh
            ON Cust.CustomerID = Soh.CustomerID
WHERE   Soh.OrderDate >= '2003-03-01' AND Soh.OrderDate < '2003-05-01'
GROUP   BY Cust.CustomerID
HAVING  COUNT(DISTINCT CASE WHEN MONTH(Soh.OrderDate) = 3 AND
                                YEAR(Soh.OrderDate) = 2003 THEN 1 ELSE 2 END) = 1

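Well, the SELECT, FROM, WHERE and GROUP BY are pretty self-explanatory. The tricky part here is the HAVING clause, so let me simplify it. The CASE expression you see produces a value that classifies each record into a group. I will use March and April instead of 1 and 2 to make it easier to follow.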

SELECT  CustomerID,
        CASE WHEN MONTH(OrderDate) = 3 AND YEAR(OrderDate) = 2003 
              THEN 'March' 
              ELSE 'April'
        END AS MonthBought
FROM    TableName
WHERE   OrderDate >= '2003-03-01' AND OrderDate < '2003-05-01'

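As this query shows, when the order date falls on any day in March 2003, the corresponding value of MonthBought is March; otherwise it is April, because the WHERE clause guarantees that every record lies in March or April 2003.

The HAVING clause then keeps only the customers whose number of distinct MonthBought values is 1, which means the customer bought in only one of the two months.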
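To see the HAVING filter at work without an AdventureWorks install, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. The Orders table and its rows are invented for illustration, and SQLite's strftime stands in for T-SQL's MONTH(), so treat it as an approximation of the query above rather than the query itself.

```python
import sqlite3

# Hypothetical orders, invented for illustration (not AdventureWorks data).
rows = [
    (1, "2003-03-05"), (1, "2003-03-20"), (1, "2003-04-02"),  # both months
    (2, "2003-03-10"),                                        # March only
    (3, "2003-04-15"), (3, "2003-04-28"),                     # April only, twice
    (4, "2003-07-01"),                                        # outside the range
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (CustomerID INTEGER NOT NULL, OrderDate TEXT NOT NULL)"
)
conn.executemany("INSERT INTO Orders VALUES (?, ?)", rows)

# SQLite has no MONTH()/YEAR(); strftime('%m', ...) stands in for MONTH().
# The WHERE clause already pins the year, so the CASE only checks the month.
query = """
SELECT CustomerID
FROM Orders
WHERE OrderDate >= '2003-03-01' AND OrderDate < '2003-05-01'
GROUP BY CustomerID
HAVING COUNT(DISTINCT CASE WHEN strftime('%m', OrderDate) = '03'
                           THEN 1 ELSE 2 END) = 1
ORDER BY CustomerID
"""
one_month_only = [r[0] for r in conn.execute(query)]
print(one_month_only)  # [2, 3]
```

Customer 1 produces two distinct CASE values (1 and 2) and is dropped; customer 3 ordered twice in April but still has a single distinct value, so it stays.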

Answer 1: (score: 1)

CREATE TABLE table_a ( id INTEGER NOT NULL PRIMARY KEY
   , OrderDate DATE NOT NULL DEFAULT '2003-03-15');
CREATE TABLE table_b ( id INTEGER NOT NULL PRIMARY KEY
   , OrderDate DATE NOT NULL DEFAULT '2003-04-15');

INSERT INTO table_a(id) VALUES (0),(2),(4),(6),(8),(10),(12),(14),(16),(18),(20);
INSERT INTO table_b(id) VALUES (0),(3),(6),(9),(12),(15),(18),(21);


SELECT COALESCE (a.id, b.id) AS id
FROM (
        SELECT DISTINCT id
        FROM table_a
        WHERE OrderDate >= '2003-03-01' AND OrderDate < '2003-04-01'
        ) a
FULL OUTER JOIN (
        SELECT DISTINCT id
        FROM table_b
        WHERE OrderDate >= '2003-04-01' AND OrderDate < '2003-05-01'
        )  b ON b.id = a.id
WHERE a.id IS NULL OR b.id IS NULL
        ;

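Note: I had to create my own data because the OP did not provide any, and I am too lazy to type it in.

UPDATE: the original UNION query (using the table_a / table_b construct here; for the original data model, use table_a = table_b = Sales.SalesOrderHeader):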

SELECT a.id, a.OrderDate
FROM table_a as a
WHERE a.OrderDate >= '2003-03-01' AND a.OrderDate < '2003-04-01' -- march only
AND NOT EXISTS (
        SELECT * FROM table_b nx
        WHERE nx.id = a.id
        AND nx.OrderDate >= '2003-04-01' AND nx.OrderDate < '2003-05-01' -- april only
        )
UNION ALL
SELECT b.id, b.OrderDate
FROM table_b as b
WHERE b.OrderDate >= '2003-04-01' AND b.OrderDate < '2003-05-01' -- april only    
AND NOT EXISTS (
        SELECT * FROM table_a nx
        WHERE nx.id = b.id
        AND nx.OrderDate >= '2003-03-01' AND nx.OrderDate < '2003-04-01' -- march only
        )
ORDER BY OrderDate ASC;

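Notes:

  • UNION should be UNION ALL: the two branches cover disjoint date ranges, so no row can appear in both and there is nothing to deduplicate across them
  • The NOT EXISTS () clauses are necessary: you want the March records that have no counterpart in April, and vice versa.
  • Needing a UNION often indicates a suboptimal data model (not in this case)
  • FULL OUTER JOIN can be seen as a special form of relational division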
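The NOT EXISTS / UNION ALL variant can be checked against this answer's own test data. The sketch below loads the same ids into an in-memory SQLite database from Python and selects only the id column; using SQLite and storing the dates as text are assumptions made so the example runs anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (id INTEGER NOT NULL PRIMARY KEY,
                      OrderDate TEXT NOT NULL DEFAULT '2003-03-15');
CREATE TABLE table_b (id INTEGER NOT NULL PRIMARY KEY,
                      OrderDate TEXT NOT NULL DEFAULT '2003-04-15');
""")
ids_a = list(range(0, 21, 2))   # 0, 2, 4, ..., 20 as in the answer
ids_b = list(range(0, 22, 3))   # 0, 3, 6, ..., 21 as in the answer
conn.executemany("INSERT INTO table_a (id) VALUES (?)", [(i,) for i in ids_a])
conn.executemany("INSERT INTO table_b (id) VALUES (?)", [(i,) for i in ids_b])

# March rows with no April counterpart, plus April rows with no March counterpart.
query = """
SELECT a.id AS id
FROM table_a AS a
WHERE a.OrderDate >= '2003-03-01' AND a.OrderDate < '2003-04-01'
  AND NOT EXISTS (SELECT 1 FROM table_b AS nx
                  WHERE nx.id = a.id
                    AND nx.OrderDate >= '2003-04-01' AND nx.OrderDate < '2003-05-01')
UNION ALL
SELECT b.id
FROM table_b AS b
WHERE b.OrderDate >= '2003-04-01' AND b.OrderDate < '2003-05-01'
  AND NOT EXISTS (SELECT 1 FROM table_a AS nx
                  WHERE nx.id = b.id
                    AND nx.OrderDate >= '2003-03-01' AND nx.OrderDate < '2003-04-01')
ORDER BY id
"""
result = [r[0] for r in conn.execute(query)]
print(result)  # [2, 3, 4, 8, 9, 10, 14, 15, 16, 20, 21]
```

The ids 0, 6, 12 and 18 exist in both tables and are the only ones missing from the output.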

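Answer 2: (score: 0)

No, you did not answer the question correctly. UNION gives you all the results of the first query (A), plus all the results of the second query that have not already been returned.

Nice graphic, though!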
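A small sketch of that behaviour, run against an in-memory SQLite database from Python. The table and its four rows are hypothetical, and only CustomerID is selected to keep the output short:

```python
import sqlite3

# Hypothetical rows: customer 1 ordered in both months.
rows = [(1, "2003-03-05"), (1, "2003-04-02"), (2, "2003-03-10"), (3, "2003-04-15")]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (CustomerID INTEGER NOT NULL, OrderDate TEXT NOT NULL)"
)
conn.executemany("INSERT INTO Orders VALUES (?, ?)", rows)

# UNION merges the March and April customers and removes duplicates,
# so a customer present in both months is still returned (once).
query = """
SELECT CustomerID FROM Orders
WHERE OrderDate >= '2003-03-01' AND OrderDate < '2003-04-01'
UNION
SELECT CustomerID FROM Orders
WHERE OrderDate >= '2003-04-01' AND OrderDate < '2003-05-01'
ORDER BY CustomerID
"""
union_result = [r[0] for r in conn.execute(query)]
print(union_result)  # [1, 2, 3]
```

Customer 1 is in the output even though the question asks to exclude customers who ordered in both months.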

Answer 3: (score: 0)

OK, I think I finally got it. Answer - 746 rows, with this query -

-- Customers who had an order on Mar or Apr, but not both
select Ord.CustomerID
from Sales.SalesOrderHeader as Ord
where (Ord.OrderDate >= '2003-03-01' AND Ord.OrderDate < '2003-04-01') -- all March
or (Ord.OrderDate >= '2003-04-01' AND Ord.OrderDate < '2003-05-01') -- all April

except 

select MarchAndApril.CustomerID
from
(
select Ord.CustomerID
from Sales.SalesOrderHeader as Ord
where (Ord.OrderDate >= '2003-03-01' AND Ord.OrderDate < '2003-04-01') -- March

intersect

select Ord.CustomerID
from Sales.SalesOrderHeader as Ord
where (Ord.OrderDate >= '2003-04-01' AND Ord.OrderDate < '2003-05-01') -- April
) as MarchAndApril

order by Ord.CustomerID

Here is a different sample data set that makes things simpler.

Table - Orders

Columns - CustomerID (PK, int, not null), OrderDate (date, not null). Orders exist only in Jan, Feb and Jul.

1   2012-01-01
1   2012-01-02
1   2012-02-01
2   2012-01-01
2   2012-02-01
3   2012-01-01
4   2012-02-01
5   2012-07-01

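New question - Get the customers who have an order in Jan or in Feb, but not in both.

Strategy - Get the customers who ordered in Jan or Feb. Then remove from that set the customers who have orders in both Jan and Feb.

We expect the result to be 3, 4. That is indeed the case.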

-- Customers who had an order on Jan or Feb, but not both
select Ord.CustomerID
from Orders as Ord
where (Ord.OrderDate >= '2012-01-01' AND Ord.OrderDate < '2012-02-01') -- all January
or (Ord.OrderDate >= '2012-02-01' AND Ord.OrderDate < '2012-03-01') -- all February

--We can replace this where + or by a UNION ??? I got the same results, ie 3,4

except 

select JanuaryAndFebruary.CustomerID
from
(
select Ord.CustomerID
from Orders as Ord
where (Ord.OrderDate >= '2012-01-01' AND Ord.OrderDate < '2012-02-01') -- January

intersect

select Ord.CustomerID
from Orders as Ord
where (Ord.OrderDate >= '2012-02-01' AND Ord.OrderDate < '2012-03-01') -- February
) as JanuaryAndFebruary

order by Ord.CustomerID
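The small data set makes it easy to check both the query and the question left in the comment (can the WHERE ... OR be replaced by a UNION?). The sketch below is a hedged check in an in-memory SQLite database from Python, not a run against SQL Server. Two assumptions: the table is created without a primary key, because the sample rows repeat CustomerID, and the ORDER BY uses the unqualified column name.

```python
import sqlite3

# The sample rows listed above.
rows = [
    (1, "2012-01-01"), (1, "2012-01-02"), (1, "2012-02-01"),
    (2, "2012-01-01"), (2, "2012-02-01"),
    (3, "2012-01-01"), (4, "2012-02-01"), (5, "2012-07-01"),
]
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (CustomerID INTEGER NOT NULL, OrderDate TEXT NOT NULL)"
)
conn.executemany("INSERT INTO Orders VALUES (?, ?)", rows)

jan = "OrderDate >= '2012-01-01' AND OrderDate < '2012-02-01'"
feb = "OrderDate >= '2012-02-01' AND OrderDate < '2012-03-01'"

# Subquery shared by both variants: customers with orders in both months.
# SQLite evaluates compound operators left to right, so the INTERSECT is
# kept in its own subquery to make the grouping explicit.
both = f"""
SELECT CustomerID FROM (
    SELECT CustomerID FROM Orders WHERE {jan}
    INTERSECT
    SELECT CustomerID FROM Orders WHERE {feb}
) AS JanuaryAndFebruary
"""

# Variant 1: WHERE ... OR ..., as in the answer.
with_or = f"""
SELECT CustomerID FROM Orders WHERE ({jan}) OR ({feb})
EXCEPT
{both}
ORDER BY CustomerID
"""

# Variant 2: the OR replaced by a UNION of the two months.
with_union = f"""
SELECT CustomerID FROM (
    SELECT CustomerID FROM Orders WHERE {jan}
    UNION
    SELECT CustomerID FROM Orders WHERE {feb}
) AS JanuaryOrFebruary
EXCEPT
{both}
ORDER BY CustomerID
"""

result_or = [r[0] for r in conn.execute(with_or)]
result_union = [r[0] for r in conn.execute(with_union)]
print(result_or, result_union)  # [3, 4] [3, 4]
```

On this data both variants agree, which matches the observation in the comment: EXCEPT already removes duplicates, so it does not matter whether the left-hand side is built with OR or with UNION.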