I have an orders table that I know contains duplicates:
customer   order_number order_date
---------- ------------ -------------------
1          1            2012-03-01 01:58:00
1          2            2012-03-01 02:01:00
1          3            2012-03-01 02:03:00
2          4            2012-03-01 02:15:00
3          5            2012-03-01 02:18:00
3          6            2012-03-01 04:30:00
4          7            2012-03-01 04:35:00
5          8            2012-03-01 04:38:00
6          9            2012-03-01 04:58:00
6          10           2012-03-01 04:59:00
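I want to find all duplicates, where a duplicate is an order placed by the same customer within 60 minutes of another order. The result set can be either the "duplicate" rows themselves or a list of customers together with their number of duplicates.
Here is what I tried: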
SELECT
    customer,
    count(*)
FROM
    orders
GROUP BY
    customer,
    DATEPART(HOUR, order_date)
HAVING (count(*) > 1)
This does not work when the duplicates are within 60 minutes of each other but fall in different clock hours, e.g. 1:58 and 2:02.
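To make the failure concrete, here is a minimal sketch using the two times mentioned above. The variable names @a and @b are only for illustration and are not part of the orders table.

```sql
-- Two orders only 4 minutes apart that straddle an hour boundary.
DECLARE @a DATETIME = '2012-03-01 01:58:00';
DECLARE @b DATETIME = '2012-03-01 02:02:00';

SELECT
    DATEPART(HOUR, @a)       AS hour_a,         -- 1
    DATEPART(HOUR, @b)       AS hour_b,         -- 2
    DATEDIFF(MINUTE, @a, @b) AS minutes_apart;  -- 4
```

Because the two rows get different DATEPART(HOUR, ...) values, GROUP BY puts them in separate groups and neither group reaches a count above 1.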
I also tried this:
SELECT
    o1.customer,
    o1.order_number,
    o2.order_number,
    DATEDIFF(MINUTE,o1.order_date, o2.order_date) AS [diff]
FROM
    orders o1 LEFT OUTER JOIN
    orders o2 ON o1.customer = o2.customer AND o1.order_number <> o2.order_number
WHERE
    ABS(DATEDIFF(MINUTE,o1.order_date, o2.order_date)) < 60
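This gives me all the duplicates, but it also returns multiple rows for each duplicate order, i.e. (o1, o2) and (o2, o1). That would not be so bad if no order had more than one duplicate. In those cases I get (o1, o2), (o1, o3), (o2, o1), (o2, o3), (o3, o1), (o3, o2), and so on: I get every permutation.
Does anyone have any insight? I am not necessarily looking for the best-performing answer here, just one that works.
Answer 0 (score: 3)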
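SELECT
    *,
    CASE WHEN EXISTS (SELECT *
                      FROM orders AS lookup
                      WHERE lookup.customer = orders.customer
                        AND lookup.order_date < orders.order_date
                        -- qualify the outer column; unqualified, it would resolve to lookup.order_date
                        AND lookup.order_date >= DATEADD(hour, -1, orders.order_date)
                     )
         THEN 'Duplicate Order'   -- an earlier order by the same customer exists within the previous hour
         ELSE 'Principal Order'
    END as Order_Status
FROM
    orders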
Using EXISTS with a correlated subquery, you can check whether the same customer placed any earlier order within the preceding hour.
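If the per-customer count is what you want rather than the individual rows, the same EXISTS check can be moved into a WHERE clause and grouped. This is a minimal sketch, assuming the orders(customer, order_number, order_date) table from the question. The aliases o and prev and the column name duplicate_orders are made up for illustration.

```sql
-- Count, per customer, the orders that have an earlier order
-- from the same customer within the preceding hour.
SELECT
    o.customer,
    COUNT(*) AS duplicate_orders
FROM
    orders AS o
WHERE EXISTS (SELECT *
              FROM orders AS prev
              WHERE prev.customer = o.customer
                AND prev.order_date < o.order_date
                AND prev.order_date >= DATEADD(HOUR, -1, o.order_date))
GROUP BY
    o.customer;
```

With the sample data in the question this should return customer 1 with 2 duplicates (orders 2 and 3) and customer 6 with 1 (order 10). Note that the strict `<` comparison means two orders with exactly the same timestamp would not flag each other. If that can happen, a tie-breaker on order_number would be needed.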
Answer 1 (score: 1)
Maybe something like this:
Test data:
DECLARE @tbl TABLE(customer INT,order_number INT,order_date DATETIME)
INSERT INTO @tbl
VALUES
(1,1,'2012-03-01 01:58:00'),
(1,2,'2012-03-01 02:01:00'),
(1,3,'2012-03-01 02:03:00'),
(2,4,'2012-03-01 02:15:00'),
(3,5,'2012-03-01 02:18:00'),
(3,6,'2012-03-01 04:30:00'),
(4,7,'2012-03-01 04:35:00'),
(5,8,'2012-03-01 04:38:00'),
(6,9,'2012-03-01 04:58:00'),
(6,10,'2012-03-01 04:59:00')
Query:
;WITH CTE
AS
(
    SELECT
        MIN(datediff(minute,'1990-1-1',order_date)) OVER(PARTITION BY customer) AS minDate,
        datediff(minute,'1990-1-1',order_date) AS DateTicks,
        tbl.customer
    FROM
        @tbl AS tbl
)
SELECT
    CTE.customer,
    -- orders placed within 60 minutes of the customer's first order (the first order itself is counted)
    SUM(CASE WHEN (CTE.DateTicks-CTE.minDate)<60 THEN 1 ELSE 0 END) AS orders_within_first_hour
FROM
    CTE
GROUP BY
    CTE.customer
Answer 2 (score: 1)
The following query finds every permutation of orders placed within 60 minutes of each other:
DECLARE @orders TABLE (CustomerId INT, OrderId INT, OrderDate DATETIME)
INSERT INTO @orders
VALUES
(1, 1, '2012-03-01 01:58:00'),
(1, 2, '2012-03-01 02:01:00'),
(1, 3, '2012-03-01 02:03:00'),
(2, 4, '2012-03-01 02:15:00'),
(3, 5, '2012-03-01 02:18:00'),
(3, 6, '2012-03-01 04:30:00'),
(4, 7, '2012-03-01 04:35:00'),
(5, 8, '2012-03-01 04:38:00'),
(6, 9, '2012-03-01 04:58:00'),
(6, 10, '2012-03-01 04:59:00');
with ProximityOrderCascade(CustomerId, OrderId, ProximateOrderId, MinutesDifference, OrderDate, ProximateOrderDate)
as
(
    select o.customerid, o.orderid, null, null, o.orderdate, o.orderdate
    from @orders o
    union all
    select o.customerid, o.orderid, p.orderid, datediff(minute, p.OrderDate, o.OrderDate), o.OrderDate, p.OrderDate
    from ProximityOrderCascade p
        inner join @orders o
            on p.customerid = o.customerid
            and abs(datediff(minute, p.OrderDate, o.OrderDate)) between 0 and 60
            and o.orderid <> p.orderid
    where proximateorderid is null
)
select * from ProximityOrderCascade
where
    not ProximateOrderId is null
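From there, you can shape the results into whatever query you need. The output of this query identifies only customers 1 and 6 as having "duplicate" orders.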
CustomerId  OrderId     ProximateOrderId MinutesDifference OrderDate               ProximateOrderDate
----------- ----------- ---------------- ----------------- ----------------------- -----------------------
6           9           10               -1                2012-03-01 04:58:00.000 2012-03-01 04:59:00.000
6           10          9                1                 2012-03-01 04:59:00.000 2012-03-01 04:58:00.000
1           1           3                -5                2012-03-01 01:58:00.000 2012-03-01 02:03:00.000
1           2           3                -2                2012-03-01 02:01:00.000 2012-03-01 02:03:00.000
1           1           2                -3                2012-03-01 01:58:00.000 2012-03-01 02:01:00.000
1           3           2                2                 2012-03-01 02:03:00.000 2012-03-01 02:01:00.000
1           2           1                3                 2012-03-01 02:01:00.000 2012-03-01 01:58:00.000
1           3           1                5                 2012-03-01 02:03:00.000 2012-03-01 01:58:00.000
(8 row(s) affected)
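If the mirrored rows, such as (9, 10) and (10, 9), are unwanted, one possible way to keep each pair only once is to join on OrderId < OrderId instead of <>. The following is a sketch rather than part of the query above: it uses a plain self-join, assumes the @orders table variable declared earlier is in the same batch, and the aliases o1 and o2 are illustrative.

```sql
-- Each pair of proximate orders is returned once, lower OrderId first.
SELECT
    o1.CustomerId,
    o1.OrderId AS OrderId,
    o2.OrderId AS ProximateOrderId,
    DATEDIFF(MINUTE, o1.OrderDate, o2.OrderDate) AS MinutesDifference
FROM
    @orders AS o1
    INNER JOIN @orders AS o2
        ON  o1.CustomerId = o2.CustomerId
        AND o1.OrderId < o2.OrderId   -- keeps (1, 2) but drops the mirrored (2, 1)
WHERE
    ABS(DATEDIFF(MINUTE, o1.OrderDate, o2.OrderDate)) <= 60;
```

For the sample data this should leave four rows: (1, 2), (1, 3), (2, 3) for customer 1 and (9, 10) for customer 6. The `<= 60` bound mirrors the `between 0 and 60` used above. Change it to `< 60` if orders exactly 60 minutes apart should not count as duplicates.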