I am using a SQL Server database that contains tables x and y and a mapping table xy.
Table: x
| x_id | date                    | text |
-----------------------------------------
| 1    | 2017-02-22 20:40:30.617 | txt1 |
| 2    | 2017-02-22 20:40:06.103 | txt1 |
| 3    | 2017-02-22 20:28:21.393 | txt2 |
Table: xy
| x_id | y_id |
---------------
| 1    | 3    |
| 1    | 10   |
| 2    | 3    |
| 2    | 10   |
| 3    | 5    |
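Table x has the columns x_id, date and text, and the mapping table xy has the columns x_id and y_id. I need a query that finds which records of x are duplicates. A record of x can be considered a duplicate when all of the following conditions are met:
I was able to write a query that satisfies the first two conditions (although it returns duplicated data). However, I cannot write a query that also satisfies the third condition and returns distinct data when doing the self-join.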
Answer 0 (score: 0)
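Here is another sample. If a record of x has no related rows in xy, should condition 3 be ignored? This example ignores condition 3 in that case.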
-- Sample data: rows 4 and 5 have no related rows in @xy
DECLARE @x TABLE (x_id int, [date] datetime, [text] varchar(10));
INSERT INTO @x VALUES
  (1, '2017-02-22 20:40:30.617', 'txt1')
 ,(2, '2017-02-22 20:40:06.103', 'txt1')
 ,(3, '2017-02-22 20:28:21.393', 'txt2')
 ,(4, '2017-02-22 20:28:21.393', 'txt3')
 ,(5, '2017-02-22 20:28:21.394', 'txt3');
DECLARE @xy TABLE (x_id int, y_id int);
INSERT INTO @xy VALUES
  (1, 3)
 ,(1, 10)
 ,(2, 3)
 ,(2, 10)
 ,(3, 5);
SELECT x.*, xy.*
FROM @x AS x
-- Self-join: a different row with the same text and a date within 5 minutes
INNER JOIN @x AS ox
    ON x.x_id != ox.x_id
   AND x.[text] = ox.[text]
   AND ABS(DATEDIFF(MINUTE, x.[date], ox.[date])) <= 5
-- Condition 3: compare the y_id sets of both rows with a FULL JOIN
OUTER APPLY (
    SELECT COUNT(0) AS totaly,
           -- a NULL on either side marks a y_id that only one of the two rows has
           SUM(CASE WHEN xy1.y_id + xy2.y_id IS NULL THEN 1 ELSE 0 END) AS NULLROW
    FROM (SELECT y_id FROM @xy WHERE x_id = x.x_id) AS xy1
    FULL JOIN (SELECT y_id FROM @xy WHERE x_id = ox.x_id) AS xy2
        ON xy1.y_id = xy2.y_id
) AS xy
-- Keep pairs whose y_id sets match, or pairs where neither row has any y_id
WHERE (xy.totaly > 0 AND xy.NULLROW = 0) OR (xy.totaly = 0);
x_id        date                    text       totaly      NULLROW
----------- ----------------------- ---------- ----------- -----------
1           2017-02-22 20:40:30.617 txt1       2           0
2           2017-02-22 20:40:06.103 txt1       2           0
4           2017-02-22 20:28:21.393 txt3       0           NULL
5           2017-02-22 20:28:21.393 txt3       0           NULL
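If you would rather get each duplicate pair only once, the same idea can probably be written with EXCEPT instead of the FULL JOIN. This is only a sketch under assumptions: the "same text" and "within 5 minutes" rules are taken from the query above rather than from the question, the aliases a, b and m and the column name duplicate_of are made up for illustration, and two rows that both have no y_id are still treated as matching.

```sql
-- Same sample data as in the answer above
DECLARE @x TABLE (x_id int, [date] datetime, [text] varchar(10));
INSERT INTO @x VALUES
  (1, '2017-02-22 20:40:30.617', 'txt1')
 ,(2, '2017-02-22 20:40:06.103', 'txt1')
 ,(3, '2017-02-22 20:28:21.393', 'txt2')
 ,(4, '2017-02-22 20:28:21.393', 'txt3')
 ,(5, '2017-02-22 20:28:21.394', 'txt3');
DECLARE @xy TABLE (x_id int, y_id int);
INSERT INTO @xy VALUES
  (1, 3), (1, 10), (2, 3), (2, 10), (3, 5);

SELECT a.x_id, b.x_id AS duplicate_of   -- duplicate_of is a hypothetical column name
FROM @x AS a
INNER JOIN @x AS b
    ON a.x_id < b.x_id                  -- "<" instead of "!=" lists each pair once
   AND a.[text] = b.[text]
   AND ABS(DATEDIFF(MINUTE, a.[date], b.[date])) <= 5   -- assumed 5-minute rule
WHERE NOT EXISTS (                      -- no y_id that a has and b lacks
        SELECT m.y_id FROM @xy AS m WHERE m.x_id = a.x_id
        EXCEPT
        SELECT m.y_id FROM @xy AS m WHERE m.x_id = b.x_id)
  AND NOT EXISTS (                      -- no y_id that b has and a lacks
        SELECT m.y_id FROM @xy AS m WHERE m.x_id = b.x_id
        EXCEPT
        SELECT m.y_id FROM @xy AS m WHERE m.x_id = a.x_id)
ORDER BY a.x_id;
```

With the sample rows this should list the pairs (1, 2) and (4, 5). If rows without any y_id must not count as duplicates, an extra EXISTS check against @xy for a.x_id would be needed.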