SQL Server query to find duplicates using a join query

Time: 2017-02-22 20:54:49

Tags: sql-server

I am using a SQL Server database that contains a table x and a mapping table xy.

Table: x

x_id             date               text
-------------------------------------------
| 1  |  2017-02-22 20:40:30.617  |    txt1   |
| 2  |  2017-02-22 20:40:06.103  |    txt1   |
| 3  |  2017-02-22 20:28:21.393  |    txt2   |

Table: XY

x_id   y_id 
-----------
| 1  |  3  
| 1  |  10  
| 2  |  3  
| 2  |  10
| 3  |  5  

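I have a table x with the columns x_id, date, and text, and a mapping table xy with the columns x_id and y_id. I need a query that finds which records in x are duplicates. Two records in x are considered duplicates when all of the following conditions hold:

  1. Both have the same text.
  2. Their dates are within 5 minutes of each other.
  3. Both have the same y_id values (in the XY mapping table).

I was able to write a query that satisfies the first two conditions (although it returns duplicated rows). However, I could not write a query that also satisfies the third condition and returns distinct rows when performing the self-join.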
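The asker's query for the first two conditions is not shown, so the following is only a minimal sketch of what such a self-join might look like. The table variable @x stands in for table x, and the aliases a and b are illustrative names, not taken from the question. Joining on a.x_id < b.x_id is one way to avoid the mirrored rows (1, 2) and (2, 1) that a plain inequality would produce.

```sql
-- Sample data copied from the question; @x stands in for table x.
DECLARE @x TABLE (x_id int, [date] datetime, [text] varchar(10));
INSERT INTO @x VALUES
  (1, '2017-02-22 20:40:30.617', 'txt1'),
  (2, '2017-02-22 20:40:06.103', 'txt1'),
  (3, '2017-02-22 20:28:21.393', 'txt2');

-- Each candidate pair is reported once because of a.x_id < b.x_id.
SELECT a.x_id AS x_id_a, b.x_id AS x_id_b, a.[text]
FROM @x AS a
INNER JOIN @x AS b
    ON  a.x_id < b.x_id
    AND a.[text] = b.[text]                                -- condition 1
    AND ABS(DATEDIFF(SECOND, a.[date], b.[date])) <= 300;  -- condition 2
```

With the sample data this should return only the pair (1, 2), whose timestamps are 24 seconds apart. Condition 3 is not checked here.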

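1 Answer:

Answer 0: (score: 0)

Here is another sample. If a row in x has no related data in XY, should condition 3 be ignored? This example ignores condition 3 in that case.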

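    -- Sample data; rows 4 and 5 have no related rows in @xy.
    DECLARE @x TABLE (x_id int, [date] datetime, [text] varchar(10));
    INSERT INTO @x VALUES
     ( 1,'2017-02-22 20:40:30.617','txt1')
    ,( 2,'2017-02-22 20:40:06.103','txt1')
    ,( 3,'2017-02-22 20:28:21.393','txt2')
    ,( 4,'2017-02-22 20:28:21.393','txt3')
    ,( 5,'2017-02-22 20:28:21.394','txt3');

    DECLARE @xy TABLE (x_id int, y_id int);
    INSERT INTO @xy VALUES
     ( 1,3 )
    ,( 1,10)
    ,( 2,3 )
    ,( 2,10)
    ,( 3,5 );

    SELECT x.*, xy.*
    FROM @x AS x
    INNER JOIN @x AS ox
        ON  x.x_id <> ox.x_id
        AND x.[text] = ox.[text]
        -- DATEDIFF(MINUTE, ...) counts minute boundaries crossed and would accept
        -- gaps of up to 5:59, so the comparison is done in seconds instead.
        AND ABS(DATEDIFF(SECOND, x.[date], ox.[date])) <= 300
    OUTER APPLY (
        -- Full join of the two y_id sets; a NULL on either side marks a y_id
        -- that only one of the two rows has.
        SELECT COUNT(0) AS totaly,
               SUM(CASE WHEN xy1.y_id + xy2.y_id IS NULL THEN 1 ELSE 0 END) AS NULLROW
        FROM (SELECT y_id FROM @xy WHERE x_id = x.x_id) AS xy1
        FULL JOIN (SELECT y_id FROM @xy WHERE x_id = ox.x_id) AS xy2
            ON xy1.y_id = xy2.y_id
    ) AS xy
    -- Keep pairs whose y_id sets match exactly, or where neither row has any y_id.
    WHERE (xy.totaly > 0 AND xy.NULLROW = 0) OR (xy.totaly = 0);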

Result:

    x_id        date                    text       totaly      NULLROW
    ----------- ----------------------- ---------- ----------- -----------
    1           2017-02-22 20:40:30.617 txt1       2           0
    2           2017-02-22 20:40:06.103 txt1       2           0
    4           2017-02-22 20:28:21.393 txt3       0           NULL
    5           2017-02-22 20:28:21.393 txt3       0           NULL
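
As an alternative that is not part of the original answer, the set comparison for condition 3 could also be expressed with EXCEPT: two rows have the same y_id set when neither set contains a value the other lacks. The sketch below reuses the answer's sample data. The aliases a and b and the output column names are illustrative assumptions. Like the answer, it treats two rows with no y_id values at all as matching.

```sql
DECLARE @x TABLE (x_id int, [date] datetime, [text] varchar(10));
INSERT INTO @x VALUES
  (1, '2017-02-22 20:40:30.617', 'txt1'),
  (2, '2017-02-22 20:40:06.103', 'txt1'),
  (3, '2017-02-22 20:28:21.393', 'txt2'),
  (4, '2017-02-22 20:28:21.393', 'txt3'),
  (5, '2017-02-22 20:28:21.394', 'txt3');

DECLARE @xy TABLE (x_id int, y_id int);
INSERT INTO @xy VALUES (1, 3), (1, 10), (2, 3), (2, 10), (3, 5);

SELECT a.x_id AS x_id_a, b.x_id AS x_id_b, a.[text]
FROM @x AS a
INNER JOIN @x AS b
    ON  a.x_id < b.x_id                                    -- report each pair once
    AND a.[text] = b.[text]                                -- condition 1
    AND ABS(DATEDIFF(SECOND, a.[date], b.[date])) <= 300   -- condition 2
WHERE NOT EXISTS (   -- no y_id that a has and b lacks
        SELECT y_id FROM @xy WHERE x_id = a.x_id
        EXCEPT
        SELECT y_id FROM @xy WHERE x_id = b.x_id)
  AND NOT EXISTS (   -- no y_id that b has and a lacks
        SELECT y_id FROM @xy WHERE x_id = b.x_id
        EXCEPT
        SELECT y_id FROM @xy WHERE x_id = a.x_id);
```

With this data the query should return the pairs (1, 2) and (4, 5). To require that both rows have at least one y_id, an additional EXISTS check against @xy could be added to the WHERE clause.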