How to convert multiple self-join queries into a recursive CTE

Time: 2015-03-19 20:50:16

Tags: sql join recursion inner-join common-table-expression

I'm hoping someone can explain this to me very slowly, because after everything I've read I still can't work out how to do what I want.

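I have a table with an IP column and an email column. I am given an email and need to look up the corresponding IPs, but it doesn't end there: I then need to look up the emails for those new IPs, then the IPs for those emails, and so on, until there are no more new emails or IPs.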
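The expansion described above is a fixed-point search: alternate between emails and IPs until a round adds nothing new. Below is a minimal Python sketch of that idea, assuming the rows have already been loaded as (ip, email) pairs. The function name `expand_links` and the in-memory representation are made up for illustration, and the date filter from the question is left out for brevity.

```python
from collections import deque


def expand_links(rows, seed_email):
    """Return every (ip, email) row reachable from seed_email.

    rows is an iterable of (ip, email) tuples; rows whose ip is None are skipped.
    """
    by_email, by_ip = {}, {}
    for ip, email in rows:
        if ip is None:
            continue
        by_email.setdefault(email, set()).add(ip)
        by_ip.setdefault(ip, set()).add(email)

    seen_emails, seen_ips = {seed_email}, set()
    queue = deque([seed_email])
    while queue:
        email = queue.popleft()
        for ip in by_email.get(email, ()):
            if ip in seen_ips:
                continue  # this IP was already expanded
            seen_ips.add(ip)
            for other in by_ip[ip]:
                if other not in seen_emails:
                    seen_emails.add(other)
                    queue.append(other)  # new email: expand its IPs later

    # Every row that belongs to one of the discovered emails.
    return sorted(
        (ip, email) for email in seen_emails for ip in by_email.get(email, ())
    )
```

Because each email and each IP is expanded at most once, the loop stops on its own when nothing new turns up, which is the stopping condition the recursive query has to reproduce.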

I can do it like this:

SELECT DISTINCT t.CUSTOMER_IP, t.CUSTOMER_EMAIL
FROM [Main table] t
    INNER JOIN [Main table] t1 ON (t.CUSTOMER_IP = t1.CUSTOMER_IP)
    INNER JOIN [Main table] t2 ON (t1.CUSTOMER_EMAIL = t2.CUSTOMER_EMAIL)
    INNER JOIN [Main table] t3 ON (t2.CUSTOMER_IP = t3.CUSTOMER_IP)
WHERE t3.CUSTOMER_EMAIL = 'ejskslsks@gmail.com'
    AND t1.CUSTOMER_IP IS NOT NULL
    AND t2.CUSTOMER_IP IS NOT NULL
    AND t3.CUSTOMER_IP IS NOT NULL
    AND t.ISSUE_DATE BETWEEN '2015-02-23 00:00:00' AND '2015-02-23 23:59:59'

So far so good, except that this limits how far my search goes. I need to build some kind of recursive query, such as:

WITH iptable AS
(
    SELECT DISTINCT CUSTOMER_IP, CUSTOMER_EMAIL, 1 AS loopy
    FROM [Main table]
    WHERE CUSTOMER_EMAIL = 'ejskslsks@gmail.com'
        AND ISSUE_DATE BETWEEN '2015-02-23 00:00:00' AND '2015-02-23 23:59:59'
        AND CUSTOMER_IP IS NOT NULL
    UNION ALL
    SELECT t.CUSTOMER_IP, t.CUSTOMER_EMAIL, iptable.loopy + 1 AS loopy
    FROM [Main table] t
        INNER JOIN iptable ON (iptable.CUSTOMER_IP = t.CUSTOMER_IP)
        INNER JOIN [Main table] t1 ON (t.CUSTOMER_EMAIL = t1.CUSTOMER_EMAIL)
    WHERE t.ISSUE_DATE BETWEEN '2015-02-23 00:00:00' AND '2015-02-23 23:59:59'
        AND iptable.loopy < 2
)
SELECT DISTINCT CUSTOMER_IP, CUSTOMER_EMAIL FROM iptable

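The loopy column is only there so I can control the number of iterations, just 2 in this example. This only gives me the new emails, because it does not go on to look for new IPs from those emails.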

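I don't know how to solve this; I am a SQL beginner. Do I need to provide any other information? Maybe a CTE is not the best approach? I have considered a WHILE loop, but then I would have to use temporary tables, and I'd like to avoid those if possible.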
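For comparison, here is a rough sketch of what the WHILE-plus-temporary-table alternative could look like. It uses Python's `sqlite3` module rather than SQL Server so that it can be run as-is; the table name `main_table`, the temp table `found`, and the function name are all assumptions made for the example, and the date filter is omitted. The T-SQL equivalent would differ in syntax (for instance `#found` and `@@ROWCOUNT`).

```python
import sqlite3


def linked_rows_iterative(rows, seed_email):
    """Grow a temp table of linked (ip, email) rows until a pass adds nothing."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE main_table (customer_ip TEXT, customer_email TEXT)")
    conn.executemany("INSERT INTO main_table VALUES (?, ?)", rows)

    # Working table; the primary key lets INSERT OR IGNORE skip duplicates.
    conn.execute(
        "CREATE TEMP TABLE found (customer_ip TEXT, customer_email TEXT, "
        "PRIMARY KEY (customer_ip, customer_email))"
    )
    conn.execute(
        "INSERT OR IGNORE INTO found "
        "SELECT DISTINCT customer_ip, customer_email FROM main_table "
        "WHERE customer_email = ? AND customer_ip IS NOT NULL",
        (seed_email,),
    )

    def count_found():
        return conn.execute("SELECT COUNT(*) FROM found").fetchone()[0]

    before = -1
    while count_found() != before:  # stop once a pass adds no rows
        before = count_found()
        conn.execute(
            "INSERT OR IGNORE INTO found "
            "SELECT DISTINCT t.customer_ip, t.customer_email "
            "FROM main_table AS t JOIN found AS f "
            "ON t.customer_ip = f.customer_ip "
            "OR t.customer_email = f.customer_email "
            "WHERE t.customer_ip IS NOT NULL"
        )

    result = conn.execute(
        "SELECT customer_ip, customer_email FROM found "
        "ORDER BY customer_ip, customer_email"
    ).fetchall()
    conn.close()
    return result
```

The loop needs no iteration counter like loopy: it ends when the row count of the working table stops changing.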

Thanks in advance!

1 answer:

Answer 0: (score: 1)

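In the select beneath the UNION ALL, you need a way to carry the parent record along, otherwise you will end up with large intermediate result sets (which hurts performance). This is definitely not where SQL is at its best.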

Also, you can join on multiple conditions (CUSTOMER_EMAIL or CUSTOMER_IP), which avoids joining the same table several times.

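I like the select statement that uses the CTE to be the one that restricts the results, and to keep only the relationship-specific logic inside the CTE. Recursive CTEs are slow enough to get going as it is. Maybe something like this: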

with iptable (ROOT_EMAIL, CUSTOMER_IP, CUSTOMER_EMAIL, ISSUE_DATE, DEPTH )as
(
    select top (1)
        parent.CUSTOMER_EMAIL ROOT_EMAIL,
        parent.CUSTOMER_IP,
        parent.CUSTOMER_EMAIL,
        parent.ISSUE_DATE, 
        1 as DEPTH
    from
        [Main table] parent
    where
        parent.CUSTOMER_IP is not null
    order by 
        parent.ISSUE_DATE

    union all

    select
        parent.ROOT_EMAIL,
        child.CUSTOMER_IP,
        child.CUSTOMER_EMAIL,
        child.ISSUE_DATE,
        parent.DEPTH + 1
    from
        [Main table] child
        inner join
        iptable parent
        on
            (
                child.CUSTOMER_EMAIL= parent.CUSTOMER_EMAIL
                or
                child.CUSTOMER_IP = parent.CUSTOMER_IP
            )
            and
            child.ISSUE_DATE > parent.ISSUE_DATE
)
select distinct
    CUSTOMER_EMAIL,
    CUSTOMER_IP
from 
    iptable
where
    ROOT_EMAIL ='ejskslsks@gmail.com'
    and
    ISSUE_DATE between '2015-02-23 00:00:00' and '2015-02-23 23:59:59'

I am assuming you expect that a given email may have multiple IPs, and that a given IP may have multiple emails.
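To see the "join on either column" idea run end to end on many-to-many data, here is a small self-contained sketch using Python's `sqlite3` module instead of SQL Server. The table name `main_table`, the function name, and the sample addresses are assumptions for illustration, and the ISSUE_DATE conditions are left out. Note one deliberate difference: SQLite allows UNION (not only UNION ALL) in a recursive CTE, which discards rows already produced and so ends the recursion without a depth counter or a date comparison. As far as I know SQL Server only accepts UNION ALL there, so this is not a drop-in replacement for the query above.

```python
import sqlite3

LINKED_QUERY = """
WITH RECURSIVE linked(customer_ip, customer_email) AS (
    -- anchor: rows for the starting email
    SELECT customer_ip, customer_email
    FROM main_table
    WHERE customer_email = ? AND customer_ip IS NOT NULL
    UNION
    -- recursive step: any row sharing an IP or an email with a known row
    SELECT child.customer_ip, child.customer_email
    FROM main_table AS child
    JOIN linked AS parent
      ON child.customer_ip = parent.customer_ip
      OR child.customer_email = parent.customer_email
    WHERE child.customer_ip IS NOT NULL
)
SELECT customer_ip, customer_email
FROM linked
ORDER BY customer_ip, customer_email
"""


def linked_rows(rows, seed_email):
    """Load rows into an in-memory table and run the recursive query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE main_table (customer_ip TEXT, customer_email TEXT)")
    conn.executemany("INSERT INTO main_table VALUES (?, ?)", rows)
    result = conn.execute(LINKED_QUERY, (seed_email,)).fetchall()
    conn.close()
    return result
```

A quick usage example: with rows linking a@example.com and b@example.com through one IP, and b@example.com and c@example.com through another, `linked_rows(rows, "a@example.com")` returns the rows for all three emails, while unrelated rows stay out.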