Question

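This query takes too long, so I am trying to optimize it. Do you have any ideas or suggestions?

I tried full-text search inside a procedure with a while loop, and it got even worse (dbo.url has more than 100,000 rows; only 1,000 of them have status = 'tocheck').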

select tocheck.*
from dbo.url tocheck inner join dbo.url done 
on tocheck.id != done.id 
and tocheck.url like done.url+'%' 
and done.status in ('tocheck','todo','done') 
where tocheck.status = 'tocheck'

Edit:

I call a web service many times with different URLs. A URL looks like http://ws.com/query?p1=a&p2=b (url1).

If I have already called the URL http://ws.com/query?p1=a (url2), I do not want to call url1, because:

url1 like url2+'%'
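
A quick way to check this rule in isolation, using the two sample URLs above. The variable names and the `decision` column alias are made up for the illustration:

```sql
-- Illustration only: test the prefix rule on the two sample URLs.
declare @url1 varchar(200) = 'http://ws.com/query?p1=a&p2=b';
declare @url2 varchar(200) = 'http://ws.com/query?p1=a';

-- url2 is a prefix of url1, so url1 does not need to be called again.
select case when @url1 like @url2 + '%'
            then 'skip url1'
            else 'call url1'
       end as decision;
```

Keep in mind that `_`, `%` and `[` inside the stored URL are treated as wildcards by `like`, so a URL containing them can match more than a strict prefix comparison would.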

Thanks for your help.

Edit 2:

I added a column suburl that holds the 'query?p1=a' part of each URL, and changed the query:

select tocheck.*
from dbo.url tocheck inner join dbo.url done 
on tocheck.id != done.id 
and tocheck.suburl = done.suburl --NEW
and tocheck.url like done.url+'%' 
and done.status in ('tocheck','todo','done') 
where tocheck.status = 'tocheck'

It is now more than 10 times faster... :P !!
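
A minimal sketch of how such a column might be populated and indexed. The column size, the index name, and the rule "everything up to the first `&`" are assumptions for illustration and are not taken from the actual schema. In particular, this rule keeps the host part as well, which differs slightly from the 'query?p1=a' value described above:

```sql
-- Assumption: 400 characters is long enough for the derived prefix.
alter table dbo.url add suburl varchar(400) null;
go  -- batch separator (SSMS / sqlcmd): the new column must exist before the update is compiled

-- Assumption: the prefix to compare on is everything before the first '&',
-- or the whole URL when it has no '&'.
update dbo.url
set suburl = case when charindex('&', url) > 0
                  then left(url, charindex('&', url) - 1)
                  else url
             end;
go

-- Hypothetical index name; lets the join seek on suburl and filter on status.
create index IX_url_suburl_status on dbo.url (suburl, status);
```

The equality on suburl gives the optimizer something it can seek on, so the `like` comparison only runs within each small group of URLs sharing the same prefix. The trade-off is that a stored URL shorter than the suburl rule (for example one with no query string at all) will no longer match longer URLs.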

Answer 1

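I think there is a lot of overhead because the table is not joined to itself on an ID: this self-join is effectively a Cartesian product that only excludes the pairs with the same id.

I would suggest trying a subquery. The outer query then returns only the 1,000 (as you mentioned) 'tocheck' rows, and the subquery additionally excludes the URLs that start with the same characters: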

select
   tocheck.*
from
   dbo.url tocheck
where
   tocheck.status = 'tocheck'
and
    tocheck.id not in (
        select
            done.id
        from
            dbo.url done
        where
            tocheck.url like done.url+'%'
        and
            done.status in ('tocheck','todo','done')
    )
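
One thing to watch with the query above: a 'tocheck' row normally matches itself in the subquery (its own url is a prefix of itself and its status is in the list), so `not in` would filter out every row. A hedged sketch of a variant that keeps the `id != id` condition from the question and uses `not exists` instead; it has not been run against the real table:

```sql
-- Sketch only: returns the 'tocheck' rows that are NOT covered by another, shorter URL.
select
    tocheck.*
from
    dbo.url tocheck
where
    tocheck.status = 'tocheck'
and not exists (
        select 1
        from dbo.url done
        where done.id != tocheck.id                    -- skip the row itself
          and done.status in ('tocheck','todo','done')
          and tocheck.url like done.url + '%'          -- done.url is a prefix of tocheck.url
    )
```

Replacing `not exists` with `exists` returns the opposite set, that is, the rows the original join in the question was returning (without duplicates when several prefixes match).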

sql server 2012: how to optimize this like % query
