Performance issue when retrieving multiple outputs of the same column with different values from a single table

Asked: 2012-11-08 00:16:38

Tags: sql sql-server performance tsql

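Is there any way to get the following result from a query without joining the same table three times, that is, without reading the same "wordlocation" table three times (or more, if there are more words)? With three or more words, the query takes about a minute to return results.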

The "wordlocation" table has three columns ("bookid", "wordid", "location") and currently contains 917802 rows.
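
For reference, the table described above might look like the following sketch. Only the table and column names come from the post; the INT types and NOT NULL constraints are assumptions, and no keys or indexes are shown because the post does not mention any.

```sql
-- Hypothetical definition: the column types are assumed, not taken from the post.
CREATE TABLE wordlocation (
    bookid   INT NOT NULL,  -- book in which the word occurs
    wordid   INT NOT NULL,  -- identifier of the word
    location INT NOT NULL   -- position of the word within the book
);
```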

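What I want to do is:

  1. Retrieve, by "wordid", the "bookid" of every book that contains all the words specified in the query.
  2. Get the total count of all the words (from the query) in each book.
  3. Get the minimum location of each word, e.g. (min(w0.location), min(w1.location)).
  4. I have already tested whether the count(w0.wordid) and min(location) calculations affect performance, and they do not. The cause is joining the same table multiple times.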

    (The query below is the same as the one shown as an image in the original post.)

    SELECT
        w0.bookid,
        COUNT(w0.wordid) AS wcount,
        ABS(MIN(w0.location) + MIN(w1.location) + MIN(w2.location)) AS wordlocation,
        ABS(MIN(w0.location) - MIN(w1.location))
            + ABS(MIN(w1.location) - MIN(w2.location)) AS distance
    FROM wordlocation AS w0
    INNER JOIN wordlocation AS w1 ON w0.bookid = w1.bookid
    INNER JOIN wordlocation AS w2 ON w1.bookid = w2.bookid
    WHERE w0.wordid = 3
      AND w1.wordid = 52
      AND w2.wordid = 42
    GROUP BY w0.bookid
    ORDER BY wcount DESC;
    

    The query above returns the result I am looking for (shown as an image in the original post), but it takes a very long time if I specify more than three words, e.g. (w0 = 3, w1 = 52, w2 = 42, w3 = 71).
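
One way to see why each extra word hurts: the self-joins match on bookid only, so within a book every occurrence of word 3 is paired with every occurrence of word 52 and of word 42 before GROUP BY collapses the rows. The sketch below estimates that intermediate row count per book in a single pass over the table. The aliases n3, n52, n42 and joined_rows are made up for illustration. For books that contain all three words, joined_rows is also the value that count(w0.wordid) returns in the join-based query.

```sql
-- Rows the three-way self-join produces per book before grouping.
SELECT bookid,
       SUM(CASE WHEN wordid = 3  THEN 1 ELSE 0 END) AS n3,   -- occurrences of word 3
       SUM(CASE WHEN wordid = 52 THEN 1 ELSE 0 END) AS n52,  -- occurrences of word 52
       SUM(CASE WHEN wordid = 42 THEN 1 ELSE 0 END) AS n42,  -- occurrences of word 42
       -- product of the three counts; BIGINT guards against overflow
       CAST(SUM(CASE WHEN wordid = 3 THEN 1 ELSE 0 END) AS BIGINT)
           * SUM(CASE WHEN wordid = 52 THEN 1 ELSE 0 END)
           * SUM(CASE WHEN wordid = 42 THEN 1 ELSE 0 END) AS joined_rows
FROM wordlocation
WHERE wordid IN (3, 52, 42)
GROUP BY bookid
ORDER BY joined_rows DESC;
```

A fourth word multiplies joined_rows by that word's occurrence count in each book, which is consistent with the slowdown described in the question.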

1 Answer:

Answer 0: (score: 0)

Try this query:

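    -- Reads wordlocation once: T has one row per (bookid, wordid), pivoted per book below.
    SELECT bookid,
      ABS(SUM(L3) + SUM(L52) + SUM(L42)) AS wordlocation,
      ABS(SUM(L3) - SUM(L52)) + ABS(SUM(L52) - SUM(L42)) AS distance
    FROM
      (SELECT bookid, wordid,
         -- each row of T holds the minimum location of one word; the other two columns are 0
         CASE WHEN wordid = 3  THEN MIN(location) ELSE 0 END AS L3,
         CASE WHEN wordid = 52 THEN MIN(location) ELSE 0 END AS L52,
         CASE WHEN wordid = 42 THEN MIN(location) ELSE 0 END AS L42
       FROM wordlocation AS WL
       WHERE wordid IN (3, 52, 42)
       GROUP BY bookid, wordid) AS T
    GROUP BY bookid
    -- keep only books that contain all three words (T has one row per word found)
    HAVING COUNT(*) = 3;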
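
The same single-scan idea can also be written with conditional aggregation in one grouping step, which makes it easy to add the wcount column and the ordering from the question. This is a sketch rather than a drop-in replacement: the CTE name per_book and the aliases loc3, loc52, loc42 are made up here, and wcount is assumed to mean the total number of occurrences of the listed words in a book, which is not the same number the join-based query returns (that one counts joined rows).

```sql
WITH per_book AS (
    SELECT bookid,
           COUNT(*) AS wcount,  -- assumed meaning: total occurrences of the listed words
           MIN(CASE WHEN wordid = 3  THEN location END) AS loc3,
           MIN(CASE WHEN wordid = 52 THEN location END) AS loc52,
           MIN(CASE WHEN wordid = 42 THEN location END) AS loc42
    FROM wordlocation
    WHERE wordid IN (3, 52, 42)
    GROUP BY bookid
    -- only books in which all three words occur
    HAVING COUNT(DISTINCT wordid) = 3
)
SELECT bookid,
       wcount,
       ABS(loc3 + loc52 + loc42) AS wordlocation,
       ABS(loc3 - loc52) + ABS(loc52 - loc42) AS distance
FROM per_book
ORDER BY wcount DESC;
```

Adding a fourth word means one more value in the IN list, one more MIN(CASE ...) column, and changing the HAVING comparison to 4; the table is still read only once.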

You may also need to create an index on wordid.
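
A possible form of such an index is sketched below. The index name is hypothetical, and adding bookid to the key and location as an included column goes beyond what the answer suggests; it is an assumption aimed at letting the query be answered from the index alone. Whether it helps should be checked against the actual execution plan.

```sql
-- Hypothetical index: name, second key column and INCLUDE list are assumptions.
CREATE NONCLUSTERED INDEX IX_wordlocation_wordid
    ON wordlocation (wordid, bookid)
    INCLUDE (location);
```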