Optimizing an SQL query with multiple nested SELECTs

Time: 2015-05-28 07:05:16

Tags: sql sqlite subquery query-optimization

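I need to split one column of a table, which holds comma-separated values, into separate columns of a new view or table.
Currently, the solution that works best for me is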

CREATE VIEW clearTable as select ID,Timestamp,s1,s3,s4,s5,s6,s7,s8,substr(s8r,1,instr(s8r,",")-1) as s9 from
    (select ID,Timestamp,s1,s3,s4,s5,s6,s7,substr(s7r,1,instr(s7r,",")-1) as s8,substr(s7r,instr(s7r,",")+1) as s8r from
        (select ID,Timestamp,s1,s3,s4,s5,s6,substr(s6r,1,instr(s6r,",")-1) as s7,substr(s6r,instr(s6r,",")+1) as s7r from
            (select ID,Timestamp,s1,s3,s4,s5,substr(s5r,1,instr(s5r,",")-1) as s6,substr(s5r,instr(s5r,",")+1) as s6r from
                (select ID,Timestamp,s1,s3,s4,substr(s4r,1,instr(s4r,",")-1) as s5,substr(s4r,instr(s4r,",")+1) as s5r from
                    (select ID,Timestamp,s1,s3,substr(s3r,1,instr(s3r,",")-1) as s4,substr(s3r,instr(s3r,",")+1) as s4r from
                        (select ID,Timestamp,s1,substr(s2r,1,instr(s2r,",")-1) as s3,substr(s2r,instr(s2r,",")+1) as s3r from
                            (select ID,Timestamp,s1,substr(s1r,1,instr(s1r,",")-1) as s2,substr(s1r,instr(s1r,",")+1) as s2r from
                                (select ID,Timestamp,cast(substr(payload,1,instr(payload,",")-1) as TIME) as s1,substr(payload,instr(payload,",")+1) as s1r from thebasetable))))))))

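As you can see, every separator character requires a new subquery level. The result is what I want, but I'm looking for a better way to achieve it - perhaps a more efficient solution.
As a working example, you can use this SQL Fiddle. I'd also like to mention that the data is currently stored in SQLite, but this may change, so there is no need to optimize specifically for SQLite. All hints are welcome.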

1 answer:

Answer 0: (score: 1)

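Let me start with a bug in your current solution (efficiency aside): for the third row, it returns 0 in the s1 column. As I understand your intent, you want to return the first element of the payload, which for row 3 is A, not 0.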
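The 0 most likely comes from the cast(... as TIME) in the innermost subquery. SQLite has no TIME type, so that type name falls back to NUMERIC affinity, and text with no numeric prefix casts to 0. A minimal sketch using Python's built-in sqlite3 module; the literal values are made up for illustration, with 'A' standing in for the first element of row 3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CAST(... AS TIME) uses NUMERIC affinity in SQLite,
# so text without a numeric prefix becomes 0.
non_numeric = conn.execute("SELECT CAST('A' AS TIME)").fetchone()[0]
numeric = conn.execute("SELECT CAST('42' AS TIME)").fetchone()[0]

# Without the cast, the first element is returned as plain text.
plain = conn.execute(
    "SELECT substr('A,1,2', 1, instr('A,1,2', ',') - 1)"
).fetchone()[0]

print(non_numeric, numeric, plain)
conn.close()
```

Dropping the cast, or casting only values known to be numeric, keeps the A.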

It also doesn't return s2 - I don't know whether that is intentional or not. My solution does return it.

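Now, to answer your question: I've put together a query that runs faster (tested on my local SQLite it took 3 ms, while the original query took 11 ms on average) and doesn't nest SELECTs as deeply. It's a bit complicated, so I'll explain it afterwards. Here is the query: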

SELECT id,
       timestamp,
       max(CASE WHEN col = 1 THEN item ELSE '' END) AS s1,
       max(CASE WHEN col = 2 THEN item ELSE '' END) AS s2,
       max(CASE WHEN col = 3 THEN item ELSE '' END) AS s3,
       max(CASE WHEN col = 4 THEN item ELSE '' END) AS s4,
       max(CASE WHEN col = 5 THEN item ELSE '' END) AS s5,
       max(CASE WHEN col = 6 THEN item ELSE '' END) AS s6,
       max(CASE WHEN col = 7 THEN item ELSE '' END) AS s7,
       max(CASE WHEN col = 8 THEN item ELSE '' END) AS s8,
       max(CASE WHEN col = 9 THEN item ELSE '' END) AS s9
  FROM (
       WITH RECURSIVE tmp (
               id,
               timestamp,
               item,
               data,
               col
           )
           AS (
               SELECT id,
                      timestamp,
                      substr(payload, 1, instr(payload, ',') - 1),
                      payload,
                      1
                 FROM thebasetable
               UNION ALL
               SELECT id,
                      timestamp,
                      substr(substr(data, instr(data, ',') + 1), 1, instr(substr(data, instr(data, ',') + 1), ',') - 1),
                      substr(data, instr(data, ',') + 1),
                      col + 1
                 FROM tmp
                WHERE instr(data, ',') > 0 AND 
                      col < 9
                ORDER BY 1
           )
           SELECT id,
                  timestamp,
                  item,
                  col
             FROM tmp
       )
 GROUP BY id,
          timestamp;

The query uses a common table expression (CTE). You can read more about it in SQLite's SQL syntax documentation (look for the WITH statement).

The CTE part is this:

   WITH RECURSIVE tmp (
           id,
           timestamp,
           item,
           data,
           col
       )
       AS (
           SELECT id,
                  timestamp,
                  substr(payload, 1, instr(payload, ',') - 1),
                  payload,
                  1
             FROM thebasetable
           UNION ALL
           SELECT id,
                  timestamp,
                  substr(substr(data, instr(data, ',') + 1), 1, instr(substr(data, instr(data, ',') + 1), ',') - 1),
                  substr(data, instr(data, ',') + 1),
                  col + 1
             FROM tmp
            WHERE instr(data, ',') > 0 AND 
                  col < 9
            ORDER BY 1
       )
       SELECT id,
              timestamp,
              item,
              col
         FROM tmp

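What it does is read all rows with their initial payload, take the "first" element from the payload, and attach a col value equal to 1 to it. Then it passes the payload on to the next iteration of the CTE, but with the first element cut off, so the next iteration sees the second element as the first. It also increments the initial col value on each subsequent iteration.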

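It loops through the whole payload, removing the first element on each iteration, until it reaches the end of the payload (WHERE instr(data, ',') > 0).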
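The iteration is easiest to follow by running the CTE on its own and looking at the rows it emits. A minimal sketch using Python's built-in sqlite3 module; the table contents are a hypothetical single row, not the data from the question's SQL Fiddle:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE thebasetable (id INTEGER, timestamp TEXT, payload TEXT)")
# Hypothetical sample row, invented for this illustration.
conn.execute("INSERT INTO thebasetable VALUES (1, '2015-05-28', '10,20,30,40')")

CTE_SQL = """
WITH RECURSIVE tmp (id, timestamp, item, data, col) AS (
    SELECT id, timestamp,
           substr(payload, 1, instr(payload, ',') - 1),
           payload,
           1
      FROM thebasetable
    UNION ALL
    SELECT id, timestamp,
           substr(substr(data, instr(data, ',') + 1), 1,
                  instr(substr(data, instr(data, ',') + 1), ',') - 1),
           substr(data, instr(data, ',') + 1),
           col + 1
      FROM tmp
     WHERE instr(data, ',') > 0 AND col < 9
     ORDER BY 1
)
SELECT col, item, data FROM tmp ORDER BY col
"""

# Each row shows the column index, the extracted item,
# and the remaining payload handed to the next iteration.
rows = conn.execute(CTE_SQL).fetchall()
for row in rows:
    print(row)
conn.close()
```

Each row's data is the previous row's data minus its first element. One thing this run makes visible: with the query as written, the element after the last comma comes out as an empty item, so the values you want to keep need a comma after them in the payload.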

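I also added a second condition to the WHERE: col < 9 - this condition controls the number of columns extracted from the payload. The number should equal the number of columns you are going to read. If you set it to a smaller number, the remaining columns in the result will be empty. If you set it to a larger number, it won't do any harm, except that the query will be unnecessarily a little slower.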
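To see how the col limit caps the number of extracted elements, here is a small sketch, again with Python's built-in sqlite3. The helper split_items and its parameters are hypothetical names introduced for this illustration, and the payload is bound as a parameter instead of being read from thebasetable:

```python
import sqlite3

def split_items(payload, max_cols):
    """Run the recursive split on one payload, stopping at max_cols columns."""
    conn = sqlite3.connect(":memory:")
    sql = """
    WITH RECURSIVE tmp (item, data, col) AS (
        SELECT substr(:p, 1, instr(:p, ',') - 1), :p, 1
        UNION ALL
        SELECT substr(substr(data, instr(data, ',') + 1), 1,
                      instr(substr(data, instr(data, ',') + 1), ',') - 1),
               substr(data, instr(data, ',') + 1),
               col + 1
          FROM tmp
         WHERE instr(data, ',') > 0 AND col < :n
    )
    SELECT item FROM tmp ORDER BY col
    """
    rows = conn.execute(sql, {"p": payload, "n": max_cols}).fetchall()
    conn.close()
    return [r[0] for r in rows]

# A lower limit stops the recursion earlier, so fewer items come back.
print(split_items("a,b,c,d,e,f", 2))
print(split_items("a,b,c,d,e,f", 5))
```

With the limit at 2 only the first two elements are produced; any output column past the limit would therefore stay empty in the final result.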

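Finally, the CTE is wrapped in a SELECT that groups the CTE results by ID and Timestamp, and then gets the values for the remaining columns by detecting, row by row, whether there is a value for that column or not. It's hard to explain. It's better if you execute the CTE part on its own and see what it returns; then you will understand what the outer SELECT does.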
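The grouping step can also be tried in isolation. In this sketch the table cte_rows is a hypothetical stand-in for the CTE output (one row per id and col, with made-up values), and the outer SELECT is cut down to three columns:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for the CTE output: one row per (id, col) pair. Values are made up.
conn.execute(
    "CREATE TABLE cte_rows (id INTEGER, timestamp TEXT, item TEXT, col INTEGER)"
)
conn.executemany(
    "INSERT INTO cte_rows VALUES (?, ?, ?, ?)",
    [
        (1, "t1", "10", 1), (1, "t1", "20", 2), (1, "t1", "30", 3),
        (2, "t2", "A", 1), (2, "t2", "B", 2),  # id 2 has no third element
    ],
)

# Within each group, every CASE yields '' except on the row whose col matches;
# max() then picks that one non-empty item, because '' sorts before any other text.
pivoted = conn.execute("""
    SELECT id, timestamp,
           max(CASE WHEN col = 1 THEN item ELSE '' END) AS s1,
           max(CASE WHEN col = 2 THEN item ELSE '' END) AS s2,
           max(CASE WHEN col = 3 THEN item ELSE '' END) AS s3
      FROM cte_rows
     GROUP BY id, timestamp
     ORDER BY id
""").fetchall()

for row in pivoted:
    print(row)
conn.close()
```

A group with no row for a given col ends up with an empty string in that column, which is the "is there a value or not" detection described above.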

Note - this solution requires SQLite 3.8.3 or later, since that is the version in which CTEs were introduced to SQLite.
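If you are unsure which SQLite version you are running, one way to check is from Python, whose built-in sqlite3 module reports the version of the linked SQLite library. This is only a sketch; other clients expose the same information through SELECT sqlite_version():

```python
import sqlite3

# sqlite_version_info is the version of the SQLite library linked into Python,
# not the version of the sqlite3 module itself.
supports_cte = sqlite3.sqlite_version_info >= (3, 8, 3)
print(sqlite3.sqlite_version, supports_cte)

if supports_cte:
    # A trivial recursive CTE that counts from 1 to 3, as a smoke test.
    conn = sqlite3.connect(":memory:")
    counted = conn.execute(
        "WITH RECURSIVE n(x) AS (SELECT 1 UNION ALL SELECT x + 1 FROM n WHERE x < 3) "
        "SELECT x FROM n"
    ).fetchall()
    conn.close()
    print(counted)
```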

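CTEs are a common feature in databases. They are supported by most popular databases (I just looked it up, and it's there in MySQL, MS SQL, Oracle and PostgreSQL, so it looks pretty good).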