SQLite EAV pivot efficiency

Time: 2017-02-24 23:59:26

Tags: sql sqlite self-join

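I have an EAV table named data in SQLite with the columns source_id, parameter_id, and value. The table has several million rows, and I chose the EAV model because there are several hundred possible parameters.

I have one particular query where I need seven different parameter values at once for every source_id that has all seven. For simplicity, say these are parameter_id 1-7. I use the following query: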

SELECT 
  data1.source_id, 
  data1.value, 
  data2.value, 
  data3.value, 
  data4.value, 
  data5.value, 
  data6.value, 
  data7.value
FROM 
  data AS data1
  JOIN data AS data2 ON 
    data1.source_id=data2.source_id 
    AND data2.parameter_id=2
  JOIN data AS data3 ON 
    data1.source_id=data3.source_id 
    AND data3.parameter_id=3
  JOIN data AS data4 ON 
    data1.source_id=data4.source_id 
    AND data4.parameter_id=4
  JOIN data AS data5 ON 
    data1.source_id=data5.source_id 
    AND data5.parameter_id=5
  JOIN data AS data6 ON 
    data1.source_id=data6.source_id 
    AND data6.parameter_id=6
  JOIN data AS data7 ON 
    data1.source_id=data7.source_id 
    AND data7.parameter_id=7
WHERE data1.parameter_id=1;

But I wonder whether there is a better way to do this. I thought subqueries might be more efficient, such as

SELECT ...
FROM
  (
    SELECT 
      source_id, 
      value
    FROM
      data
    WHERE parameter_id=1
  ) AS data1
  JOIN (
    SELECT 
      source_id, 
      value
    FROM
      data
    WHERE parameter_id=2
  ) AS data2 ON
    data1.source_id=data2.source_id
  ...

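Although this form is longer, could the subqueries be more efficient because they eliminate the vast majority of rows before the JOIN is performed?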

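I read the SQLite documentation on optimization, which says that JOINs are implemented as nested loops. But it also says that subqueries may be changed into WHERE clauses anyway.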
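One way to check this rather than guess is to ask SQLite for its plan for each form. The following is a minimal sketch, not a benchmark: it uses Python's built-in sqlite3 module, a two-parameter version of the query, a few made-up rows, and an assumed index named idx_data_parameter. The plan text depends on the SQLite version, so no expected output is shown.

```python
import sqlite3

# In-memory database with the table layout described in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (source_id INTEGER, parameter_id INTEGER, value REAL)"
)
# Assumption: a plain index on parameter_id (the index name is made up).
conn.execute("CREATE INDEX idx_data_parameter ON data (parameter_id)")
# Made-up sample rows: source 1 has parameters 1 and 2, source 2 has only 1.
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [(1, 1, 0.5), (1, 2, 1.5), (2, 1, 2.5)],
)

# Self-join form, shortened to two parameters.
join_sql = """
SELECT data1.source_id, data1.value, data2.value
FROM data AS data1
JOIN data AS data2
  ON data1.source_id = data2.source_id AND data2.parameter_id = 2
WHERE data1.parameter_id = 1
"""

# Subquery form of the same request.
subquery_sql = """
SELECT data1.source_id, data1.value, data2.value
FROM (SELECT source_id, value FROM data WHERE parameter_id = 1) AS data1
JOIN (SELECT source_id, value FROM data WHERE parameter_id = 2) AS data2
  ON data1.source_id = data2.source_id
"""


def plan(sql):
    """Return the human-readable steps of EXPLAIN QUERY PLAN for sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the description of the step.
    return [row[-1] for row in rows]


join_plan = plan(join_sql)
subquery_plan = plan(subquery_sql)
join_rows = conn.execute(join_sql).fetchall()
subquery_rows = conn.execute(subquery_sql).fetchall()

for label, steps in (("join", join_plan), ("subquery", subquery_plan)):
    print(label)
    for step in steps:
        print("   ", step)
```

If the two plans list the same steps, the optimizer is treating both forms alike and the choice is a matter of readability. If they differ, timing both against the real table is the only reliable comparison.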

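Is one of these queries "better" than the other? Is there another, better way to do this pivot? I am new to SQL and databases, so I am still learning a lot and appreciate any help. As a higher-level question, is there a better way to design my database? I figured a conventional relational model was not the way to go, because most of my data has too many parameters and I need a lot of dynamic queries.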

Edit: I should note that an index on parameter_id helps a lot.
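As a hedged extension of that observation, a composite index that also includes source_id and value might let each step of the self-join be answered from the index alone. This is an assumption to verify, not a measured result: the index name idx_data_param_source_value and the column order are illustrative choices, and the sketch only checks that the planner picks the index for a two-parameter version of the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (source_id INTEGER, parameter_id INTEGER, value REAL)"
)
# Hypothetical composite index: the filter column first, then the join
# column, then the selected column so lookups need not touch the table.
conn.execute(
    "CREATE INDEX idx_data_param_source_value "
    "ON data (parameter_id, source_id, value)"
)

query = """
SELECT data1.source_id, data1.value, data2.value
FROM data AS data1
JOIN data AS data2
  ON data1.source_id = data2.source_id AND data2.parameter_id = 2
WHERE data1.parameter_id = 1
"""

# The last column of each EXPLAIN QUERY PLAN row describes one step.
plan_details = [
    row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query)
]
for detail in plan_details:
    print(detail)
```

A wider index costs extra disk space and slows inserts, so whether it pays off depends on how often the table is written versus queried.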

1 answer:

Answer 0: (score: 1)

This looks like a plain vanilla pivot.

How about this:

WITH
input(source_id,parameter_id,value) AS (
          SELECT 1,1,0.051253445446491
UNION ALL SELECT 1,2,0.328549513826147
UNION ALL SELECT 1,3,0.006703516934067
UNION ALL SELECT 1,4,0.625361373415217
UNION ALL SELECT 1,5,0.790167507482693
UNION ALL SELECT 1,6,0.595345180947334
UNION ALL SELECT 1,7,0.974001209484413
UNION ALL SELECT 2,1,0.698550914647058
UNION ALL SELECT 2,2,0.731252062832937
UNION ALL SELECT 2,3,0.697219420224428
UNION ALL SELECT 2,4,0.157373823458329
UNION ALL SELECT 2,5,0.621023152489215
UNION ALL SELECT 2,6,0.18642258644104
UNION ALL SELECT 2,7,0.295151106081903
)
SELECT
  source_id
, SUM(CASE parameter_id WHEN 1 THEN value END) AS value1
, SUM(CASE parameter_id WHEN 2 THEN value END) AS value2
, SUM(CASE parameter_id WHEN 3 THEN value END) AS value3
, SUM(CASE parameter_id WHEN 4 THEN value END) AS value4
, SUM(CASE parameter_id WHEN 5 THEN value END) AS value5
, SUM(CASE parameter_id WHEN 6 THEN value END) AS value6
, SUM(CASE parameter_id WHEN 7 THEN value END) AS value7
FROM input
GROUP BY
  source_id
ORDER BY
  source_id
;

The result would be:

source_id|value1           |value2           |value3           |value4           |value5           |value6           |value7
        1|0.051253445446491|0.328549513826147|0.006703516934067|0.625361373415217|0.790167507482693|0.595345180947334|0.974001209484413
        2|0.698550914647058|0.731252062832937|0.697219420224428|0.157373823458329|0.621023152489215|0.186422586441040|0.295151106081903
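To run the same pivot against the real table instead of inline test rows, something like the following might work. It is a sketch under assumptions: the table is named data as in the question, each (source_id, parameter_id) pair occurs at most once, and the sample rows are made up. The WHERE clause and the HAVING clause are additions that are not part of the query above: the first skips the hundreds of unrelated parameters, the second keeps only sources that have all seven, which is what the inner joins in the question do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (source_id INTEGER, parameter_id INTEGER, value REAL)"
)
# Made-up rows: source 1 has parameters 1-7, source 2 lacks parameter 7.
rows = [(1, p, float(p)) for p in range(1, 8)]
rows += [(2, p, 10.0 + p) for p in range(1, 7)]
conn.executemany("INSERT INTO data VALUES (?, ?, ?)", rows)

# Build one SUM(CASE ...) column per wanted parameter_id.
wanted = range(1, 8)
columns = ",\n  ".join(
    f"SUM(CASE parameter_id WHEN {p} THEN value END) AS value{p}"
    for p in wanted
)
id_list = ", ".join(str(p) for p in wanted)

pivot_sql = f"""
SELECT
  source_id,
  {columns}
FROM data
WHERE parameter_id IN ({id_list})          -- ignore all other parameters
GROUP BY source_id
HAVING COUNT(DISTINCT parameter_id) = 7    -- keep only complete sources
ORDER BY source_id
"""

result = conn.execute(pivot_sql).fetchall()
for row in result:
    print(row)
```

With these sample rows only source 1 is returned, because source 2 is missing parameter 7. Dropping the HAVING clause would return source 2 as well, with NULL in value7.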

Happy playing ...

Marco the Sane