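I have an EAV table data in SQLite with the columns source_id, parameter_id, and value. The table has a few million rows; I chose the EAV model because there are several hundred possible parameters.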
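I have one particular query in which I need the values of seven different parameters at once, for every source_id where all seven exist. For simplicity, say these are parameter_id 1-7. I use the following query: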
SELECT
data1.source_id,
data1.value,
data2.value,
data3.value,
data4.value,
data5.value,
data6.value,
data7.value
FROM
data AS data1
JOIN data AS data2 ON
data1.source_id=data2.source_id
AND data2.parameter_id=2
JOIN data AS data3 ON
data1.source_id=data3.source_id
AND data3.parameter_id=3
JOIN data AS data4 ON
data1.source_id=data4.source_id
AND data4.parameter_id=4
JOIN data AS data5 ON
data1.source_id=data5.source_id
AND data5.parameter_id=5
JOIN data AS data6 ON
data1.source_id=data6.source_id
AND data6.parameter_id=6
JOIN data AS data7 ON
data1.source_id=data7.source_id
AND data7.parameter_id=7
WHERE data1.parameter_id=1;
But I am wondering whether there is a better way to do this. I thought subqueries might be more efficient, for example:
SELECT ...
FROM
(
SELECT
source_id,
value
FROM
data
WHERE parameter_id=1
) AS data1
JOIN (
SELECT
source_id,
value
FROM
data
WHERE parameter_id=2
) AS data2 ON
data1.source_id=data2.source_id
...
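This form is longer, but might the subqueries be more efficient because they eliminate the vast majority of rows before the JOIN is performed?
I read the SQLite documentation on optimization, which says that JOINs are implemented as nested loops. But it then also says that subqueries may be flattened into WHERE clauses anyway.
Is one of these queries "better" than the other? Is there another, better way to do this pivot? I am new to SQL and databases, so I am still learning a lot and appreciate any help. As a higher-level question, is there a better way to design my database? I decided a relational model was not the way to go because most of my data has too many parameters and I need a lot of dynamic queries.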
Edit: I should note that an index on parameter_id helps a lot.
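One way to compare the two forms in the question is to run both on a small copy of the table and print what SQLite reports for each with EXPLAIN QUERY PLAN. The sketch below is only an illustration under assumptions: it uses Python's sqlite3 module, three parameters instead of seven, made-up values, and a hypothetical composite index named idx_data_param_source on (parameter_id, source_id), which is not the index mentioned in the question. The plan text depends on the SQLite version, so it is printed for inspection and not interpreted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (source_id INTEGER, parameter_id INTEGER, value REAL)")
# Hypothetical composite index; the question only mentions an index on parameter_id.
conn.execute("CREATE INDEX idx_data_param_source ON data (parameter_id, source_id)")

# Toy data: sources 1 and 2 have parameters 1-3, source 3 lacks parameter 3.
rows = [(s, p, s * 10 + p) for s in (1, 2, 3) for p in (1, 2, 3) if (s, p) != (3, 3)]
conn.executemany("INSERT INTO data VALUES (?, ?, ?)", rows)

# Self-join form from the question, shortened to three parameters.
self_join = """
SELECT data1.source_id, data1.value, data2.value, data3.value
FROM data AS data1
JOIN data AS data2 ON data1.source_id = data2.source_id AND data2.parameter_id = 2
JOIN data AS data3 ON data1.source_id = data3.source_id AND data3.parameter_id = 3
WHERE data1.parameter_id = 1
ORDER BY data1.source_id
"""

# Subquery form from the question, shortened the same way.
subqueries = """
SELECT data1.source_id, data1.value, data2.value, data3.value
FROM (SELECT source_id, value FROM data WHERE parameter_id = 1) AS data1
JOIN (SELECT source_id, value FROM data WHERE parameter_id = 2) AS data2
  ON data1.source_id = data2.source_id
JOIN (SELECT source_id, value FROM data WHERE parameter_id = 3) AS data3
  ON data1.source_id = data3.source_id
ORDER BY data1.source_id
"""

join_rows = conn.execute(self_join).fetchall()
subquery_rows = conn.execute(subqueries).fetchall()
print(join_rows == subquery_rows)

# Print the planner's description of each query; the last column is the detail text.
for label, sql in (("self-join", self_join), ("subqueries", subqueries)):
    print(label)
    for step in conn.execute("EXPLAIN QUERY PLAN " + sql):
        print("   ", step[-1])
```

Both forms return the same rows on this data, and source 3 is dropped because the inner joins require every parameter to be present. Whether the two plans also match on the real table is something to check against your own SQLite version and indexes.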
Answer 0 (score: 1)
This looks like a plain vanilla pivot.
How about this:
WITH
input(source_id,parameter_id,value) AS (
SELECT 1,1,0.051253445446491
UNION ALL SELECT 1,2,0.328549513826147
UNION ALL SELECT 1,3,0.006703516934067
UNION ALL SELECT 1,4,0.625361373415217
UNION ALL SELECT 1,5,0.790167507482693
UNION ALL SELECT 1,6,0.595345180947334
UNION ALL SELECT 1,7,0.974001209484413
UNION ALL SELECT 2,1,0.698550914647058
UNION ALL SELECT 2,2,0.731252062832937
UNION ALL SELECT 2,3,0.697219420224428
UNION ALL SELECT 2,4,0.157373823458329
UNION ALL SELECT 2,5,0.621023152489215
UNION ALL SELECT 2,6,0.18642258644104
UNION ALL SELECT 2,7,0.295151106081903
)
SELECT
source_id
, SUM(CASE parameter_id WHEN 1 THEN value END) AS value1
, SUM(CASE parameter_id WHEN 2 THEN value END) AS value2
, SUM(CASE parameter_id WHEN 3 THEN value END) AS value3
, SUM(CASE parameter_id WHEN 4 THEN value END) AS value4
, SUM(CASE parameter_id WHEN 5 THEN value END) AS value5
, SUM(CASE parameter_id WHEN 6 THEN value END) AS value6
, SUM(CASE parameter_id WHEN 7 THEN value END) AS value7
FROM input
GROUP BY
source_id
ORDER BY
source_id
;
The result will be:
source_id|value1 |value2 |value3 |value4 |value5 |value6 |value7
1|0.051253445446491|0.328549513826147|0.006703516934067|0.625361373415217|0.790167507482693|0.595345180947334|0.974001209484413
2|0.698550914647058|0.731252062832937|0.697219420224428|0.157373823458329|0.621023152489215|0.186422586441040|0.295151106081903
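Applied to the data table from the question, the same conditional aggregation could look roughly like the sketch below, run here through Python's sqlite3 module with made-up values. Two parts are additions that are not in the answer above: the WHERE clause, which limits the scan to the seven wanted parameters, and the HAVING clause, which keeps only sources that have all seven, mirroring what the inner joins in the question do. It also assumes that each (source_id, parameter_id) pair occurs at most once, since SUM would otherwise add duplicates together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (source_id INTEGER, parameter_id INTEGER, value REAL)")

# Toy data: source 1 has parameters 1-8, source 2 only has parameters 1-6.
# Parameter 8 stands in for the many parameters this query does not need.
rows = [(1, p, 1.0 + p / 10) for p in range(1, 9)]
rows += [(2, p, 2.0 + p / 10) for p in range(1, 7)]
conn.executemany("INSERT INTO data VALUES (?, ?, ?)", rows)

wanted = list(range(1, 8))  # parameter_id 1-7, as in the question

# One SUM(CASE ...) column per wanted parameter, as in the answer.
# The ids are forced through int() because they are formatted into the SQL text.
columns = ",\n  ".join(
    f"SUM(CASE parameter_id WHEN {int(p)} THEN value END) AS value{int(p)}"
    for p in wanted
)
placeholders = ", ".join("?" for _ in wanted)

pivot_sql = f"""
SELECT
  source_id,
  {columns}
FROM data
WHERE parameter_id IN ({placeholders})       -- addition: skip unneeded parameters
GROUP BY source_id
HAVING COUNT(DISTINCT parameter_id) = ?      -- addition: require all wanted parameters
ORDER BY source_id
"""

pivot_rows = conn.execute(pivot_sql, wanted + [len(wanted)]).fetchall()
for row in pivot_rows:
    print(row)
```

Generating the column list from a Python list is one way to handle the dynamic queries mentioned in the question. Dropping the HAVING clause would instead return every source, with NULL in the columns for parameters it lacks.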
Have fun...
Marco the Sane