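I am trying to join 8 tables, and because each table has more than 500,000 rows, the query is very slow. What is the best way to join these tables?
All tables have the following structure: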
data_temprature:
+--------+----------+-------+------------------+
| ID_geo | NAME     | Value | Date             |
+--------+----------+-------+------------------+
| 10005  | Madrid   | 32    | 2017-06-12 08:00 |
| 10005  | Madrid   | 25    | 2017-06-12 09:00 |
| 12701  | Paris    | 23    | 2017-06-12 08:00 |
| 13006  | Tokyo    | 25    | 2017-06-12 11:00 |
| 11132  | Sevilla  | 27    | 2017-06-12 16:00 |
| 21333  | London   | 22    | 2017-06-12 17:00 |
+--------+----------+-------+------------------+
data_WeatherSimbol
+--------+----------+-------+------------------+
| ID_geo | NAME     | Value | Date             |
+--------+----------+-------+------------------+
| 10005  | Madrid   | A+    | 2017-06-12 08:00 |
| 10005  | Madrid   | A     | 2017-06-12 09:00 |
| 12701  | Paris    | A-    | 2017-06-12 08:00 |
| 13006  | Tokyo    | C-    | 2017-06-12 11:00 |
| 11132  | Sevilla  | I+    | 2017-06-12 16:00 |
| 21333  | London   | D-    | 2017-06-12 17:00 |
+--------+----------+-------+------------------+
I want to join them to get this result:
+--------+----------+-------------+----------+------------------+
| ID_geo | NAME     | Temperature | Simboles | Date             |
+--------+----------+-------------+----------+------------------+
| 10005  | Madrid   | 32          | A+       | 2017-06-12 08:00 |
| 10005  | Madrid   | 25          | A        | 2017-06-12 09:00 |
| 12701  | Paris    | 23          | A-       | 2017-06-12 08:00 |
| 13006  | Tokyo    | 25          | C-       | 2017-06-12 11:00 |
| 11132  | Sevilla  | 27          | I+       | 2017-06-12 16:00 |
| 21333  | London   | 22          | D-       | 2017-06-12 17:00 |
+--------+----------+-------------+----------+------------------+
Thanks.
Update with the actual data:
Execution plan: https://files.fm/u/b4besk27
This is the query:
SELECT
    cielo.data_value AS cielo,
    lluv.data_value AS lluvia,
    temp.data_value AS temp,
    vientos.data_value AS viento,
    tmin.data_value AS tempmin,
    tmax.data_value AS tempmax,
    cielo.data_date AS DiaPrev
FROM
    data_cielo AS cielo
    INNER JOIN data_lluvia AS lluv ON cielo.data_geo = lluv.data_geo
    INNER JOIN data_presion AS pres ON cielo.data_geo = pres.data_geo
    INNER JOIN data_temp AS temp ON cielo.data_geo = temp.data_geo
    LEFT JOIN data_tempmax AS tmax ON cielo.data_geo = tmax.data_geo
    LEFT JOIN data_tempmin AS tmin ON cielo.data_geo = tmin.data_geo
    INNER JOIN data_viento AS vientos ON cielo.data_geo = vientos.data_geo
WHERE
    cielo.data_date = lluv.data_date
    AND pres.data_date = cielo.data_date
    AND vientos.data_date = pres.data_date
    AND temp.data_date = vientos.data_date
    AND cielo.data_geo = 46
ORDER BY cielo.data_date;
And this is the result:
E+ 0.0461028 29.6937088 S2 19.408 36.39 2017-06-13 12:00:00.000
E+ 0.0461028 29.6937088 S2 21.422 36.39 2017-06-13 12:00:00.000
E+ 0.0461028 29.6937088 S2 19.408 37.853 2017-06-13 12:00:00.000
E+ 0.0461028 29.6937088 S2 21.422 37.853 2017-06-13 12:00:00.000
E+ 0.0461028 30.7593854 S2 19.408 36.39 2017-06-13 13:00:00.000
E+ 0.0461028 30.7593854 S2 21.422 36.39 2017-06-13 13:00:00.000
E+ 0.0461028 30.7593854 S2 19.408 37.853 2017-06-13 13:00:00.000
E+ 0.0461028 30.7593854 S2 21.422 37.853 2017-06-13 13:00:00.000
A+ 0.0461028 31.6310774 SSW2 19.408 36.39 2017-06-13 14:00:00.000
A+ 0.0461028 31.6310774 SSW2 21.422 36.39 2017-06-13 14:00:00.000
A+ 0.0461028 31.6310774 SSW2 19.408 37.853 2017-06-13 14:00:00.000
A+ 0.0461028 31.6310774 SSW2 21.422 37.853 2017-06-13 14:00:00.000
A 0.0461028 32.2647927 S2 19.408 36.39 2017-06-13 15:00:00.000
A 0.0461028 32.2647927 S2 21.422 36.39 2017-06-13 15:00:00.000
A 0.0461028 32.2647927 S2 19.408 37.853 2017-06-13 15:00:00.000
It should not look like this. As I said, I need the hourly data values for temperature, pressure, precipitation, sky, and so on.
Answer 0 (score: 0)
Try this:
;With data_temprature(ID_geo,NAME,Value,[Date])
AS
(
SELECT 10005 , 'Madrid' , 32 , '2017-06-12 08:00' Union all
SELECT 10005 , 'Madrid' , 25 , '2017-06-12 09:00' Union all
SELECT 12701 , 'Paris' , 23 , '2017-06-12 08:00' Union all
SELECT 13006 , 'Tokyo' , 25 , '2017-06-12 11:00' Union all
SELECT 11132 , 'Sevilla' , 27 , '2017-06-12 16:00' Union all
SELECT 21333 , 'London' , 22 , '2017-06-12 17:00'
)
,data_WeatherSimbol(ID_geo,NAME,Value,[Date])
AS
(
SELECT 10005 , 'Madrid' , 'A+' , '2017-06-12 08:00' Union all
SELECT 10005 , 'Madrid' , 'A' , '2017-06-12 09:00' Union all
SELECT 12701 , 'Paris' , 'A-' , '2017-06-12 08:00' Union all
SELECT 13006 , 'Tokyo' , 'C-' , '2017-06-12 11:00' Union all
SELECT 11132 , 'Sevilla' , 'I+' , '2017-06-12 16:00' Union all
SELECT 21333 , 'London' , 'D-' , '2017-06-12 17:00'
)
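SELECT ID_geo,
       NAME,
       Temperature,
       Symboles,
       [Date]
FROM
(
    SELECT t.ID_geo,
           t.NAME,
           t.Value AS Temperature,
           w.Value AS Symboles,
           t.[Date],
           -- Number the rows per location and timestamp so only one is kept
           ROW_NUMBER() OVER (PARTITION BY t.ID_geo, t.[Date] ORDER BY t.[Date]) AS Rno
    FROM data_temprature t
    INNER JOIN data_WeatherSimbol w
        ON t.ID_geo = w.ID_geo
       -- Match the timestamp too; joining on ID_geo alone pairs every hour with every symbol
       AND t.[Date] = w.[Date]
) Dt
WHERE Dt.Rno = 1
ORDER BY ID_geo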
Answer 1 (score: 0)
I think you can join on both the geo ID and the date:
select t.*, ws.Value as Simboles
from data_temprature t join
     data_WeatherSimbol ws
     on t.ID_geo = ws.ID_geo and t.[Date] = ws.[Date];
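The same idea can be applied to the query from the update. The result there repeats every hour once for each combination of tempmin and tempmax values, which is consistent with the two LEFT JOINs matching on data_geo only. The following is a minimal sketch rather than a tested fix. It assumes all seven tables store their rows at the same data_date timestamps; if data_tempmin and data_tempmax hold one row per day instead, the comparison for those two tables would have to be on the date part.

```sql
-- Sketch: every join matches on both data_geo and data_date,
-- so each hour of cielo pairs with at most one row per table
-- (assuming (data_geo, data_date) is unique in each table).
SELECT
    cielo.data_value   AS cielo,
    lluv.data_value    AS lluvia,
    temp.data_value    AS temp,
    vientos.data_value AS viento,
    tmin.data_value    AS tempmin,
    tmax.data_value    AS tempmax,
    cielo.data_date    AS DiaPrev
FROM data_cielo AS cielo
INNER JOIN data_lluvia AS lluv
    ON lluv.data_geo = cielo.data_geo AND lluv.data_date = cielo.data_date
INNER JOIN data_presion AS pres
    ON pres.data_geo = cielo.data_geo AND pres.data_date = cielo.data_date
INNER JOIN data_temp AS temp
    ON temp.data_geo = cielo.data_geo AND temp.data_date = cielo.data_date
INNER JOIN data_viento AS vientos
    ON vientos.data_geo = cielo.data_geo AND vientos.data_date = cielo.data_date
-- The original LEFT JOINs had no date condition at all
LEFT JOIN data_tempmax AS tmax
    ON tmax.data_geo = cielo.data_geo AND tmax.data_date = cielo.data_date
LEFT JOIN data_tempmin AS tmin
    ON tmin.data_geo = cielo.data_geo AND tmin.data_date = cielo.data_date
WHERE cielo.data_geo = 46
ORDER BY cielo.data_date;
```

Putting the date condition in each ON clause, instead of in WHERE, also keeps the LEFT JOINs behaving as outer joins: hours without a tempmin or tempmax row still appear, with NULL in those columns.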
Answer 2 (score: 0)
Neither [ID_geo] nor [Date] alone seems sufficient to join on, so:
Create an index on both columns for every table, for example: create index IX_data_temprature on data_temprature ([ID_geo], [Date])
Join on [ID_geo], [Date]
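For the tables named in the question's actual query, the index step might look like the sketch below. The column names data_geo and data_date come from that query; the index names are made up for illustration, and the statements assume no equivalent index exists yet.

```sql
-- One composite index per table, with the equality-filtered column (data_geo) first.
-- Index names are hypothetical.
CREATE INDEX IX_data_cielo_geo_date   ON data_cielo   (data_geo, data_date);
CREATE INDEX IX_data_lluvia_geo_date  ON data_lluvia  (data_geo, data_date);
CREATE INDEX IX_data_presion_geo_date ON data_presion (data_geo, data_date);
CREATE INDEX IX_data_temp_geo_date    ON data_temp    (data_geo, data_date);
CREATE INDEX IX_data_tempmax_geo_date ON data_tempmax (data_geo, data_date);
CREATE INDEX IX_data_tempmin_geo_date ON data_tempmin (data_geo, data_date);
CREATE INDEX IX_data_viento_geo_date  ON data_viento  (data_geo, data_date);
```

With data_geo as the leading column, a filter such as data_geo = 46 can seek directly to that location's rows, which are then already ordered by data_date.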
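Answer 3 (score: 0)
Most of the query's load is caused by RID lookups.
A RID lookup is used when the index does not cover the query (SQL Server has to look up the values in the table because they are not included in the index) and the index is nonclustered on a table that has no clustered index.
The query will probably be faster with a covering index; you likely have not included the value column in the index. For more on included columns, see the Microsoft docs.
Changing the nonclustered index to a clustered index may also help.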
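As a rough illustration of both suggestions, here is a sketch for one table, data_cielo; the same pattern would apply to the others. The index names are hypothetical, and the column names are taken from the question's query.

```sql
-- Option 1: covering nonclustered index.
-- data_value is stored in the index leaf level via INCLUDE,
-- so the query can read it without a lookup into the table.
CREATE NONCLUSTERED INDEX IX_data_cielo_geo_date_cover
    ON data_cielo (data_geo, data_date)
    INCLUDE (data_value);

-- Option 2: clustered index on the join columns.
-- A table can have only one clustered index, so this assumes
-- data_cielo is currently a heap (which the RID lookups suggest).
CREATE CLUSTERED INDEX CIX_data_cielo_geo_date
    ON data_cielo (data_geo, data_date);
```

The two options are alternatives: with the clustered index in place, the table rows themselves are ordered by (data_geo, data_date), so a separate covering index on the same columns would add little.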