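I want to compute several Z-scores [(value - AVG) / STD] in a single query and filter on them.
This is what I have come up with so far. Before you read it, note that it contains a few French words: "quartier" is something like a district/borough, and "prix" is a price.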
SELECT ((t1.prix-A1.HMEAN)*100/A1.STD_SAMPLE) AS Z,
q.myId AS q_myId
FROM immobilier_ad_blank AS t1
LEFT JOIN Adresse AS a ON t1.adresse_id=a.id
LEFT JOIN Quartier AS q ON a.quartier_id=q.id
CROSS JOIN (
SELECT AVG(t2.prix) AS HMEAN,
STD(t2.prix) AS STD_SAMPLE,
q.myId AS quartier_myId
FROM immobilier_ad_blank AS t2
LEFT JOIN Adresse AS a ON t2.adresse_id=a.id
LEFT JOIN Quartier AS q ON a.quartier_id=q.id
GROUP BY quartier_myId
) AS A1 ON q.myId = A1.quartier_myId
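It seems to work, but I'm not sure I'm doing it the right way. So far I have added only one filter condition, but I will add up to six, and this approach already looks very cumbersome.
In the end it would look something like this: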
SELECT
((t1.prix-A1.HMEAN)*100/A1.STD_SAMPLE) AS Z1,
((t1.prix-A2.HMEAN)*100/A2.STD_SAMPLE) AS Z2,
((t1.prix-AN.HMEAN)*100/AN.STD_SAMPLE) AS ZN,
All Relevant selects
FROM immobilier_ad_blank AS t1
All Relevant LEFT Joins
CROSS JOIN (
SELECT AVG(t2.prix) AS HMEAN,
STD(t2.prix) AS STD_SAMPLE,
q.myId AS quartier_myId
FROM immobilier_ad_blank AS t2
LEFT JOIN Adresse AS a ON t2.adresse_id=a.id
LEFT JOIN Quartier AS q ON a.quartier_id=q.id
GROUP BY quartier_myId
) AS A1 ON q.myId = A1.quartier_myId
CROSS JOIN (
SELECT AVG(t3.prix) AS HMEAN,
STD(t3.prix) AS STD_SAMPLE,
q.myId AS quartier_myId
FROM immobilier_ad_blank AS t3
Other Joins
Group By Other conditions
) AS A2 ON OtherConditions
CROSS JOIN (
SELECT AVG(tN.prix) AS HMEAN,
STD(tN.prix) AS STD_SAMPLE,
q.myId AS quartier_myId
FROM immobilier_ad_blank AS tN
Other Joins
Group By Other conditions
) AS AN ON OtherConditions
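This is MySQL, where according to the manual CROSS JOIN = INNER JOIN = JOIN. As I understand it, a cross join is table × table, but in my case, because the subquery returns aggregates, I suppose the cost is not t^2 but t × (number of aggregate rows). Is that right?
Am I on the right track, optimization-wise? Have I put the filters in the right place? I basically want to select all rows, compute the average of the values in each joined group, compute the Z-scores, and add them up in under 5 seconds, on about 1M rows.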
Edit 1:
I'll simplify my question:
SELECT
((t1.prix-A1.HMEAN)*100/A1.STD_SAMPLE) AS Z,
((t1.prix-A2.HMEAN)*100/A2.STD_SAMPLE) AS Z2,
q.myId AS q_myId,
c.myId AS c_myId,
s.myId as size_myId
FROM immobilier_ad_blank AS t1
LEFT JOIN Size AS s ON t1.size_id=s.id
LEFT JOIN Adresse AS a ON t1.adresse_id=a.id
LEFT JOIN City AS c ON a.city_id=c.id
LEFT JOIN Quartier AS q ON a.quartier_id=q.id
CROSS JOIN (
SELECT AVG(t2.prix) AS HMEAN,
STD(t2.prix) AS STD_SAMPLE,
q.myId AS quartier_myId,
s.myId as size_myId
FROM immobilier_ad_blank AS t2
LEFT JOIN Size AS s ON t2.size_id=s.id
LEFT JOIN Adresse AS a ON t2.adresse_id=a.id
LEFT JOIN Quartier AS q ON a.quartier_id=q.id
GROUP BY quartier_myId, size_myId
) AS A1 ON q.myId = A1.quartier_myId
AS A2 on s.myId=A2.size_myId #<--------- Is this line possible ?
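Edit 3:
I ended up using a temporary table and duplicating it, because in my case views were slower than temporary tables, even though the underlying data is properly indexed.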
CREATE TEMPORARY TABLE IF NOT EXISTS A1 AS (
SELECT AVG(t2.prix) AS HMEAN,
STD(t2.prix) AS STD_SAMPLE,
s.myId AS size_myId,
q.myId AS quartier_myId
FROM immobilier_ad_blank AS t2
LEFT JOIN Size AS s ON t2.size_id=s.id
LEFT JOIN Adresse AS a ON t2.adresse_id=a.id
LEFT JOIN Quartier AS q ON a.quartier_id=q.id
GROUP BY quartier_myId,size_myId);
CREATE TEMPORARY TABLE A2 LIKE A1;
INSERT A2 SELECT * FROM A1;
SELECT
((c.prix-A1.HMEAN)*100/A1.STD_SAMPLE) AS Z1,
((c.prix-A2.HMEAN)*100/A2.STD_SAMPLE) AS Z2,
q.myId AS quartier_myId,
s.myId AS size_myId
FROM immobilier_ad_blank AS c
LEFT JOIN Size AS s ON c.size_id=s.id
LEFT JOIN Adresse AS ad ON c.adresse_id=ad.id
LEFT JOIN Quartier AS q ON ad.quartier_id=q.id
JOIN A1 on A1.quartier_myId = q.myId
JOIN A2 AS A2 on A2.size_myId = s.myId
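Basically: joining on the same virtual table (subquery) === views || temporary tables. To be honest, I may end up creating a permanent table and updating it once a day....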
Answer 0 (score: 2)
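CROSS JOIN ... ON condition turns the cross join into an ordinary INNER JOIN or, for brevity, just JOIN. So don't worry about a combinatorial explosion: a correctly written ON condition prevents it. In short, forget CROSS JOIN. Just write JOIN.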
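To make the row-count argument concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table `ads` with its columns is hypothetical toy data, not the asker's schema. It compares an unconstrained cross join against an aggregated subquery with the same join restricted by an ON condition.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ads (id INTEGER PRIMARY KEY, quartier TEXT, prix REAL);
    INSERT INTO ads (quartier, prix) VALUES
        ('north', 100), ('north', 200), ('north', 300),
        ('south', 50),  ('south', 150);
""")

# No ON condition: every ad is paired with every aggregate row.
cross_rows = conn.execute("""
    SELECT COUNT(*)
    FROM ads AS t1
    CROSS JOIN (SELECT quartier, AVG(prix) AS hmean
                FROM ads GROUP BY quartier) AS a1
""").fetchone()[0]

# With an ON condition: each ad is paired only with its own group's aggregate row.
joined = conn.execute("""
    SELECT t1.id, t1.prix - a1.hmean AS deviation
    FROM ads AS t1
    JOIN (SELECT quartier, AVG(prix) AS hmean
          FROM ads GROUP BY quartier) AS a1
      ON t1.quartier = a1.quartier
    ORDER BY t1.id
""").fetchall()

print(cross_rows)   # ads x groups
print(len(joined))  # one row per ad
```

With 5 ads in 2 groups, the unconstrained version produces 5 × 2 pairs, while the ON version returns exactly one row per ad.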
You can write a JOIN something like this:
FROM table_a AS a
JOIN table_b AS b ON a.id = b.id
Alternatively, you can express table_a, table_b, or both as subqueries. For example, you could write this query:
FROM table_a AS a
JOIN (
SELECT MAX(id) id, district
FROM table_b
GROUP BY district
) AS b on a.id = b.id
In other words, you can use a physical table (table_b) and a virtual table (a subquery) interchangeably.
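In your case, the subquery is a four-column table: (HMEAN, STD_SAMPLE, quartier_myId, size_myId).
How do you join that table to the rest of the tables in your query?
You have this code: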
JOIN (subquery) AS A1 ON q.myId = A1.quartier_myId
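But to make use of the subquery's fourth column (size_myId), you also need to use it in the JOIN. To do that, add an AND to the ON clause. Do something like this: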
JOIN (subquery) AS A1 ON q.myId = A1.quartier_myId
AND s.myId = A1.size_myId
If s.myId means what I think it means, that should give you a useful result.
Compound ON clauses are very useful. You can do this:
LEFT JOIN (subquery) AS A1 ON q.myId = A1.quartier_myId
AND s.myId = A1.size_myId
AND A1.HMEAN > 7.5
to filter the results.
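The compound ON clause can be tried end to end with a small sketch. The assumptions are loud ones: SQLite (through Python's sqlite3) stands in for MySQL, the `ads` table and its rows are hypothetical, the `STD` aggregate is registered by hand as a population standard deviation to mimic MySQL's STD(), the `*100` scaling from the question is omitted, and the threshold 150 replaces 7.5 only to suit the toy data.

```python
import sqlite3
import statistics


class PopulationStd:
    """Hand-rolled stand-in for MySQL's STD() (population standard deviation)."""

    def __init__(self):
        self.values = []

    def step(self, value):
        # Ignore NULLs, as SQL aggregates do.
        if value is not None:
            self.values.append(value)

    def finalize(self):
        return statistics.pstdev(self.values) if self.values else None


conn = sqlite3.connect(":memory:")
conn.create_aggregate("STD", 1, PopulationStd)
conn.executescript("""
    CREATE TABLE ads (id INTEGER PRIMARY KEY, quartier TEXT, size TEXT, prix REAL);
    INSERT INTO ads (quartier, size, prix) VALUES
        ('north', 'small', 100), ('north', 'small', 300),
        ('north', 'large', 400), ('north', 'large', 800),
        ('south', 'small', 50),  ('south', 'small', 150);
""")

# Join each ad to the aggregate row of its own (quartier, size) group,
# keeping only aggregate rows whose mean exceeds the threshold.
rows = conn.execute("""
    SELECT t1.id,
           (t1.prix - a1.hmean) / a1.std_sample AS z
    FROM ads AS t1
    LEFT JOIN (SELECT quartier, size,
                      AVG(prix) AS hmean,
                      STD(prix) AS std_sample
               FROM ads
               GROUP BY quartier, size) AS a1
      ON  t1.quartier = a1.quartier
      AND t1.size     = a1.size
      AND a1.hmean > 150
    ORDER BY t1.id
""").fetchall()

for ad_id, z in rows:
    print(ad_id, z)
```

Note that because this is a LEFT JOIN, ads whose group fails the `a1.hmean > 150` condition are still returned, with NULL (None in Python) for the Z-score. To drop those rows entirely, use a plain JOIN or move the condition into a WHERE clause.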