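I inherited some tables that I'm going to clean up, but first I'm trying to join everything I need, and I'm running into trouble because there are two ways to get from EventRegistrations to SpecialEvents.
In some cases, EventRegistrations can be joined to SpecialEvents directly using event_registrations.scoreable_id, while in other cases another table, SpecialPlaces, has to be joined first. You can tell which case applies from event_registrations.scoreable_type, which is either SpecialEvent or SpecialPlace.
Basically, how do I join SpecialEvents when I also have to join SpecialPlaces first? For example, if I try to join SpecialEvents in two different ways, I get the error: table name "special_events" specified more than once.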
SELECT event_registrations.id, array_agg(teams.name), event_registrations.number_of_players, event_registrations.state, event_registrations.created_at, array_agg(players.email), array_agg(special_events.name), array_agg(special_places.id)
FROM event_registrations
LEFT JOIN teams ON event_registrations.team_id = teams.id
LEFT JOIN team_memberships ON teams.id = team_memberships.team_id
LEFT JOIN players ON team_memberships.player_id = players.id
LEFT JOIN special_events ON event_registrations.scoreable_id = special_events.id AND event_registrations.scoreable_type = 'SpecialEvent'
LEFT JOIN special_places ON event_registrations.scoreable_id = special_places.id AND event_registrations.scoreable_type = 'SpecialPlace'
GROUP BY event_registrations.id, event_registrations.number_of_players, event_registrations.state, event_registrations.created_at
+----+-----------+---------------------------+-----------+---------------------------+
| id | region_id | start_at | state | created_at |
+----+-----------+---------------------------+-----------+---------------------------+
| 2 | 1 | 2015-10-22 19:30:00 +0100 | published | 2015-09-21 09:41:05 +0100 |
| 4 | 1 | 2016-01-21 19:30:00 +0000 | published | 2015-11-26 15:11:25 +0000 |
| 3 | 1 | 2016-01-28 19:30:00 +0000 | published | 2015-11-23 16:16:27 +0000 |
| 5 | 1 | 2016-12-31 19:30:00 +0000 | draft | 2016-02-24 15:17:22 +0000 |
| 6 | 1 | 2016-05-16 19:30:00 +0100 | published | 2016-03-29 14:33:40 +0100 |
| 10 | 1 | 2016-09-12 19:30:00 +0100 | published | 2016-06-28 17:18:54 +0100 |
| 8 | 1 | 2016-10-07 19:30:00 +0100 | draft | 2016-06-09 15:03:36 +0100 |
| 7 | 1 | 2016-05-23 19:30:00 +0100 | published | 2016-03-30 19:30:21 +0100 |
| 9 | 1 | 2016-08-04 19:30:00 +0100 | published | 2016-06-09 15:18:56 +0100 |
| 11 | 1 | 2016-11-07 19:30:00 +0000 | draft | 2016-07-11 17:20:11 +0100 |
+----+-----------+---------------------------+-----------+---------------------------+
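Answer 0 (score: 6)
What my colleague is trying to say is that it can't be done the way you want to do it, but there are many ways to achieve the same thing.
To avoid joining twice, build a combined derived table from SpecialEvents and SpecialPlaces that holds all the information you want, then join to that.
Something like this, for example: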
SELECT event_registrations.id, array_agg(teams.name), event_registrations.number_of_players, event_registrations.state, event_registrations.created_at, array_agg(players.email), array_agg(el1.name), array_agg(special_places.id)
FROM event_registrations
LEFT JOIN teams ON event_registrations.team_id = teams.id
LEFT JOIN team_memberships ON teams.id = team_memberships.team_id
LEFT JOIN players ON team_memberships.player_id = players.id
LEFT JOIN special_places ON event_registrations.scoreable_id = special_places.id AND event_registrations.scoreable_type = 'SpecialPlace'
LEFT JOIN (
SELECT special_events.id AS special_event_id, special_places.id AS special_place_id, special_events.name
FROM special_places
LEFT JOIN special_events ON special_places.special_event_id = special_events.id
UNION
SELECT special_events.id AS special_event_id, null AS special_place_id, special_events.name
FROM special_events
) el1
ON (event_registrations.scoreable_id = el1.special_place_id AND event_registrations.scoreable_type = 'SpecialPlace') OR (event_registrations.scoreable_id = el1.special_event_id AND event_registrations.scoreable_type = 'SpecialEvent')
GROUP BY event_registrations.id, event_registrations.number_of_players, event_registrations.state, event_registrations.created_at
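To see how the combined derived table behaves, here is a minimal sketch with made-up rows. It runs on SQLite through Python's sqlite3 module purely so that it is self-contained; the schema is cut down to the columns the join needs, and the sample data and the short aliases (er, se, sp) are assumptions, not the asker's real tables.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE special_events (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE special_places (id INTEGER PRIMARY KEY, special_event_id INTEGER);
CREATE TABLE event_registrations (
    id INTEGER PRIMARY KEY, scoreable_id INTEGER, scoreable_type TEXT);
INSERT INTO special_events VALUES (1, 'Quiz Night'), (2, 'Finals');
INSERT INTO special_places VALUES (1, 2);  -- place 1 belongs to event 2
INSERT INTO event_registrations VALUES
    (100, 1, 'SpecialEvent'),
    (101, 1, 'SpecialPlace'),
    (102, 2, 'SpecialEvent');
""")

# Join once, against a derived table that unions "events reached via a place"
# with "events on their own".
rows = con.execute("""
SELECT er.id, el1.name
FROM   event_registrations er
LEFT   JOIN (
   SELECT se.id AS special_event_id, sp.id AS special_place_id, se.name
   FROM   special_places sp
   LEFT   JOIN special_events se ON sp.special_event_id = se.id
   UNION
   SELECT se.id, NULL, se.name
   FROM   special_events se
   ) el1
   ON (er.scoreable_id = el1.special_place_id AND er.scoreable_type = 'SpecialPlace')
   OR (er.scoreable_id = el1.special_event_id AND er.scoreable_type = 'SpecialEvent')
ORDER  BY er.id
""").fetchall()

print(rows)
```

With this data, registration 102 comes back twice, because event 2 appears in the derived table once with its place and once without. In the full query the GROUP BY folds the extra row away, but array_agg would still collect values from both copies.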
Answer 1 (score: 4)
Assuming, based on some educated guesses, that id is the PRIMARY KEY column in each of the given tables:
SELECT er.id
, t.name AS team_name -- can only be 1, no array_agg
, er.number_of_players
, er.state
, er.created_at
, tp.player_emails -- pre-aggregated!
, se.name AS special_event_name -- can only be 1, no array_agg
, sp.id AS special_place_id -- can only be 1, no array_agg
FROM event_registrations er
LEFT JOIN teams t ON t.id = er.team_id
LEFT JOIN (
SELECT tm.team_id, array_agg(p.email) AS player_emails
FROM team_memberships tm
JOIN players p ON p.id = tm.player_id
GROUP BY 1
) tp USING (team_id)
LEFT JOIN special_places sp ON sp.id = er.scoreable_id AND er.scoreable_type = 'SpecialPlace'
LEFT JOIN special_events se ON se.id = er.scoreable_id AND er.scoreable_type = 'SpecialEvent'
OR se.id = sp.special_event_id AND er.scoreable_type = 'SpecialPlace'
Much simpler and faster.
If you really did need to join to the same table twice, you would have to use table aliases, like:
FROM event_registrations er
which is short for:
FROM event_registrations AS er
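For illustration only, this is roughly what joining special_events twice under two aliases could look like. The sketch runs on SQLite through Python's sqlite3 module so that it is self-contained; the alias names se_direct and se_via_place, the reduced schema, and the sample rows are all made up for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE special_events (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE special_places (id INTEGER PRIMARY KEY, special_event_id INTEGER);
CREATE TABLE event_registrations (
    id INTEGER PRIMARY KEY, scoreable_id INTEGER, scoreable_type TEXT);
INSERT INTO special_events VALUES (1, 'Quiz Night'), (2, 'Finals');
INSERT INTO special_places VALUES (1, 2);  -- place 1 belongs to event 2
INSERT INTO event_registrations VALUES
    (100, 1, 'SpecialEvent'),
    (101, 1, 'SpecialPlace');
""")

# The same table appears twice; each occurrence gets its own alias, so the
# "specified more than once" name clash does not arise.
rows = con.execute("""
SELECT er.id, COALESCE(se_direct.name, se_via_place.name) AS special_event_name
FROM   event_registrations AS er
LEFT   JOIN special_events AS se_direct
       ON se_direct.id = er.scoreable_id AND er.scoreable_type = 'SpecialEvent'
LEFT   JOIN special_places AS sp
       ON sp.id = er.scoreable_id AND er.scoreable_type = 'SpecialPlace'
LEFT   JOIN special_events AS se_via_place
       ON se_via_place.id = sp.special_event_id
ORDER  BY er.id
""").fetchall()

print(rows)  # [(100, 'Quiz Night'), (101, 'Finals')]
```

COALESCE picks whichever of the two joins found an event; for any given registration at most one of them can match.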
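As it turns out, you do not need to join to the same table twice. Still, use table aliases to cut the noise.
The only discernible reason for the GROUP BY in the outer SELECT is that the join to team_memberships may multiply rows. I moved the aggregation of player_emails into a much cheaper subquery, removed the outer GROUP BY, and simplified the rest of the query. It should also be considerably faster.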
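The row multiplication, and how aggregating first avoids it, can be sketched with a single registration whose team has two players. The sketch runs on SQLite through Python's sqlite3 module so that it is self-contained; group_concat stands in for Postgres' array_agg, and the table contents are invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE teams (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE players (id INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE team_memberships (team_id INTEGER, player_id INTEGER);
CREATE TABLE event_registrations (id INTEGER PRIMARY KEY, team_id INTEGER);
INSERT INTO teams VALUES (1, 'Owls');
INSERT INTO players VALUES (1, 'a@example.com'), (2, 'b@example.com');
INSERT INTO team_memberships VALUES (1, 1), (1, 2);
INSERT INTO event_registrations VALUES (100, 1);
""")

# Joining memberships directly yields one row per player, which is what
# forces a GROUP BY in the outer query.
naive = con.execute("""
SELECT er.id, p.email
FROM   event_registrations er
LEFT   JOIN teams t ON t.id = er.team_id
LEFT   JOIN team_memberships tm ON tm.team_id = t.id
LEFT   JOIN players p ON p.id = tm.player_id
""").fetchall()

# Aggregating in a subquery first keeps one row per registration,
# so the outer query needs no GROUP BY.
pre_aggregated = con.execute("""
SELECT er.id, t.name, tp.player_emails
FROM   event_registrations er
LEFT   JOIN teams t ON t.id = er.team_id
LEFT   JOIN (
   SELECT tm.team_id, group_concat(p.email) AS player_emails
   FROM   team_memberships tm
   JOIN   players p ON p.id = tm.player_id
   GROUP  BY tm.team_id
   ) tp ON tp.team_id = er.team_id
""").fetchall()

print(len(naive), len(pre_aggregated))  # 2 1
```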
If you do need GROUP BY in the outer query, and event_registrations.id is indeed the PRIMARY KEY, then this:
GROUP BY er.id, er.number_of_players, er.state, er.created_at
... is just a noisier way of saying:
GROUP BY er.id
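Since Postgres 9.1, the primary key covers all columns of a table in the GROUP BY clause.
But you don't need it at all.
Finally, the core problem is solved by first joining to special_places conditionally, then joining to special_events, again conditionally. Missing columns are filled with NULL values: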
LEFT JOIN special_places sp ON sp.id = er.scoreable_id AND er.scoreable_type = 'SpecialPlace'
LEFT JOIN special_events se ON se.id = er.scoreable_id AND er.scoreable_type = 'SpecialEvent'
OR se.id = sp.special_event_id AND er.scoreable_type = 'SpecialPlace'
Strictly speaking, the last AND er.scoreable_type = 'SpecialPlace' is redundant, because sp.special_event_id would be NULL otherwise. I kept it for clarity.
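A small sketch of that two-step conditional join, using rows where the same scoreable_id means different things depending on scoreable_type. It runs on SQLite through Python's sqlite3 module so that it is self-contained; the reduced schema and the sample rows are assumptions for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE special_events (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE special_places (id INTEGER PRIMARY KEY, special_event_id INTEGER);
CREATE TABLE event_registrations (
    id INTEGER PRIMARY KEY, scoreable_id INTEGER, scoreable_type TEXT);
INSERT INTO special_events VALUES (1, 'Quiz Night'), (2, 'Finals');
INSERT INTO special_places VALUES (1, 2);  -- place 1 belongs to event 2
-- Both registrations carry scoreable_id = 1, but mean different things.
INSERT INTO event_registrations VALUES
    (100, 1, 'SpecialEvent'),
    (101, 1, 'SpecialPlace');
""")

# special_places is joined first; special_events is then joined once, either
# directly or through the place found in the previous step.
rows = con.execute("""
SELECT er.id, sp.id AS special_place_id, se.name AS special_event_name
FROM   event_registrations er
LEFT   JOIN special_places sp ON sp.id = er.scoreable_id
                             AND er.scoreable_type = 'SpecialPlace'
LEFT   JOIN special_events se ON se.id = er.scoreable_id
                             AND er.scoreable_type = 'SpecialEvent'
                              OR se.id = sp.special_event_id
                             AND er.scoreable_type = 'SpecialPlace'
ORDER  BY er.id
""").fetchall()

print(rows)  # [(100, None, 'Quiz Night'), (101, 1, 'Finals')]
```

Registration 100 has no place, so special_place_id comes back as NULL (None in Python), while registration 101 reaches its event through place 1.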
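Answer 2 (score: 0)
Mathematically, join order does not affect your result, at least for inner joins (it does affect efficiency).
That said, many RDBMS implementations (Postgres included) are able to pick the join order with the lowest cost.
If you want to force a particular join order (even though it gives the same answer), you can try using parentheses. Even then, I'm not sure the query optimizer won't rewrite the query tree and change the join order to improve performance.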