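I recently added a many-to-many JOIN to one of my queries to support a "tags" feature. The many-to-many join itself works fine, but it now causes a previously working part of the query to output each record twice.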
SELECT v.*
FROM "Server" AS s
JOIN "Vote" AS v ON (s.id = v."serverId")
JOIN "_ServerToTag" st ON (s.id = st."A")
OFFSET 0 LIMIT 25;
id | createdAt | authorId | serverId
-----+-------------------------+----------+----------
190 | 2020-12-23 15:47:25.476 | 6667 | 3
190 | 2020-12-23 15:47:25.476 | 6667 | 3
194 | 2020-12-21 15:47:25.476 | 6667 | 3
194 | 2020-12-21 15:47:25.476 | 6667 | 3
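In the example above:
Server is my main table, which contains a bunch of entries. Think of them as Reddit posts: they have a title and content, and use the Vote table to count "upvotes".
 id | title
----+-------------------------------
 3 | test server 3
Vote is a very simple table: it holds the timestamp of the "upvote", who created it, and the Server.id it is assigned to.
_ServerToTag is a table with two columns, A and B. It connects Server to another table that contains the tags.
 A | B
---+---
 3 | 1
 3 | 2
The above is a greatly simplified query. In reality, I sum over the query results to get the total number of votes.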
The desired result is that the rows are not duplicated:
id | createdAt | authorId | serverId
-----+-------------------------+----------+----------
190 | 2020-12-23 15:47:25.476 | 6667 | 3
194 | 2020-12-21 15:47:25.476 | 6667 | 3
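I'm really not sure why this happens, so I have no idea how to fix it.
Any help would be greatly appreciated.
Edit:
DISTINCT works if I only want to query the Vote table, but not in a more complex query. In my case, it looks more like this: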
SELECT s.id, s.title, sum(case WHEN v."createdAt" >= '2020-12-01' AND v."createdAt" < '2021-01-01'
THEN 1 ELSE 0 END ) AS "voteCount"
FROM "Server" AS s
LEFT JOIN "Vote" AS v ON (s.id = v."serverId")
LEFT JOIN "_ServerToTag" st ON (s.id = st."A")
GROUP BY s.id, s.title;
id | title | voteCount
----+-------------------------------+-----------
3 | test server 3 | 4
In the above, I only need the voteCount column to be DISTINCT.
SELECT s.id, s.title, sum(DISTINCT case WHEN v."createdAt" >= '2020-12-01' AND v."createdAt" < '2021-01-01'
THEN 1 ELSE 0 END ) AS "voteCount"
FROM "Server" AS s
LEFT JOIN "Vote" AS v ON (s.id = v."serverId")
LEFT JOIN "_ServerToTag" st ON (s.id = st."A")
GROUP BY s.id, s.title;
id | title | voteCount
----+-------------------------------+-----------
3 | test server 3 | 1
The above kind of works, but it seems to count only one vote even when there are several.
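Answer 0 (score: 0)
The problem appears to be the join you added to _ServerToTag. Since there are multiple rows in _ServerToTag for each row in Server, the query returns multiple rows for each server, one for each matching row in _ServerToTag.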
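The row multiplication described above can be reproduced in a small sketch. It uses Python's standard-library sqlite3 module as a stand-in for PostgreSQL (an assumption made only so the example runs on its own), with the sample rows from the question; the column types are guesses.

```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL schema in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "Server" (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE "Vote" (id INTEGER PRIMARY KEY, "createdAt" TEXT,
                     "authorId" INTEGER, "serverId" INTEGER);
CREATE TABLE "_ServerToTag" ("A" INTEGER, "B" INTEGER);
INSERT INTO "Server" VALUES (3, 'test server 3');
INSERT INTO "Vote" VALUES (190, '2020-12-23 15:47:25.476', 6667, 3),
                          (194, '2020-12-21 15:47:25.476', 6667, 3);
INSERT INTO "_ServerToTag" VALUES (3, 1), (3, 2);
""")

# Without the tag join: one row per vote.
without_tags = conn.execute("""
    SELECT v.id FROM "Server" AS s
    JOIN "Vote" AS v ON s.id = v."serverId"
    ORDER BY v.id
""").fetchall()

# With the tag join: every vote row is repeated once per tag of its server.
with_tags = conn.execute("""
    SELECT v.id FROM "Server" AS s
    JOIN "Vote" AS v ON s.id = v."serverId"
    JOIN "_ServerToTag" AS st ON s.id = st."A"
    ORDER BY v.id
""").fetchall()

print(without_tags)  # [(190,), (194,)]
print(with_tags)     # [(190,), (190,), (194,), (194,)]
```

Server 3 has two tags, so each of its two votes appears twice, which matches the duplicated output in the question.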
It appears that _ServerToTag was added to the query so that it includes only servers that have tags. If that is your intent, you can use:
SELECT v.id, v."authorId", v."serverId", COUNT(DISTINCT v."createdAt") AS TOTAL_VOTES
FROM "Server" AS s
INNER JOIN "Vote" AS v
ON s.id = v."serverId"
INNER JOIN (SELECT DISTINCT "A" FROM "_ServerToTag") st
ON s.id = st."A"
WHERE v."createdAt" >= '2020-12-01' AND
v."createdAt" < '2021-01-01'
GROUP BY v.id, v."authorId", v."serverId"
OFFSET 0 LIMIT 25
or
SELECT v.id, v."authorId", v."serverId", COUNT(DISTINCT v."createdAt") AS TOTAL_VOTES
FROM "Server" AS s
INNER JOIN "Vote" AS v
ON s.id = v."serverId"
WHERE v."createdAt" >= '2020-12-01' AND
v."createdAt" < '2021-01-01' AND
s.id IN (SELECT "A" FROM "_ServerToTag")
GROUP BY v.id, v."authorId", v."serverId"
OFFSET 0 LIMIT 25
which may better convey the intent of the query.
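Both variants only test whether a server has at least one tag, so they should not multiply vote rows. A minimal sketch of that behaviour follows, again using Python's sqlite3 as a stand-in for PostgreSQL; server 5 and vote 200 are hypothetical rows added here to show that untagged servers are filtered out.

```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL schema in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "Server" (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE "Vote" (id INTEGER PRIMARY KEY, "createdAt" TEXT,
                     "authorId" INTEGER, "serverId" INTEGER);
CREATE TABLE "_ServerToTag" ("A" INTEGER, "B" INTEGER);
INSERT INTO "Server" VALUES (3, 'test server 3'), (5, 'untagged server');
INSERT INTO "Vote" VALUES (190, '2020-12-23 15:47:25.476', 6667, 3),
                          (194, '2020-12-21 15:47:25.476', 6667, 3),
                          (200, '2020-12-24 10:00:00.000', 6667, 5);
INSERT INTO "_ServerToTag" VALUES (3, 1), (3, 2);
""")

# Variant 1: join against the de-duplicated list of tagged server ids.
derived_table = conn.execute("""
    SELECT v.id FROM "Server" AS s
    JOIN "Vote" AS v ON s.id = v."serverId"
    JOIN (SELECT DISTINCT "A" FROM "_ServerToTag") AS st ON s.id = st."A"
    ORDER BY v.id
""").fetchall()

# Variant 2: filter with IN, which never adds rows to the result.
in_filter = conn.execute("""
    SELECT v.id FROM "Server" AS s
    JOIN "Vote" AS v ON s.id = v."serverId"
    WHERE s.id IN (SELECT "A" FROM "_ServerToTag")
    ORDER BY v.id
""").fetchall()

print(derived_table)  # [(190,), (194,)]
print(in_filter)      # [(190,), (194,)]
```

Each vote of the tagged server appears once, and the vote on the untagged server is excluded.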
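If you want to be able to count entries that have no votes, you need to use an outer join to pull in the (possibly nonexistent) votes, and then use a CASE expression to count only the votes that exist: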
SELECT s.id, v.id, v."authorId", v."serverId",
CASE
WHEN v.id IS NULL THEN 0
ELSE COUNT(DISTINCT v."createdAt")
END AS TOTAL_VOTES
FROM "Server" AS s
LEFT OUTER JOIN "Vote" AS v
-- The date filter belongs in ON: in WHERE it would discard servers with no votes
ON s.id = v."serverId" AND
v."createdAt" >= '2020-12-01' AND
v."createdAt" < '2021-01-01'
WHERE s.id IN (SELECT "A" FROM "_ServerToTag")
GROUP BY s.id, v.id, v."authorId", v."serverId"
OFFSET 0 LIMIT 25
You may not actually need the CASE expression, since COUNT ignores NULLs. You can probably get away with:
SELECT s.id, v.id, v."authorId", v."serverId",
COUNT(DISTINCT v."createdAt") AS TOTAL_VOTES
FROM "Server" AS s
LEFT OUTER JOIN "Vote" AS v
-- The date filter belongs in ON: in WHERE it would discard servers with no votes
ON s.id = v."serverId" AND
v."createdAt" >= '2020-12-01' AND
v."createdAt" < '2021-01-01'
WHERE s.id IN (SELECT "A" FROM "_ServerToTag")
GROUP BY s.id, v.id, v."authorId", v."serverId"
OFFSET 0 LIMIT 25
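Answer 1 (score: 0)
Well, after the answers I received didn't really solve my problem, I asked a friend for help.
I think my query was overly complex and confusing, and I was advised to use subqueries to reduce the complexity and make it easier to manage.
My query now looks like this: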
SELECT
s.id
, s.title
, COALESCE(v."VOTES", 0) AS "voteCount"
FROM "Server" AS s
-- Join tags
INNER JOIN
(
SELECT
st."A"
, json_agg(
json_build_object(
'id',
t.id,
'tagName',
t."tagName"
)
) as "tagsArray"
FROM
"_ServerToTag" AS st
INNER JOIN
"Tag" AS t
ON
t.id = st."B"
GROUP BY
st."A"
) AS tag
ON
tag."A" = s.id
-- Count votes
LEFT JOIN
(
SELECT
"serverId"
, COUNT(*) AS "VOTES"
FROM
"Vote" as v
WHERE
v."createdAt" >= '2020-12-01' AND
v."createdAt" < '2021-01-01'
GROUP BY "serverId"
) as v
ON
s.id = v."serverId"
OFFSET 0 LIMIT 25;
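This does exactly the same thing, but by selecting exactly what I need directly in the joins, it is more readable and I have better control over the returned data.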
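To see why aggregating inside the subqueries avoids the inflated count, here is a rough sketch that compares the original join-then-sum query with a pre-aggregated version. It assumes Python's sqlite3 as a stand-in for PostgreSQL; since json_agg is PostgreSQL-specific, the tag subquery here only counts tags ("tagCount" is a hypothetical column), and server 4 is a hypothetical tagged server with no votes.

```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL schema in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "Server" (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE "Vote" (id INTEGER PRIMARY KEY, "createdAt" TEXT,
                     "authorId" INTEGER, "serverId" INTEGER);
CREATE TABLE "_ServerToTag" ("A" INTEGER, "B" INTEGER);
INSERT INTO "Server" VALUES (3, 'test server 3'), (4, 'no votes yet');
INSERT INTO "Vote" VALUES (190, '2020-12-23 15:47:25.476', 6667, 3),
                          (194, '2020-12-21 15:47:25.476', 6667, 3);
INSERT INTO "_ServerToTag" VALUES (3, 1), (3, 2), (4, 1);
""")

# Join first, aggregate afterwards: each vote is counted once per tag.
naive = conn.execute("""
    SELECT s.id,
           SUM(CASE WHEN v."createdAt" >= '2020-12-01'
                     AND v."createdAt" < '2021-01-01' THEN 1 ELSE 0 END)
    FROM "Server" AS s
    LEFT JOIN "Vote" AS v ON s.id = v."serverId"
    LEFT JOIN "_ServerToTag" AS st ON s.id = st."A"
    GROUP BY s.id
    ORDER BY s.id
""").fetchall()

# Aggregate each side first, so every subquery yields at most one row per server.
pre_aggregated = conn.execute("""
    SELECT s.id, COALESCE(v."VOTES", 0)
    FROM "Server" AS s
    INNER JOIN (
        SELECT "A", COUNT(*) AS "tagCount"
        FROM "_ServerToTag"
        GROUP BY "A"
    ) AS tag ON tag."A" = s.id
    LEFT JOIN (
        SELECT "serverId", COUNT(*) AS "VOTES"
        FROM "Vote"
        WHERE "createdAt" >= '2020-12-01' AND "createdAt" < '2021-01-01'
        GROUP BY "serverId"
    ) AS v ON s.id = v."serverId"
    ORDER BY s.id
""").fetchall()

print(naive)           # [(3, 4), (4, 0)]
print(pre_aggregated)  # [(3, 2), (4, 0)]
```

Because each derived table is grouped by server id before the join, the outer query sees one row per server and the vote count is no longer multiplied by the number of tags.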