I have the following tables, which are relevant to this question:
project
- projectId (PK)
- projectTitle
- projectDescription
- etc..
tag
- tagId
- tagName
- tagDescription
- etc...
project_tag
- projectId (PK / FK -> project.projectId)
- tagId (PK / FK -> tag.tagId)
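I am implementing a tagging feature similar to StackOverflow's, in the sense that it lets you view a list of items (projects, in my case) tagged with one or more tags. What I want to do is select all projects that are tagged with at least all the tags I supply to the query, while also retrieving all the tags that go with each project.
What I have now works, but I feel that the subquery in the WHERE IN clause is very inefficient, since it might be executed for every row, right?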
SELECT
    `project`.*,
    GROUP_CONCAT( DISTINCT `tagName` ORDER BY `tagName` SEPARATOR ' ' ) as `tags`
FROM
    `project`
JOIN
    `project_tag`
    USING ( `projectId` )
JOIN
    `tag`
    USING ( `tagId` )
WHERE
    `projectId` IN (
        SELECT
            `projectId`
        FROM
            `project_tag`
        JOIN
            `tag`
            USING ( `tagId` )
        WHERE
            `tagName` IN ( 'the', 'tags' )
        GROUP BY
            `projectId`
        HAVING
            COUNT( DISTINCT `tagName` ) = 2 # the amount of tags in the IN clause
    )
GROUP BY
    `projectId`
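Is there a way to join on tag so that I can retrieve all the tags of the JOINed projects at once, while only JOINing projects that match (at least) all the tags I supply to the query, without having to use the WHERE IN clause?
To illustrate the desired result, consider the following example projects: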
projectId: 1, tags: php, cms, webdevelopment
projectId: 2, tags: php, cms, ajax
projectId: 3, tags: c#, cms, webdevelopment
Searching for the tags php and cms should yield (these are not formatted as actual MySQL query results, they only illustrate the relevant data):
projectId: 1, tags: php, cms, webdevelopment
projectId: 2, tags: php, cms, ajax
And not just:
projectId: 1, tags: php, cms
projectId: 2, tags: php, cms
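Answer 0: (score: 1)
The subquery is non-correlated (it can be taken out and run on its own without error), so in principle it only needs to be executed once. Note that MySQL versions before 5.6 often rewrote IN subqueries into a correlated form anyway, so checking the plan with EXPLAIN is worthwhile.
A more efficient approach is to let the subquery exclude the non-matching projects first, and then join its result to the other tables. Something like this: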
SELECT project.*, GROUP_CONCAT( DISTINCT tagName ORDER BY tagName SEPARATOR ' ' ) as tags
FROM (SELECT projectId, COUNT( DISTINCT tagName ) AS TagCount
      FROM tag
      INNER JOIN project_tag USING (tagId)
      WHERE tagName IN ( 'the', 'tags' )
      GROUP BY projectId
      HAVING TagCount = 2) Sub1 # 2 = the number of tags in the IN clause
INNER JOIN project ON Sub1.projectId = project.projectId
# qualified ON and GROUP BY: projectId exists in both Sub1 and project
INNER JOIN project_tag ON project_tag.projectId = project.projectId
INNER JOIN tag USING (tagId)
GROUP BY project.projectId
I am assuming you have an index on tagName.
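As a rough check of the derived-table approach against the example projects from the question, here is a minimal sketch that runs an equivalent query in an in-memory SQLite database from Python. SQLite is only a stand-in for MySQL here, so some details are assumptions: the helper name `projects_with_all_tags` is hypothetical, the joins are written with explicit ON conditions, and SQLite's single-argument GROUP_CONCAT joins with commas instead of MySQL's `SEPARATOR ' '`, so the tags are split and sorted in Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE project (projectId INTEGER PRIMARY KEY, projectTitle TEXT);
CREATE TABLE tag (tagId INTEGER PRIMARY KEY, tagName TEXT UNIQUE);
CREATE TABLE project_tag (
    projectId INTEGER REFERENCES project(projectId),
    tagId INTEGER REFERENCES tag(tagId),
    PRIMARY KEY (projectId, tagId)
);
""")

# Example projects taken from the question.
sample = {
    1: ["php", "cms", "webdevelopment"],
    2: ["php", "cms", "ajax"],
    3: ["c#", "cms", "webdevelopment"],
}
for project_id, names in sample.items():
    conn.execute("INSERT INTO project VALUES (?, ?)",
                 (project_id, f"project {project_id}"))
    for name in names:
        # The UNIQUE constraint on tagName makes repeated tags a no-op.
        conn.execute("INSERT OR IGNORE INTO tag (tagName) VALUES (?)", (name,))
        conn.execute(
            "INSERT INTO project_tag SELECT ?, tagId FROM tag WHERE tagName = ?",
            (project_id, name),
        )


def projects_with_all_tags(conn, wanted):
    """Return {projectId: sorted list of ALL its tags} for projects
    that carry at least every tag in `wanted` (hypothetical helper)."""
    wanted = sorted(set(wanted))  # duplicates would break the count check
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT project.projectId, GROUP_CONCAT(DISTINCT tag.tagName) AS tags
        FROM (
            SELECT project_tag.projectId
            FROM tag
            INNER JOIN project_tag ON project_tag.tagId = tag.tagId
            WHERE tag.tagName IN ({placeholders})
            GROUP BY project_tag.projectId
            HAVING COUNT(DISTINCT tag.tagName) = ?
        ) AS Sub1
        INNER JOIN project ON Sub1.projectId = project.projectId
        INNER JOIN project_tag ON project_tag.projectId = project.projectId
        INNER JOIN tag ON tag.tagId = project_tag.tagId
        GROUP BY project.projectId
        ORDER BY project.projectId
    """
    rows = conn.execute(sql, [*wanted, len(wanted)]).fetchall()
    return {pid: sorted(tags.split(",")) for pid, tags in rows}


result = projects_with_all_tags(conn, ["php", "cms"])
for project_id, tags in result.items():
    print(project_id, tags)
```

Passing the tag count as a bound parameter keeps the HAVING check in sync with the IN list, which is the part that is easy to get wrong when the number is hard-coded as in the queries above.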