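MySQL query using HAVING incorrectly limits results

Time: 2013-09-04 21:53:20

Tags: mysql

I have an advanced search form that offers a number of ways to filter a search. Here is a simplified version (leaving out the keyword text input, the date range search, and the other select menus):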

Topic: 
<select><option>any</option><option>all</option></select>
[] Aging
[] Environment
[] Health
[] Hunger
[] Poverty

Document type: 
<select><option>any</option><option>all</option></select>
[] Case Study
[] Policy Brief
[] Whitepaper

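If someone chooses "any" when selecting multiple topics or document types, the query needs to include, for example, topic = "Aging" OR topic = "Health".

If someone chooses "all" when selecting multiple topics or document types, the query needs to include, for example, topic = "Aging" AND topic = "Health".

We default to AND between the different filters. So when you search for documents categorized under Aging that are also classified as whitepapers, the query is: topic = "Aging" AND doctype = "whitepaper".

The problem: we have a query that works when searching with "any". But when searching with "all", MySQL's EXPLAIN command reports an "Impossible WHERE". :(

Here is the query that works when someone selects "any" for both topic and document type: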

SELECT 
DISTINCT * 
FROM research 
JOIN link_resource_doctype ON link_resource_doctype.resource_id = research.research_id 
JOIN doctype ON doctype.id = link_resource_doctype.doctype_id 
JOIN link_resource_issue_area ON link_resource_issue_area.resource_id = research.research_id 
JOIN issue_area ON issue_area.id = link_resource_issue_area.issue_area_id 
WHERE approved = '1' 
AND (doctype.identifier = 'case_study' OR doctype.identifier = 'whitepaper') 
AND (issue_area.identifier = 'aging' OR issue_area.identifier = 'health')

Here is the same query when someone selects "all" for both topic and document type (it also fails when someone selects only topics or only document types):

SELECT 
DISTINCT * 
FROM research 
JOIN link_resource_doctype ON link_resource_doctype.resource_id = research.research_id 
JOIN doctype ON doctype.id = link_resource_doctype.doctype_id 
JOIN link_resource_issue_area ON link_resource_issue_area.resource_id = research.research_id 
JOIN issue_area ON issue_area.id = link_resource_issue_area.issue_area_id 
WHERE approved = '1' 
AND (doctype.identifier = 'case_study' AND doctype.identifier = 'whitepaper') 
AND (issue_area.identifier = 'aging' AND issue_area.identifier = 'health')
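
The "Impossible WHERE" follows from how WHERE is evaluated: it tests one joined row at a time, and within a single row a column holds one value, so it can never equal both 'aging' and 'health'. A minimal sketch of that behaviour, using Python's built-in sqlite3 module and a hypothetical, simplified table `doc_topic` (not the schema from the question):

```python
import sqlite3

# Hypothetical, simplified schema: one row per (document, topic) pair.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE doc_topic (doc_id INTEGER, topic TEXT);
    INSERT INTO doc_topic VALUES (1, 'aging'), (1, 'health'), (2, 'aging');
""")

# OR matches every row that carries either value.
any_rows = conn.execute(
    "SELECT DISTINCT doc_id FROM doc_topic "
    "WHERE topic = 'aging' OR topic = 'health' ORDER BY doc_id"
).fetchall()

# AND asks a single row's topic to equal two values at once, which is never true,
# even though document 1 does have both topics (on two separate rows).
all_rows = conn.execute(
    "SELECT DISTINCT doc_id FROM doc_topic "
    "WHERE topic = 'aging' AND topic = 'health'"
).fetchall()

print(any_rows)  # [(1,), (2,)]
print(all_rows)  # []
```

An "all" match therefore has to be checked across the group of rows belonging to a document rather than within one row.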

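A possible solution, with one catch: I found this post on Stack Overflow, "Select row belonging to multiple categories", which contains a query that I think could solve the problem when someone selects "all". Here it is: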

SELECT 
DISTINCT * 
FROM research 
JOIN link_issue_area ON link_issue_area.resource_id = research.research_id
JOIN issue_area ON issue_area.id = link_issue_area.issue_area_id
JOIN link_doctype ON link_doctype.resource_id = research.research_id
JOIN doctype ON doctype.id = link_doctype.doctype_id
WHERE issue_area.identifier IN ('aging', 'health')
AND 
doctype.identifier IN ('case_study', 'whitepaper')
GROUP BY research.research_id
HAVING COUNT(DISTINCT issue_area.identifier) = 2 
AND 
COUNT(DISTINCT doctype.identifier) = 2

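The problem: this query appears to work for both "any" and "all", with one exception. Suppose a document is categorized under Aging, Health, and Poverty, but the searcher checked only Aging and Health. A document categorized under both checked topics plus the unchecked Poverty will not appear in the search results. I think this is because HAVING COUNT(DISTINCT issue_area.identifier) = 2 excludes any document whose actual COUNT is greater than 2. Is there a workaround? Or is there a better query to use here?

Any insights, ideas, and help are much appreciated! Thanks!

Here is a SQLfiddle as well: http://sqlfiddle.com/#!2/847362/1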
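
One way to test that suspicion: in standard SQL (MySQL included), WHERE filters rows before GROUP BY and HAVING run, so as long as the IN list is in the WHERE clause, the Poverty row should be discarded before anything is counted. The sketch below checks this with Python's built-in sqlite3 module and a hypothetical, simplified table `doc_topic`; it is not the schema from the fiddle, so a missing document there may have a different cause.

```python
import sqlite3

# Hypothetical, simplified schema: one row per (document, topic) pair.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE doc_topic (doc_id INTEGER, topic TEXT);
    INSERT INTO doc_topic VALUES
        (1, 'aging'), (1, 'health'), (1, 'poverty'),  -- both topics plus an extra one
        (2, 'aging'),                                  -- only one of the two topics
        (3, 'aging'), (3, 'health');                   -- exactly the two topics
""")

# WHERE drops the 'poverty' row first, so document 1 is counted with 2 topics, not 3.
query = """
    SELECT doc_id
    FROM doc_topic
    WHERE topic IN ('aging', 'health')
    GROUP BY doc_id
    HAVING COUNT(DISTINCT topic) = 2
    ORDER BY doc_id
"""
matching = [row[0] for row in conn.execute(query)]
print(matching)  # [1, 3]
```

Document 1 is returned despite its extra topic. The count would only exceed 2 if it were taken over all of a document's topics, for example in a subquery that has no IN filter.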

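2 answers:

Answer 0: (score: 0)

I don't really understand the question, because you haven't shown the expected output. But based on what I do understand, this is what I have so far. Please comment on what is wrong with it: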

A possible solution that may work:

SELECT   research.research_id AS resource_id, research.title
FROM     research
JOIN     link_issue_area ON link_issue_area.resource_id = research.research_id
JOIN     link_doctype ON link_doctype.resource_id = research.research_id
JOIN (SELECT resource_id, COUNT(DISTINCT issue_area_id) AS ISSUE_COUNT FROM link_issue_area
      GROUP BY resource_id) TB1_COUNT ON TB1_COUNT.resource_id = research.research_id
JOIN (SELECT resource_id, COUNT(DISTINCT doctype_id) AS DOCTYPE_COUNT FROM link_doctype
     GROUP BY resource_id) TB2_COUNT ON TB2_COUNT.resource_id = research.research_id
WHERE    issue_area_id IN (5,10)
AND       doctype_id IN (3,18)
AND   TB1_COUNT.ISSUE_COUNT = 2
AND TB2_COUNT.DOCTYPE_COUNT = 2
GROUP BY resource_id
LIMIT 0,1

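Here is the SQLFiddle.

Answer 1: (score: 0)

The existing query in your SQLFiddle is all you need, as long as you include the HAVING conditions dynamically, only when all of the selected options are required. Like this: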

SELECT   research.research_id AS resource_id, research.title 
FROM     research 
JOIN     link_issue_area ON link_issue_area.resource_id = research.research_id 
JOIN     link_doctype ON link_doctype.resource_id = research.research_id 
WHERE    issue_area_id IN (5,10) /* dynamically-generated list of issues */
  AND    doctype_id IN (3,18) /* dynamically-generated list of doc types */
GROUP BY resource_id
HAVING 1=1
  AND    COUNT(DISTINCT issue_area_id) = 2 /* dynamically-generated count of
      user-selected issues - only included when all specified issues required */
  AND    COUNT(DISTINCT doctype_id) = 2 /* dynamically-generated count of
      user-selected doc types - only included when all specified types required*/

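Including the dummy condition 1=1 means you can always emit the HAVING clause, even when neither option is set to "all".

So the dynamically generated query that returns resources with all of issues 5 and 10 and all of document types 3 and 18 looks like this: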

SELECT   research.research_id AS resource_id, research.title 
FROM     research 
JOIN     link_issue_area ON link_issue_area.resource_id = research.research_id 
JOIN     link_doctype ON link_doctype.resource_id = research.research_id 
WHERE    issue_area_id IN (5,10)
  AND    doctype_id IN (3,18)
GROUP BY resource_id
HAVING 1=1
  AND    COUNT(DISTINCT issue_area_id) = 2
  AND    COUNT(DISTINCT doctype_id) = 2

SQLFiddle here

The dynamically generated query that returns resources with any of issues 10 and 20 and any of document types 15 and 18 looks like this:

SELECT   research.research_id AS resource_id, research.title 
FROM     research 
JOIN     link_issue_area ON link_issue_area.resource_id = research.research_id 
JOIN     link_doctype ON link_doctype.resource_id = research.research_id 
WHERE    issue_area_id IN (10,20)
  AND    doctype_id IN (15,18)
GROUP BY resource_id
HAVING 1=1

SQLFiddle here
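
The dynamic generation described above could be sketched as follows. This is only an illustration under stated assumptions: the function name `build_search`, its parameters, and the sample rows are hypothetical, it uses Python's built-in sqlite3 module rather than MySQL (a MySQL driver would typically use `%s` placeholders instead of `?`), and it assumes both ID lists are non-empty.

```python
import sqlite3

def build_search(issue_ids, doctype_ids, all_issues, all_doctypes):
    """Return (sql, params). Hypothetical helper; assumes non-empty ID lists."""
    issue_marks = ", ".join("?" * len(issue_ids))
    doctype_marks = ", ".join("?" * len(doctype_ids))
    # Start from the dummy condition so HAVING is always valid.
    having = ["1=1"]
    if all_issues:    # "all" topics: every selected issue must be present
        having.append(f"COUNT(DISTINCT issue_area_id) = {len(issue_ids)}")
    if all_doctypes:  # "all" doc types: every selected type must be present
        having.append(f"COUNT(DISTINCT doctype_id) = {len(doctype_ids)}")
    sql = f"""
        SELECT research.research_id, research.title
        FROM research
        JOIN link_issue_area ON link_issue_area.resource_id = research.research_id
        JOIN link_doctype ON link_doctype.resource_id = research.research_id
        WHERE issue_area_id IN ({issue_marks})
          AND doctype_id IN ({doctype_marks})
        GROUP BY research.research_id, research.title
        HAVING {' AND '.join(having)}
        ORDER BY research.research_id
    """
    return sql, [*issue_ids, *doctype_ids]

# Made-up sample data: Doc A has issues 5 and 10, Doc B has only issue 5.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE research (research_id INTEGER, title TEXT);
    CREATE TABLE link_issue_area (resource_id INTEGER, issue_area_id INTEGER);
    CREATE TABLE link_doctype (resource_id INTEGER, doctype_id INTEGER);
    INSERT INTO research VALUES (1, 'Doc A'), (2, 'Doc B');
    INSERT INTO link_issue_area VALUES (1, 5), (1, 10), (2, 5);
    INSERT INTO link_doctype VALUES (1, 3), (1, 18), (2, 3), (2, 18);
""")

all_rows = conn.execute(*build_search([5, 10], [3, 18], True, True)).fetchall()
any_rows = conn.execute(*build_search([5, 10], [3, 18], False, False)).fetchall()
print(all_rows)  # [(1, 'Doc A')]
print(any_rows)  # [(1, 'Doc A'), (2, 'Doc B')]
```

The IDs are passed as bound parameters rather than concatenated into the SQL string; only the counts, which come from `len()`, are interpolated directly.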