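This takes a bit of explaining (mostly because I can't use the word "question" in the title of a question).
I have a matchmaker quiz with the following tables (simplified):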
CREATE TABLE `Quiz` (
  `quiz_id` int(10) unsigned NOT NULL,
  `code` varchar(20) DEFAULT NULL,
  `title` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`quiz_id`),
  UNIQUE KEY `Quiz_1` (`code`)
);
CREATE TABLE `Quiz_Question` (
  `quiz_id` int(10) unsigned NOT NULL,
  `question_id` int(10) unsigned NOT NULL,
  `question` varchar(250) DEFAULT NULL,
  `type` int(10) unsigned NOT NULL, -- Lookup table of type of question: boolean, radio, select, multiselect
  PRIMARY KEY (`question_id`)
);
CREATE TABLE `Quiz_Answer` (
  `question_id` int(10) unsigned NOT NULL,
  `answer_id` int(10) unsigned NOT NULL,
  `answer` varchar(250) DEFAULT NULL,
  PRIMARY KEY (`answer_id`)
);
CREATE TABLE `Quiz_Response` (
  `user_id` int(10) unsigned NOT NULL,
  `quiz_id` int(10) unsigned NOT NULL,
  `question_id` int(10) unsigned NOT NULL,
  `answer_id` int(10) unsigned DEFAULT NULL,
  UNIQUE KEY `Response_1` (`user_id`,`question_id`,`answer_id`),
  KEY `Response_2` (`question_id`,`answer_id`)
);
So far, all pretty straightforward.
Previously, the query looked like this (simplified):
SELECT u.login, COUNT(u.user_id) AS matches, ...
FROM User u
INNER JOIN Quiz_Response rep ON u.user_id = rep.user_id
WHERE u.active = 1
  AND (
    (rep.question_id = 3 AND rep.answer_id IN (20, 24)) OR
    (rep.question_id = 10 AND rep.answer_id IN (83, 84, 85))
  )
GROUP BY u.user_id
HAVING matches >= 2
ORDER BY u.login
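Note: I've removed things from the CREATE TABLE statements and the query, such as active flags, display order, blocked users, date ranges and so on, to get down to the core problem.
So if a user answered question 3 with 20 or 24, they show up in the result set once, and if they answered question 10 with 83, 84 or 85, they show up again. The query then counts how many times each user shows up, and if that count is equal to or greater than the number of questions being matched, the user is considered a match (in this case the matcher checked two questions, so there should be at least 2 entries, i.e. matches).
My problem is that I've introduced multiple-choice matching. The result is that a single question can produce more than one matching row, which throws off the count.
So if the searcher says they are looking for people who answered question 5 with A, B or C, and a user says they like A, B and C, that becomes three matches, which effectively cancels out the other two questions (three things were searched for, and three matches were found, all from the same question).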
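To make the over-counting concrete, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL. The table is cut down to the three columns involved, and the IDs for question 5 (answers 51, 52 and 53 standing for A, B and C) are made up for illustration; they are not from my real data.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; only the relevant columns are kept.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Quiz_Response (user_id INTEGER, question_id INTEGER, answer_id INTEGER)"
)

# Hypothetical data: user 1 picked all three options (51, 52, 53) on the
# multi-select question 5 and answered nothing else.
conn.executemany(
    "INSERT INTO Quiz_Response VALUES (?, ?, ?)",
    [(1, 5, 51), (1, 5, 52), (1, 5, 53)],
)

# The searcher asks about three questions (5, 3 and 10), so the threshold is 3.
overcount_rows = conn.execute(
    """
    SELECT rep.user_id, COUNT(rep.user_id) AS matches
    FROM Quiz_Response rep
    WHERE (rep.question_id = 5 AND rep.answer_id IN (51, 52, 53))
       OR (rep.question_id = 3 AND rep.answer_id IN (20, 24))
       OR (rep.question_id = 10 AND rep.answer_id IN (83, 84, 85))
    GROUP BY rep.user_id
    HAVING COUNT(rep.user_id) >= 3
    """
).fetchall()

# User 1 clears the threshold even though only one question matched.
print(overcount_rows)
```

The user reaches the threshold of 3 from question 5 alone, although questions 3 and 10 were never answered.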
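So my question is: how do I check each given question so that it scores only 1 point, even when several answers to that single question match?
Hope that all makes sense.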
Answer 0 (score: 0)
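Instead of counting u.user_id, count distinct rep.question_id, so that each matched question contributes at most 1 no matter how many of its answers match: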
SELECT u.login, u.user_id, COUNT(DISTINCT rep.question_id) AS matches
FROM User u
INNER JOIN Quiz_Response rep ON u.user_id = rep.user_id
WHERE u.active = 1
  AND (
    (rep.question_id = 3 AND rep.answer_id IN (20, 24)) OR
    (rep.question_id = 10 AND rep.answer_id IN (83, 84, 85))
  )
GROUP BY u.user_id
HAVING matches >= 2
ORDER BY u.login;
So if my Quiz_Response table looks like this:
+-------------+---------+-------------+-----------+---------+
| response_id | quiz_id | question_id | answer_id | user_id |
+-------------+---------+-------------+-----------+---------+
| 1 | 1 | 1 | 4 | 3 |
| 2 | 2 | 3 | 20 | 2 |
| 3 | 2 | 3 | 24 | 2 |
| 4 | 4 | 10 | 83 | 1 |
| 5 | 4 | 10 | 84 | 1 |
| 6 | 4 | 10 | 85 | 1 |
| 7 | 2 | 3 | 20 | 4 |
| 8 | 1 | 1 | 1 | 4 |
| 9 | 2 | 3 | 24 | 4 |
| 10 | 4 | 10 | 83 | 4 |
+-------------+---------+-------------+-----------+---------+
The output of the above query would be:
+---------------------+---------+---------+
| login | user_id | matches |
+---------------------+---------+---------+
| 2018-01-01 00:00:00 | 4 | 2 |
+---------------------+---------+---------+
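As a rough cross-check, the sketch below loads the sample rows above into an in-memory SQLite database (a stand-in for MySQL, via Python's built-in sqlite3 module) and runs the query twice: once counting rows per user and once counting distinct questions. The User rows and their logins are placeholders, since the User table is not shown here, and the HAVING clause repeats the aggregate instead of using the alias to stay portable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE User (user_id INTEGER PRIMARY KEY, login TEXT, active INTEGER);
    CREATE TABLE Quiz_Response (
        response_id INTEGER PRIMARY KEY,
        quiz_id INTEGER, question_id INTEGER, answer_id INTEGER, user_id INTEGER
    );
    """
)
# Placeholder users: the logins are invented, all four are marked active.
conn.executemany(
    "INSERT INTO User VALUES (?, ?, 1)",
    [(1, "user1"), (2, "user2"), (3, "user3"), (4, "user4")],
)
# Rows copied from the sample Quiz_Response table above.
conn.executemany(
    "INSERT INTO Quiz_Response VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 1, 4, 3), (2, 2, 3, 20, 2), (3, 2, 3, 24, 2),
        (4, 4, 10, 83, 1), (5, 4, 10, 84, 1), (6, 4, 10, 85, 1),
        (7, 2, 3, 20, 4), (8, 1, 1, 1, 4), (9, 2, 3, 24, 4),
        (10, 4, 10, 83, 4),
    ],
)

# Same query shape as above; only the counting expression varies.
QUERY = """
    SELECT u.user_id, {count_expr} AS matches
    FROM User u
    INNER JOIN Quiz_Response rep ON u.user_id = rep.user_id
    WHERE u.active = 1
      AND ((rep.question_id = 3 AND rep.answer_id IN (20, 24))
        OR (rep.question_id = 10 AND rep.answer_id IN (83, 84, 85)))
    GROUP BY u.user_id
    HAVING {count_expr} >= 2
    ORDER BY u.user_id
"""

# Counting rows: several answers to one question inflate the score.
per_row = conn.execute(QUERY.format(count_expr="COUNT(u.user_id)")).fetchall()
# Counting distinct questions: each question scores at most once.
per_question = conn.execute(
    QUERY.format(count_expr="COUNT(DISTINCT rep.question_id)")
).fetchall()

print(per_row)       # [(1, 3), (2, 2), (4, 3)]
print(per_question)  # [(4, 2)]
```

With row counting, users 1 and 2 pass the threshold on a single question each; with distinct-question counting only user 4, who matched both question 3 and question 10, remains.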