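I want to find out which language each user answers in most often, and return the `user_id`, the `language_id` they answered most, and how many times they answered in it.
I started with a SELECT that returns this table/sub-table of results: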
Table: `sub-selected`
`user_id` `language_id` `answers`
1 1 1
2 1 1
1 2 5
2 2 2
1 4 3
1 5 1
This table returns the `user_id`, the `language_id`, and the number of times the user has answered in that `language_id`. I get it with this query:
SELECT t1.user_id, t2.to_language_id, COUNT(t2.to_language_id) as answers
FROM translation_results as t1
LEFT JOIN translations as t2
ON t2.translation_id = t1.translation_id
GROUP BY t2.to_language_id, t1.user_id
The table structures are:
Table: `translations`
`translation_id` `from_phrase_id` `to_language_id`
Table: `translation_results`
`translation_id` `result_id` PRI-AI `user_id`
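The `translations` table stores all requested translations, and the `translation_results` table stores the answers to those translations along with the corresponding `user_id`.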
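For anyone who wants to reproduce the sub-table locally, here is a minimal sketch that uses Python's built-in `sqlite3` module instead of MySQL. The sample rows, the `build_sample_db` helper, and the choice of making `translation_id` equal to `to_language_id` are assumptions made for illustration; only the two table layouts and the grouping query come from the question.

```python
import sqlite3

# Hypothetical sample data: (user_id, to_language_id) -> number of answers.
# Chosen only so that the counts match the sub-table shown in the question.
ANSWER_COUNTS = {(1, 1): 1, (2, 1): 1, (1, 2): 5, (2, 2): 2, (1, 4): 3, (1, 5): 1}


def build_sample_db():
    """Create an in-memory database with the two tables from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE translations (
            translation_id INTEGER PRIMARY KEY,
            from_phrase_id INTEGER,
            to_language_id INTEGER
        );
        CREATE TABLE translation_results (
            result_id      INTEGER PRIMARY KEY AUTOINCREMENT,
            translation_id INTEGER,
            user_id        INTEGER
        );
    """)
    # Simplification: one requested translation per language, with
    # translation_id set equal to to_language_id.
    languages = sorted({lang for _, lang in ANSWER_COUNTS})
    conn.executemany(
        "INSERT INTO translations VALUES (?, ?, ?)",
        [(lang, 100 + lang, lang) for lang in languages],
    )
    # One translation_results row per individual answer.
    answers = [
        (lang, user)
        for (user, lang), count in ANSWER_COUNTS.items()
        for _ in range(count)
    ]
    conn.executemany(
        "INSERT INTO translation_results (translation_id, user_id) VALUES (?, ?)",
        answers,
    )
    return conn


# The grouping query from the question; ORDER BY is added only for stable output.
SUB_TABLE_SQL = """
    SELECT t1.user_id, t2.to_language_id, COUNT(t2.to_language_id) AS answers
    FROM translation_results AS t1
    LEFT JOIN translations AS t2
        ON t2.translation_id = t1.translation_id
    GROUP BY t2.to_language_id, t1.user_id
    ORDER BY t1.user_id, t2.to_language_id
"""

conn = build_sample_db()
sub_table = conn.execute(SUB_TABLE_SQL).fetchall()
for row in sub_table:
    print(row)
```

The rows come back as `(user_id, to_language_id, answers)` tuples, one per user and language pair.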
So, to summarize the table and get each `user_id`, the `language_id` they answered most, and how many times they answered in that `language_id`, I used:
SELECT t1.user_id, t1.to_language_id, MAX(t1.answers)
FROM (
-- The sub-table
SELECT t1.user_id, t2.to_language_id, COUNT(t2.to_language_id) as answers
FROM translation_results as t1
LEFT JOIN translations as t2
ON t2.translation_id = t1.translation_id
GROUP BY t2.to_language_id, t1.user_id
) as t1
GROUP BY t1.user_id, t1.to_language_id
But this does not collapse the table into the desired structure; instead it returns:
Table: `sub-selected`
`user_id` `language_id` `answers`
1 1 1
1 2 5
1 4 3
1 5 1
2 1 1
2 2 2
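I know this is caused by having two columns in the GROUP BY clause, but if I group only by `user_id` and leave `to_language_id` out of the selected columns, I have no way of knowing which `language_id` was answered most. I have also tried subqueries and a few joins, but whichever columns I select I always end up needing `MAX(t1.answers)`, which ruins any hope of getting the GROUP BY right. How do I collapse the result correctly, instead of having GROUP BY compute `MAX()` for every unique combination of `user_id` and `to_language_id`?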
Answer 0 (score: 1)
To get the `user_id`, the `language_id` they answered most, and how many times they answered in that `language_id`, you can use variables:
SELECT user_id, language_id, answers
FROM (
SELECT user_id, language_id, answers,
@rn:= IF(@uid = user_id,
IF(@uid:=user_id, @rn:=@rn+1, @rn:=@rn+1),
IF(@uid:=user_id, @rn:=1, @rn:=1)) AS rn
FROM (SELECT t1.user_id, t2.to_language_id AS language_id,
COUNT(t2.to_language_id) as answers
FROM translation_results as t1
LEFT JOIN translations as t2
ON t2.translation_id = t1.translation_id
GROUP BY t2.to_language_id, t1.user_id
) t
CROSS JOIN (SELECT @rn:=0, @uid:=0) AS vars
ORDER BY user_id, answers DESC
) s
WHERE s.rn = 1
The query above has one limitation: if several `language_id` values share the maximum answer count for a `user_id`, only one of them is returned.
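The tie behaviour can be illustrated with window functions, which are a different technique from the user variables above (they are available in MySQL 8.0+ and SQLite 3.25+). The sketch below runs against SQLite through Python's `sqlite3` module. The pre-aggregated `sub_selected` table and user 3, who has a tie, are hypothetical. `ROW_NUMBER()` keeps exactly one row per user, while `RANK()` keeps every tied row.

```python
import sqlite3

# Hypothetical pre-aggregated counts: (user_id, language_id, answers).
# User 3 is invented to show a tie between two languages.
ROWS = [
    (1, 1, 1), (1, 2, 5), (1, 4, 3), (1, 5, 1),
    (2, 1, 1), (2, 2, 2),
    (3, 1, 2), (3, 2, 2),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sub_selected (user_id INTEGER, language_id INTEGER, answers INTEGER)"
)
conn.executemany("INSERT INTO sub_selected VALUES (?, ?, ?)", ROWS)

# ROW_NUMBER() numbers rows 1, 2, 3, ... within each user, so rn = 1 keeps
# exactly one row. language_id is added as a tie-breaker to make the pick
# deterministic. Window functions need SQLite 3.25 or newer.
ONE_PER_USER_SQL = """
    SELECT user_id, language_id, answers
    FROM (
        SELECT user_id, language_id, answers,
               ROW_NUMBER() OVER (
                   PARTITION BY user_id
                   ORDER BY answers DESC, language_id
               ) AS rn
        FROM sub_selected
    ) AS ranked
    WHERE rn = 1
    ORDER BY user_id, language_id
"""

# RANK() gives tied rows the same rank, so rk = 1 keeps all languages that
# share the user's maximum count.
ALL_TIES_SQL = """
    SELECT user_id, language_id, answers
    FROM (
        SELECT user_id, language_id, answers,
               RANK() OVER (
                   PARTITION BY user_id
                   ORDER BY answers DESC
               ) AS rk
        FROM sub_selected
    ) AS ranked
    WHERE rk = 1
    ORDER BY user_id, language_id
"""

one_per_user = conn.execute(ONE_PER_USER_SQL).fetchall()
with_ties = conn.execute(ALL_TIES_SQL).fetchall()
print(one_per_user)
print(with_ties)
```

Which variant is appropriate depends on whether a tie should produce one row or several for the same user.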
Another approach is to use the query twice as a derived table:
SELECT t1.user_id, language_id, t1.answers
FROM (SELECT t1.user_id, t2.to_language_id AS language_id,
COUNT(t2.to_language_id) as answers
FROM translation_results as t1
LEFT JOIN translations as t2
ON t2.translation_id = t1.translation_id
GROUP BY t2.to_language_id, t1.user_id ) t1
INNER JOIN (
SELECT user_id, MAX(answers) AS answers
FROM (SELECT t1.user_id, t2.to_language_id,
COUNT(t2.to_language_id) as answers
FROM translation_results as t1
LEFT JOIN translations as t2
ON t2.translation_id = t1.translation_id
GROUP BY t2.to_language_id, t1.user_id
) t
GROUP BY user_id ) t2
ON t1.user_id = t2.user_id AND t1.answers = t2.answers
This query does not have the limitation of the previous one, but it may be less efficient.
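On engines that support common table expressions (MySQL 8.0+, SQLite 3.8.3+), the aggregation only has to be written once. The following is a sketch of the same join-on-maximum idea, run against SQLite through Python's `sqlite3` module. The CTE name `counts`, the alias `m`, and the sample rows are hypothetical. Whether the engine evaluates the CTE once or twice is up to its optimizer, so this is mainly a readability gain.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE translations (
        translation_id INTEGER PRIMARY KEY,
        from_phrase_id INTEGER,
        to_language_id INTEGER
    );
    CREATE TABLE translation_results (
        result_id      INTEGER PRIMARY KEY AUTOINCREMENT,
        translation_id INTEGER,
        user_id        INTEGER
    );
    -- Hypothetical data: translations 2 and 3 both target language 2.
    INSERT INTO translations VALUES (1, 10, 1), (2, 10, 2), (3, 11, 2);
    INSERT INTO translation_results (translation_id, user_id) VALUES
        (1, 1), (2, 1), (3, 1),  -- user 1: language 1 once, language 2 twice
        (1, 2), (2, 2);          -- user 2: languages 1 and 2 once each (a tie)
""")

# The per-user, per-language counts are defined once as a CTE and then
# joined to their own per-user maximum.
TOP_LANGUAGE_SQL = """
    WITH counts AS (
        SELECT t1.user_id,
               t2.to_language_id AS language_id,
               COUNT(t2.to_language_id) AS answers
        FROM translation_results AS t1
        LEFT JOIN translations AS t2
            ON t2.translation_id = t1.translation_id
        GROUP BY t2.to_language_id, t1.user_id
    )
    SELECT c.user_id, c.language_id, c.answers
    FROM counts AS c
    INNER JOIN (
        SELECT user_id, MAX(answers) AS answers
        FROM counts
        GROUP BY user_id
    ) AS m
        ON c.user_id = m.user_id AND c.answers = m.answers
    ORDER BY c.user_id, c.language_id
"""

top_languages = conn.execute(TOP_LANGUAGE_SQL).fetchall()
print(top_languages)
```

Because the join matches on the maximum count rather than picking a single row, user 2's tied languages are both returned.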
Answer 1 (score: 0)
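If I understand your question correctly, you should define a temporary table or a derived table from the result of your subquery, let's call it `sub_selected`, and then do this: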
SELECT t1.user_id, t1.to_language_id, t1.answers
FROM sub_selected AS t1
WHERE t1.answers =
(SELECT MAX(t2.answers)
FROM sub_selected AS t2
-- correlate on user_id only, so MAX() covers all of that user's languages
WHERE t2.user_id = t1.user_id)
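The correlated-subquery idea can be checked with a small sketch, here run against SQLite through Python's `sqlite3` module. The `sub_selected` table is filled with the counts from the question, and the column name `to_language_id` is an assumption. Note that the subquery has to correlate on `user_id` only: if it also matches on the language column, each row is compared only with itself and every row comes back.

```python
import sqlite3

# Counts from the question's sub-table: (user_id, to_language_id, answers).
ROWS = [(1, 1, 1), (2, 1, 1), (1, 2, 5), (2, 2, 2), (1, 4, 3), (1, 5, 1)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sub_selected (user_id INTEGER, to_language_id INTEGER, answers INTEGER)"
)
conn.executemany("INSERT INTO sub_selected VALUES (?, ?, ?)", ROWS)

# Correlating on user_id only: MAX() ranges over all of that user's
# languages, so only the top language(s) per user survive the filter.
PER_USER_MAX_SQL = """
    SELECT t1.user_id, t1.to_language_id, t1.answers
    FROM sub_selected AS t1
    WHERE t1.answers = (
        SELECT MAX(t2.answers)
        FROM sub_selected AS t2
        WHERE t2.user_id = t1.user_id
    )
    ORDER BY t1.user_id, t1.to_language_id
"""

# Correlating on user_id AND language: MAX() sees a single row each time,
# so the filter is always true and nothing is collapsed.
PER_PAIR_MAX_SQL = """
    SELECT t1.user_id, t1.to_language_id, t1.answers
    FROM sub_selected AS t1
    WHERE t1.answers = (
        SELECT MAX(t2.answers)
        FROM sub_selected AS t2
        WHERE t2.user_id = t1.user_id
          AND t2.to_language_id = t1.to_language_id
    )
    ORDER BY t1.user_id, t1.to_language_id
"""

per_user = conn.execute(PER_USER_MAX_SQL).fetchall()
per_pair = conn.execute(PER_PAIR_MAX_SQL).fetchall()
print(per_user)
print(len(per_pair))
```

Like the join on the maximum, this form returns all tied languages for a user rather than just one.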