Suppose I have a MySQL table named questions with the following layout and data:
id user_id answers created
1 1 35 <unix_timestamp>
2 1 30 <unix_timestamp>
3 1 25 <unix_timestamp>
4 2 20 <unix_timestamp>
5 2 15 <unix_timestamp>
6 3 10 <unix_timestamp>
7 4 9 <unix_timestamp>
8 5 8 <unix_timestamp>
9 6 7 <unix_timestamp>
10 7 6 <unix_timestamp>
Right now I run a simple query to get the 5 most-answered questions from the past two days:
SELECT * FROM `questions`
WHERE `created` > UNIX_TIMESTAMP()-86400*2
ORDER BY `answers` DESC
LIMIT 5;
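It works fine, but in some cases the result contains questions from only one or two users. This happens when a very popular person asks their followers a lot of questions and gets answers within the two days. I now need to change the query so that each user contributes at most one result.
In other words, the result for the table above is currently: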
id user_id answers created
1 1 35 <unix_timestamp>
2 1 30 <unix_timestamp>
3 1 25 <unix_timestamp>
4 2 20 <unix_timestamp>
5 2 15 <unix_timestamp>
I need to change the query so that it returns the following instead:
id user_id answers created
1 1 35 <unix_timestamp>
4 2 20 <unix_timestamp>
6 3 10 <unix_timestamp>
7 4 9 <unix_timestamp>
8 5 8 <unix_timestamp>
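I have tried a few things I found on the internet, but nothing worked for me. I am not even sure whether I need GROUP BY, a join, a subquery, or something else entirely.
Answer 0 (score: 2)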
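We can use GROUP BY to get one row per user. A minimal sketch of that idea, assuming each user's highest answers value is what is wanted (it returns only user_id and that maximum, not the full row):
SELECT `user_id`, MAX(`answers`) AS `max_answers`
FROM `questions`
WHERE `created` > UNIX_TIMESTAMP()-86400*2
GROUP BY `user_id`
ORDER BY `max_answers` DESC
LIMIT 5;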
Here is a sample SQL Fiddle.
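To check the GROUP BY idea against the sample data from the question, here is a minimal, hedged sketch in Python. It uses the standard-library sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained), sets every created value to the current time, and passes the two-day cutoff as a parameter instead of calling UNIX_TIMESTAMP().

```python
import sqlite3
import time

now = int(time.time())
# (id, user_id, answers) from the question; "created" is assumed to be now.
rows = [(1, 1, 35), (2, 1, 30), (3, 1, 25), (4, 2, 20), (5, 2, 15),
        (6, 3, 10), (7, 4, 9), (8, 5, 8), (9, 6, 7), (10, 7, 6)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE questions ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, answers INTEGER, created INTEGER)"
)
conn.executemany(
    "INSERT INTO questions VALUES (?, ?, ?, ?)",
    [(i, u, a, now) for i, u, a in rows],
)

# One row per user: the highest answer count within the last two days.
top_per_user = conn.execute(
    """
    SELECT user_id, MAX(answers) AS max_answers
    FROM questions
    WHERE created > ?
    GROUP BY user_id
    ORDER BY max_answers DESC
    LIMIT 5
    """,
    (now - 86400 * 2,),
).fetchall()

print(top_per_user)
```

This gives one (user_id, max_answers) pair per user. Getting the matching id and created values takes a join back to the table or a row-number approach, as in the other answers.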
Answer 1 (score: 1)
Use user variables to emulate a row number per user:
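SELECT id, user_id, answers, created FROM (
    SELECT
        id, user_id, answers, created,
        -- restart the counter at 1 whenever user_id changes
        @rn := IF(@user_id = user_id, @rn + 1, 1) AS rn,
        @user_id := user_id AS x
    FROM questions
    WHERE created > UNIX_TIMESTAMP() - 86400 * 2
    ORDER BY user_id, answers DESC
) AS y
WHERE rn <= 1          -- keep only each user's most-answered question
ORDER BY answers DESC  -- then rank those questions against each other
LIMIT 5;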
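On MySQL 8.0 and later, the same per-user numbering can be written with the ROW_NUMBER() window function instead of user variables, whose evaluation order inside a single SELECT is not guaranteed. The sketch below runs that query through Python's sqlite3 module as a stand-in for MySQL. That substitution is an assumption for the sake of a runnable example, and it needs an SQLite build of 3.25 or newer for window functions. The sample rows come from the question, with created assumed to be the current time.

```python
import sqlite3
import time

now = int(time.time())
# (id, user_id, answers) from the question; "created" is assumed to be now.
rows = [(1, 1, 35), (2, 1, 30), (3, 1, 25), (4, 2, 20), (5, 2, 15),
        (6, 3, 10), (7, 4, 9), (8, 5, 8), (9, 6, 7), (10, 7, 6)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE questions ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, answers INTEGER, created INTEGER)"
)
conn.executemany(
    "INSERT INTO questions VALUES (?, ?, ?, ?)",
    [(i, u, a, now) for i, u, a in rows],
)

# Number each user's questions from most to least answered (id breaks ties),
# keep row 1 per user, then take the overall top 5.
top_questions = conn.execute(
    """
    SELECT id, user_id, answers
    FROM (
        SELECT id, user_id, answers,
               ROW_NUMBER() OVER (
                   PARTITION BY user_id ORDER BY answers DESC, id
               ) AS rn
        FROM questions
        WHERE created > ?
    ) AS ranked
    WHERE rn = 1
    ORDER BY answers DESC
    LIMIT 5
    """,
    (now - 86400 * 2,),
).fetchall()

print(top_questions)
```

Because ROW_NUMBER() assigns exactly one row the value 1 in each partition, a user with two equally popular questions still appears only once.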
Answer 2 (score: 0)
Use a sub-SELECT to filter out every row that does not hold its user's maximum answers value:
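select q.*
from questions q
join (
    -- highest answer count per user within the last two days
    select user_id, max(answers) as max_answers
    from questions
    where created > UNIX_TIMESTAMP() - 86400 * 2
    group by user_id
) r on q.user_id = r.user_id and q.answers = r.max_answers
where q.created > UNIX_TIMESTAMP() - 86400 * 2
order by q.answers desc
limit 5;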
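One caveat with joining on the maximum: if a user has two questions tied for the highest answers value, both rows match and that user shows up twice. The sketch below reproduces this with a small, made-up dataset (not the question's data) and shows one possible fix, a correlated subquery that picks a single id per user. It runs on Python's sqlite3 module as a stand-in for MySQL, which is an assumption, and it leaves out the time filter for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE questions ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, answers INTEGER)"
)
# Hypothetical data: user 1 has two questions tied at 35 answers.
conn.executemany(
    "INSERT INTO questions VALUES (?, ?, ?)",
    [(1, 1, 35), (2, 1, 35), (3, 2, 20)],
)

# Join on the per-user maximum: the tie makes user 1 appear twice.
with_ties = conn.execute(
    """
    SELECT q.id, q.user_id, q.answers
    FROM questions q
    JOIN (
        SELECT user_id, MAX(answers) AS max_answers
        FROM questions
        GROUP BY user_id
    ) r ON q.user_id = r.user_id AND q.answers = r.max_answers
    ORDER BY q.answers DESC, q.id
    """
).fetchall()

# Pick exactly one id per user: highest answers first, lowest id on a tie.
one_per_user = conn.execute(
    """
    SELECT q.id, q.user_id, q.answers
    FROM questions q
    WHERE q.id = (
        SELECT q2.id
        FROM questions q2
        WHERE q2.user_id = q.user_id
        ORDER BY q2.answers DESC, q2.id
        LIMIT 1
    )
    ORDER BY q.answers DESC
    """
).fetchall()

print(with_ties)
print(one_per_user)
```

Breaking ties on the lowest id is an arbitrary choice here; ordering the subquery by created instead would favor the newest or oldest question.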