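I have a table that holds data about attendance at certain events. Every time a user sends a new attendance, a row with the attendance data is added to the table. The information looks like this: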
mysql> SELECT id_branch_channel, id_member, attendance, timestamp, id_member FROM view_event_attendance WHERE id_event = 782;
+-------------------+-----------+------------+------------+-----------+
| id_branch_channel | id_member | attendance | timestamp  | id_member |
+-------------------+-----------+------------+------------+-----------+
|              1326 |    131327 |        459 | 1363208604 |    131327 |
|              1326 |    131327 |        123 | 1363208504 |    131327 |
|              1326 |    131327 |          1 | 1363208459 |    131327 |
|              1326 |     93086 |          0 |       NULL |     93086 |
|              1326 |     93087 |          0 |       NULL |     93087 |
|              1326 |     93088 |          0 |       NULL |     93088 |
|              1326 |     93093 |          0 |       NULL |     93093 |
|              1326 |     99113 |          0 |       NULL |     99113 |
|              1326 |     99135 |          0 |       NULL |     99135 |
|              1326 |     99199 |          0 |       NULL |     99199 |
|              1326 |     99200 |          0 |       NULL |     99200 |
|              1326 |    131324 |          0 |       NULL |    131324 |
|              1326 |     85850 |          0 |       NULL |     85850 |
|              1326 |     93085 |          0 |       NULL |     93085 |
+-------------------+-----------+------------+------------+-----------+
14 rows in set (0.00 sec)
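(This is actually a view, which is why some fields are NULL.)
I can group by id_member so that I get only one row per member (that is, only the last attendance the user sent). However, when I do that, I get the first attendance the user sent rather than the last one.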
mysql> SELECT id_branch_channel, id_member, attendance, timestamp, id_member FROM view_event_attendance WHERE id_event = 782 GROUP BY id_event,id_member;
+-------------------+-----------+------------+------------+-----------+
| id_branch_channel | id_member | attendance | timestamp  | id_member |
+-------------------+-----------+------------+------------+-----------+
|              1326 |    131327 |          1 | 1363208459 |    131327 |
|              1326 |     93086 |          0 |       NULL |     93086 |
|              1326 |    131324 |          0 |       NULL |    131324 |
|              1326 |     93087 |          0 |       NULL |     93087 |
|              1326 |     93088 |          0 |       NULL |     93088 |
|              1326 |     93093 |          0 |       NULL |     93093 |
|              1326 |     99113 |          0 |       NULL |     99113 |
|              1326 |     99135 |          0 |       NULL |     99135 |
|              1326 |     85850 |          0 |       NULL |     85850 |
|              1326 |     99199 |          0 |       NULL |     99199 |
|              1326 |     93085 |          0 |       NULL |     93085 |
|              1326 |     99200 |          0 |       NULL |     99200 |
+-------------------+-----------+------------+------------+-----------+
12 rows in set (0.00 sec)
I have already tried adding ORDER BY clauses, but they have no effect at all... Any ideas?
Thanks in advance!
Edit: this is the script that creates the view
CREATE OR REPLACE VIEW view_event_attendance
AS
SELECT
tbl_event.id_event,
tbl_member_event.id_member,
tbl_event.id_branch_channel,
tbl_member_event_attendance.id_member_event_attendance,
IF(ISNULL(tbl_member_event_attendance.attendance), 0, tbl_member_event_attendance.attendance) AS attendance,
tbl_member_event_attendance.timestamp
FROM
tbl_event
INNER JOIN
tbl_member_event ON tbl_member_event.id_event = tbl_event.id_event
LEFT OUTER JOIN
tbl_member_event_attendance ON tbl_member_event_attendance.id_member_event = tbl_member_event.id_member_event
ORDER BY
tbl_member_event_attendance.timestamp DESC;
Edit 2:
Thanks a lot, MichaelBenjamin, but the problem with using a subquery is the size of the view:
mysql> DESCRIBE SELECT id_branch_channel, id_member, attendance, timestamp, id_member
-> FROM (select * from view_event_attendance order by timestamp desc) as whatever
-> WHERE id_event = 782
-> GROUP BY id_event,id_member;
+----+-------------+-----------------------------+--------+-----------------+-----------------+---------+------------------------------------------------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------------------+--------+-----------------+-----------------+---------+------------------------------------------------+-------+----------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 16755 | Using where; Using temporary; Using filesort |
| 2 | DERIVED | tbl_member_event | index | id_event | id_event | 8 | NULL | 16346 | Using index; Using temporary; Using filesort |
| 2 | DERIVED | tbl_event | eq_ref | PRIMARY | PRIMARY | 4 | video_staging.tbl_member_event.id_event | 1 | |
| 2 | DERIVED | tbl_member_event_attendance | ref | id_event_member | id_event_member | 4 | video_staging.tbl_member_event.id_member_event | 1 | Using index |
+----+-------------+-----------------------------+--------+-----------------+-----------------+---------+------------------------------------------------+-------+----------------------------------------------+
4 rows in set (0.08 sec)
As you can see, my tables have a lot of rows, so I would rather not use a subquery...
Edit 3:
But adding a WHERE inside the subquery looks better...
mysql> DESCRIBE SELECT id_branch_channel, id_member, attendance, timestamp, id_member
-> FROM (select * from view_event_attendance where id_event = 782 order by timestamp desc) as whatever
-> WHERE id_event = 782
-> GROUP BY id_event,id_member;
+----+-------------+-----------------------------+-------+-----------------+-----------------+---------+------------------------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------------------+-------+-----------------+-----------------+---------+------------------------------------------------+------+----------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 14 | Using where; Using temporary; Using filesort |
| 2 | DERIVED | tbl_event | const | PRIMARY | PRIMARY | 4 | | 1 | Using temporary; Using filesort |
| 2 | DERIVED | tbl_member_event | ref | id_event | id_event | 4 | | 12 | Using index |
| 2 | DERIVED | tbl_member_event_attendance | ref | id_event_member | id_event_member | 4 | video_staging.tbl_member_event.id_member_event | 1 | Using index |
+----+-------------+-----------------------------+-------+-----------------+-----------------+---------+------------------------------------------------+------+----------------------------------------------+
4 rows in set (0.01 sec)
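If I can't find anything else that avoids a subquery, I think I'll choose this one as the answer...
Edit 4:
After seeing the comments on the answers, I decided to accept a different answer. Here is the DESCRIBE output for the two queries; I think it makes clear which solution is best: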
mysql> DESCRIBE SELECT
-> id_branch_channel,
-> id_member,
-> attendance,
-> timestamp,
-> id_member
-> FROM view_event_attendance AS t1
-> WHERE id_event = 782
-> AND timestamp = (SELECT MAX(timestamp)
-> FROM view_event_attendance AS t2
-> WHERE t1.id_member = t2.id_member
-> AND t1.id_event = t2.id_event
-> GROUP BY id_event, id_member)
-> OR timestamp IS NULL
-> GROUP BY id_event, id_member;
+----+--------------------+-----------------------------+--------+--------------------+--------------------------+---------+------------------------------------------------+------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-----------------------------+--------+--------------------+--------------------------+---------+------------------------------------------------+------+-----------------------------------------------------------+
| 1 | PRIMARY | tbl_event | index | PRIMARY | id_member_branch_channel | 4 | NULL | 208 | Using index; Using temporary; Using filesort |
| 1 | PRIMARY | tbl_member_event | ref | id_event | id_event | 4 | video_staging.tbl_event.id_event | 64 | Using index |
| 1 | PRIMARY | tbl_member_event_attendance | ref | id_event_member | id_event_member | 4 | video_staging.tbl_member_event.id_member_event | 1 | Using where; Using index |
| 2 | DEPENDENT SUBQUERY | tbl_event | eq_ref | PRIMARY | PRIMARY | 4 | func | 1 | Using where; Using index; Using temporary; Using filesort |
| 2 | DEPENDENT SUBQUERY | tbl_member_event | eq_ref | id_event,id_member | id_event | 8 | video_staging.tbl_event.id_event,func | 1 | Using where; Using index |
| 2 | DEPENDENT SUBQUERY | tbl_member_event_attendance | ref | id_event_member | id_event_member | 4 | video_staging.tbl_member_event.id_member_event | 1 | Using where; Using index |
+----+--------------------+-----------------------------+--------+--------------------+--------------------------+---------+------------------------------------------------+------+-----------------------------------------------------------+
6 rows in set (0.00 sec)
mysql> DESCRIBE SELECT *
-> FROM (SELECT id_branch_channel, id_member, attendance, timestamp, id_event
-> FROM view_event_attendance
-> WHERE id_event = 782
-> ORDER BY timestamp desc
-> ) as whatever
-> GROUP BY id_event,id_member;
+----+-------------+-----------------------------+-------+-----------------+-----------------+---------+------------------------------------------------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------------------+-------+-----------------+-----------------+---------+------------------------------------------------+------+---------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 14 | Using temporary; Using filesort |
| 2 | DERIVED | tbl_event | const | PRIMARY | PRIMARY | 4 | | 1 | Using temporary; Using filesort |
| 2 | DERIVED | tbl_member_event | ref | id_event | id_event | 4 | | 12 | Using index |
| 2 | DERIVED | tbl_member_event_attendance | ref | id_event_member | id_event_member | 4 | video_staging.tbl_member_event.id_member_event | 1 | Using index |
+----+-------------+-----------------------------+-------+-----------------+-----------------+---------+------------------------------------------------+------+---------------------------------+
4 rows in set (0.00 sec)
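Answer 0 (score: 7)
Use a simple GROUP BY id_member, but select:
substring(max(concat(from_unixtime(timestamp),attendance)) from 20) as attendance
This attaches the attendance to the timestamp for every row in the group, so that max() can pick out the desired timestamp/attendance pair, and then extracts just the attendance.
concat() returns the 19 characters of the formatted timestamp (YYYY-mm-dd HH:MM:SS) with the attendance appended starting at character 20; substring(... from 20) takes just the attendance from the (string-wise) maximum for the group. You can run
select concat(from_unixtime(timestamp),attendance), timestamp, attendance
on its own, without the GROUP BY, to get a better idea of how max() ends up picking the right attendance.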
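To show the expression in context, here is a minimal sketch of a complete query against the view from the question. Two details are assumptions rather than part of the answer: the COALESCE fallback (concat() returns NULL when timestamp is NULL, so members without any attendance row would otherwise get NULL instead of the view's default 0), and grouping by id_branch_channel as well so the statement is also accepted when ONLY_FULL_GROUP_BY is enabled. It further assumes timestamp is stored as an integer, so that from_unixtime() yields exactly 19 characters.

```sql
-- Sketch only: latest attendance per member for one event, no subquery.
SELECT
    id_branch_channel,
    id_member,
    -- MAX() compares the concatenated strings; the fixed-width timestamp
    -- prefix makes the string order match the chronological order.
    COALESCE(
        CAST(SUBSTRING(MAX(CONCAT(FROM_UNIXTIME(timestamp), attendance)) FROM 20) AS UNSIGNED),
        0  -- assumption: members with no attendance row should report 0
    ) AS attendance,
    MAX(timestamp) AS timestamp
FROM view_event_attendance
WHERE id_event = 782
GROUP BY id_branch_channel, id_member;
```

For the sample rows in the question, this should return attendance 459 with timestamp 1363208604 for member 131327, and 0 with a NULL timestamp for the other members.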
Answer 1 (score: 3)
SELECT id_branch_channel, id_member, attendance, timestamp, id_member
FROM (select * from view_event_attendance order by timestamp desc) as whatever
WHERE id_event = 782
GROUP BY id_event,id_member;
Edit: this may perform better:
SELECT *
FROM (SELECT id_branch_channel, id_member, attendance, timestamp, id_event
FROM view_event_attendance
WHERE id_event = 782
ORDER BY timestamp desc
) as whatever
GROUP BY id_event,id_member;
As long as the result set fits in the Innodb_buffer_pool, there will be no noticeable performance penalty.
Answer 2 (score: 2)
I see answers that use JOINs and subqueries, but I believe a simple HAVING clause does the trick:
SELECT
id_branch_channel,
id_member,
attendance,
timestamp,
id_member
FROM view_event_attendance
WHERE id_event = 782
GROUP BY id_event, id_member
HAVING MAX(timestamp) OR timestamp IS NULL;
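Edit: added a check for IS NULL in case you also want to include those rows.
Edit 2: do you need to group by id_event at all when you have already filtered down to one event?
Edit 3: not sure why the downvote; this SQL Fiddle shows it works.
Edit 4: I must apologize. @ysth is right, the SQL Fiddle does not work correctly. I deserved the -1, but when you downvote, please at least explain why, so that I can learn from it too.
The following works, but unfortunately it again uses a subquery and will not perform better than the other solutions posted here.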
SELECT
id_branch_channel,
id_member,
attendance,
timestamp,
id_member
FROM view_event_attendance AS t1
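WHERE id_event = 782
-- Parentheses are required: AND binds tighter than OR, so without them
-- rows with a NULL timestamp from every event would pass the filter.
AND (timestamp = (SELECT MAX(timestamp)
FROM view_event_attendance AS t2
WHERE t1.id_member = t2.id_member
AND t1.id_event = t2.id_event
GROUP BY id_event, id_member)
OR timestamp IS NULL)
GROUP BY id_event, id_member;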
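Since the question asks for something without a subquery, a self anti-join is one commonly used alternative. It is not part of the original answer, and the aliases a and b are chosen here only for illustration. The idea is to keep a row only when no other row for the same member and event has a later timestamp.

```sql
-- Sketch only: keep rows for which no later row exists (anti-join).
SELECT a.id_branch_channel, a.id_member, a.attendance, a.timestamp
FROM view_event_attendance AS a
LEFT JOIN view_event_attendance AS b
       ON b.id_event  = a.id_event
      AND b.id_member = a.id_member
      AND b.timestamp > a.timestamp   -- a strictly later row for the same member
WHERE a.id_event = 782
  AND b.id_member IS NULL;            -- no later row was found
```

Rows with a NULL timestamp never match the join condition, so members without attendance are kept. If two rows of one member share the same latest timestamp, both are returned. Whether this performs better than the correlated subquery on this view would have to be checked with DESCRIBE.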
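Answer 3 (score: 2)
SUBSTRING_INDEX(SUBSTRING_INDEX(group_concat(%requiredfield%),',',count(*)),',',-1)
This gets the last value of the "required field" from any group_concat; if the rows are unsorted, it will by default be the last value in the table.
group_concat_ws could be used to account for possibly empty fields.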
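The expression in this answer relies on whatever order the rows happen to reach GROUP_CONCAT in. A sketch of a variant that states the order explicitly is shown below; it uses the ORDER BY clause that GROUP_CONCAT accepts, then takes the first element instead of the last. Applying it to the attendance column of the question's view is an assumption about how the placeholder %requiredfield% would be filled in.

```sql
-- Sketch only: newest attendance first inside the concatenated list,
-- then cut the list at the first comma.
SELECT
    id_member,
    SUBSTRING_INDEX(
        GROUP_CONCAT(attendance ORDER BY timestamp DESC),
        ',',
        1
    ) AS attendance
FROM view_event_attendance
WHERE id_event = 782
GROUP BY id_member;
```

GROUP_CONCAT truncates its result at group_concat_max_len, which does not matter here because only the first element is used. The result is a string, so cast it if a numeric column is needed.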
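Answer 4 (score: 1)
Here is one option (untested):
SELECT v.id_branch_channel, v.id_member, v.attendance, v.timestamp, v.id_member
FROM view_event_attendance v
JOIN (
-- Latest timestamp per event and member (NULL when there is no attendance row)
SELECT id_event, id_member, MAX(timestamp) maxtimestamp
FROM view_event_attendance
GROUP BY id_event, id_member ) m ON
v.id_event = m.id_event AND
v.id_member = m.id_member AND
-- <=> is MySQL's NULL-safe equality, so members without attendance still match
v.timestamp <=> m.maxtimestamp
WHERE v.id_event = 782
GROUP BY v.id_member;
The concept is to get the MAX() of the timestamp and use that field to JOIN back to your view. You may not need all of the fields; it really depends on your table structure. But this should get you headed in the right direction.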
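Answer 5 (score: -1)
One way to do this is with a window function and a subquery. If you add row_number() over (partition by id_member order by timestamp desc) to the select list, it resolves to a number that ranks the rows by timestamp within each id_member group (1 being the most recent row, because the ordering is descending). If that doesn't make sense, run it and it will be clear. You can then select from that subquery where the extra column = 1, which picks only the row with the highest timestamp in each group.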
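Written out against the view from the question, the approach might look like the sketch below. Window functions need MySQL 8.0 or later, which the question does not confirm is available; the alias names rn and ranked are made up for this example.

```sql
-- Sketch only: number each member's rows from newest to oldest,
-- then keep row number 1.
SELECT id_branch_channel, id_member, attendance, timestamp
FROM (
    SELECT
        id_branch_channel,
        id_member,
        attendance,
        timestamp,
        ROW_NUMBER() OVER (
            PARTITION BY id_member
            ORDER BY timestamp DESC
        ) AS rn
    FROM view_event_attendance
    WHERE id_event = 782        -- filter early so only one event is ranked
) AS ranked
WHERE rn = 1;
```

Members whose only row has a NULL timestamp still get rn = 1, so they stay in the result with attendance 0.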