This is my table:
+------+------+-------------------+
| user | item | date_time         |
+------+------+-------------------+
| 10   | 01   | 10-10-10 20:10:05 |
| 10   | 02   | 10-10-10 20:10:10 |
| 10   | 03   | 10-10-10 20:10:10 |
| 20   | 02   | 10-10-10 20:15:10 |
| 20   | 02   | 10-10-10 20:20:10 |
| 30   | 10   | 10-10-10 20:01:10 |
| 30   | 20   | 10-10-10 20:01:20 |
| 30   | 30   | 10-10-10 20:05:20 |
+------+------+-------------------+
I want to run a query that returns the users who took more than one item within a 1-minute interval, like this:
+------+------+-------------------+
| user | item | date_time         |
+------+------+-------------------+
| 10   | 01   | 10-10-10 20:10:05 |
| 10   | 02   | 10-10-10 20:10:10 |
| 10   | 03   | 10-10-10 20:10:10 |
| 30   | 10   | 10-10-10 20:01:10 |
| 30   | 20   | 10-10-10 20:01:20 |
+------+------+-------------------+
How can I do this?
And what if I only want to show the users that appear 2 or more times in this output?
Example:
+------+------+-------------------+
| user | item | date_time         |
+------+------+-------------------+
| 10   | 01   | 10-10-10 20:10:05 |
| 10   | 02   | 10-10-10 20:10:10 |
| 10   | 03   | 10-10-10 20:10:10 |
+------+------+-------------------+
Answer 0 (score: 1)
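You have to join the table to itself (let's call the table aliases T1 and T2). Then write a WHERE clause that keeps only the rows where T1.user equals T2.user and the absolute value of the difference between T1.date_time and T2.date_time is less than one minute.
The problem, though, is that every row would be selected: your table has no primary key, so there is no way to detect a row that has been joined to itself. Create a primary key (an auto-numbered sequence will work fine) and add the condition T1.id <> T2.id to the WHERE clause.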
So, in (untested) code form:
SELECT *
FROM stuff T1, stuff T2
WHERE T1.user = T2.user
AND ABS(UNIX_TIMESTAMP(T1.date_time) - UNIX_TIMESTAMP(T2.date_time)) < 60
AND T1.id <> T2.id;
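To try the self-join idea without a MySQL server, here is a minimal sketch that runs an equivalent query against an in-memory SQLite database from Python. Several things here are assumptions rather than part of the answer: the dates in the question are read as 2010-10-10, SQLite's strftime('%s', ...) stands in for MySQL's UNIX_TIMESTAMP(), the function name rows_within_a_minute is made up for illustration, and DISTINCT is added so that a row matching several partners is listed only once.

```python
import sqlite3

# Sample rows from the question; the dates are assumed to mean 2010-10-10.
ROWS = [
    (10, "01", "2010-10-10 20:10:05"),
    (10, "02", "2010-10-10 20:10:10"),
    (10, "03", "2010-10-10 20:10:10"),
    (20, "02", "2010-10-10 20:15:10"),
    (20, "02", "2010-10-10 20:20:10"),
    (30, "10", "2010-10-10 20:01:10"),
    (30, "20", "2010-10-10 20:01:20"),
    (30, "30", "2010-10-10 20:05:20"),
]


def rows_within_a_minute():
    """Return rows that share a user with another row less than 60 s away."""
    conn = sqlite3.connect(":memory:")
    # `id` plays the role of the auto-numbered primary key the answer asks for.
    conn.execute(
        "CREATE TABLE stuff ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "user INTEGER, item TEXT, date_time TEXT)"
    )
    conn.executemany(
        "INSERT INTO stuff (user, item, date_time) VALUES (?, ?, ?)", ROWS
    )
    query = """
        SELECT DISTINCT T1.user, T1.item, T1.date_time
        FROM stuff T1
        JOIN stuff T2
          ON T1.user = T2.user
         AND T1.id <> T2.id  -- never pair a row with itself
         AND ABS(strftime('%s', T1.date_time)
                 - strftime('%s', T2.date_time)) < 60
        ORDER BY T1.user, T1.date_time, T1.item
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in rows_within_a_minute():
        print(row)
```

Without the T1.id <> T2.id condition every row would pair with itself (a time difference of 0 seconds), so the whole table would come back.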
Answer 1 (score: 1)
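Yes, you need to add a primary key (id) as @Will suggested.
To get each item once (and only once), no matter how many matches fall inside the 1-minute window, try a subquery instead of a full join: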
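-- Requires the primary key column `id` suggested by @Will.
SELECT user, item, date_time
FROM my_table t1
WHERE id IN (SELECT t2.id
             FROM my_table t2, my_table t3
             WHERE t2.id <> t3.id AND t2.user = t3.user
               -- Compare in seconds: subtracting DATETIME values directly does not yield seconds in MySQL.
               AND ABS(UNIX_TIMESTAMP(t2.date_time) - UNIX_TIMESTAMP(t3.date_time)) < 60);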
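-- edit -- For your edited question, it depends on what you mean. Do you mean "users who bought 3 or more items within 60 seconds" or "users who appear more than 2 times in the output"? The latter is easy to do. Assuming the result of the query above is saved in a temporary table (or view) "temp1":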
select * from my_table where user in
(select user from temp1 group by user having count(*) > 2);
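The two steps (build "temp1", then keep the users that occur more than twice in it) can be sketched end to end as follows. This is only an illustration under stated assumptions: it uses an in-memory SQLite database from Python instead of MySQL, reads the question's dates as 2010-10-10, replaces UNIX_TIMESTAMP() with SQLite's strftime('%s', ...), and uses a hypothetical function name. Unlike the query above, the final SELECT reads from temp1 rather than my_table, so only rows that are part of the 1-minute output are returned.

```python
import sqlite3

# Sample rows from the question; the dates are assumed to mean 2010-10-10.
ROWS = [
    (10, "01", "2010-10-10 20:10:05"),
    (10, "02", "2010-10-10 20:10:10"),
    (10, "03", "2010-10-10 20:10:10"),
    (20, "02", "2010-10-10 20:15:10"),
    (20, "02", "2010-10-10 20:20:10"),
    (30, "10", "2010-10-10 20:01:10"),
    (30, "20", "2010-10-10 20:01:20"),
    (30, "30", "2010-10-10 20:05:20"),
]


def users_with_more_than_two_hits():
    """Return temp1 rows of users that appear more than twice in temp1."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_table ("
        "id INTEGER PRIMARY KEY, user INTEGER, item TEXT, date_time TEXT)"
    )
    conn.executemany(
        "INSERT INTO my_table (user, item, date_time) VALUES (?, ?, ?)", ROWS
    )
    # Step 1: the "within one minute" result, stored as the view temp1.
    conn.execute("""
        CREATE VIEW temp1 AS
        SELECT user, item, date_time
        FROM my_table
        WHERE id IN (
            SELECT t2.id
            FROM my_table t2
            JOIN my_table t3
              ON t2.id <> t3.id
             AND t2.user = t3.user
             AND ABS(strftime('%s', t2.date_time)
                     - strftime('%s', t3.date_time)) < 60
        )
    """)
    # Step 2: keep only users that occur more than twice in temp1.
    rows = conn.execute("""
        SELECT user, item, date_time
        FROM temp1
        WHERE user IN (
            SELECT user FROM temp1 GROUP BY user HAVING COUNT(*) > 2
        )
        ORDER BY date_time, item
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in users_with_more_than_two_hits():
        print(row)
```

With the sample data, user 30 contributes only two rows to temp1 and is therefore dropped by the HAVING COUNT(*) > 2 filter, while user 10 contributes three rows and is kept.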
Answer 2 (score: 0)
Let's try this:
SELECT * FROM myTable
JOIN (SELECT MAX(date_time) AS maxi, MIN(date_time) AS mini, user AS uid
      FROM myTable
      GROUP BY uid) AS otherTable
ON date_time <= maxi AND date_time >= mini AND user = uid
   AND UNIX_TIMESTAMP(maxi) - UNIX_TIMESTAMP(mini) < 60
GROUP BY uid, date_time