Delete a large number of duplicates in a MySQL database by time difference of x seconds

Time: 2016-09-06 06:49:35

Tags: mysql sql stored-procedures sql-query-store

I have already looked at another question on a similar topic, but it did not solve the problem I currently have with the table below.

action_table(actionid,cookieid,intime,page)

I have the following data:

235470, 994341855.1473047915, 2016-09-05 07:01:57, index.aspx
235471, 994341855.1473047915, 2016-09-05 07:02:00, index.aspx
235472, 994341855.1473047915, 2016-09-05 07:02:02, index.aspx
235473, 994341855.1473047915, 2016-09-05 07:02:12, home.aspx
235474, 994341855.1473047915, 2016-09-05 07:04:12, index.aspx

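A user can refresh the page any number of times, which produces repeated rows like those above, where only the auto-increment column (actionid) and intime differ. I want to get only the following data: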

235470, 994341855.1473047915, 2016-09-05 07:01:57, index.aspx
235473, 994341855.1473047915, 2016-09-05 07:02:12, home.aspx
235474, 994341855.1473047915, 2016-09-05 07:04:12, index.aspx

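In other words, consecutive entries with the same cookie ID and the same page should be treated as duplicates. If any other page was visited between two visits to the same page, the later visit should count as a new entry.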

How can I write a query that selects this? Is there some kind of grouping that would do it? Please help.

2 Answers:

Answer 0: (score: 1)

Schema

create table action_table
(   actionid int not null,
    cookieid decimal(20,10) not null,
    intime datetime not null,
    page varchar(100) not null
)charset=utf8 engine=InnoDB;

insert action_table values
(235470 ,994341855.1473047915, '2016-09-05 07:01:57', 'index.aspx'),
(235471, 994341855.1473047915, '2016-09-05 07:02:00', 'index.aspx'),
(235472, 994341855.1473047915, '2016-09-05 07:02:02', 'index.aspx'),
(235473, 994341855.1473047915, '2016-09-05 07:02:12', 'home.aspx'),
(235474, 994341855.1473047915, '2016-09-05 07:04:12', 'index.aspx');

Query

select actionid,cookieid,intime,page 
from  
(   select actionid,cookieid,intime,page, 
    @num := if(@page = page, 2, 1) as thePage, 
    @page := `page` as dummy 
    from action_table 
    cross join (select @page:='',@num:=0) xParams 
    order by actionid,cookieid,intime,page 
) as x  
where x.thePage=1 
order by actionid,cookieid,intime,page; 
+----------+----------------------+---------------------+------------+
| actionid | cookieid             | intime              | page       |
+----------+----------------------+---------------------+------------+
|   235470 | 994341855.1473047915 | 2016-09-05 07:01:57 | index.aspx |
|   235473 | 994341855.1473047915 | 2016-09-05 07:02:12 | home.aspx  |
|   235474 | 994341855.1473047915 | 2016-09-05 07:04:12 | index.aspx |
+----------+----------------------+---------------------+------------+

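The query uses MySQL user variables inside the derived table x. @num is set to 1 when a row's page differs from the previous row's page and to 2 otherwise; rows where it is 1 (exposed as the column thePage) are selected for the final output.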

The cross join is only there to initialize the variables at the start.
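
MySQL does not guarantee the evaluation order of user-variable assignments within a single SELECT, so the same previous-row comparison can also be sketched without variables, using a correlated subquery. This is only a sketch under two assumptions that are not in the query above: that actionid increases with time for a given cookieid, and that rows should only be compared with earlier rows of the same cookieid.

```sql
-- Keep a row only when its page differs from the page of the
-- immediately preceding row (by actionid) for the same cookieid.
select a.actionid, a.cookieid, a.intime, a.page
from action_table as a
where not (a.page <=> (
        -- page of the previous row for this cookie; NULL if there is none
        select p.page
        from action_table as p
        where p.cookieid = a.cookieid
          and p.actionid < a.actionid
        order by p.actionid desc
        limit 1
      ))
order by a.actionid;
```

The null-safe comparison `<=>` makes the first row of each cookie compare as "different", so it is kept. On the sample data this keeps actionid 235470, 235473 and 235474. On a large table an index on (cookieid, actionid) would likely be needed for the subquery to perform acceptably.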

Answer 1: (score: 0)

Here is a solution that uses the analytic LEAD function in Oracle:

WITH input_data AS (
  SELECT 235470 AS actionid, 994341855.1473047915 AS cookieid, TO_DATE('2016-09-05 07:01:57', 'yyyy-mm-dd HH24:MI:SS') AS intime, 'index.aspx' AS page  FROM DUAL
  UNION ALL
  SELECT 235471 AS actionid, 994341855.1473047915 AS cookieid, TO_DATE('2016-09-05 07:02:00', 'yyyy-mm-dd HH24:MI:SS') AS intime, 'index.aspx' AS page  FROM DUAL
  UNION ALL
  SELECT 235472 AS actionid, 994341855.1473047915 AS cookieid, TO_DATE('2016-09-05 07:02:02', 'yyyy-mm-dd HH24:MI:SS') AS intime, 'index.aspx' AS page  FROM DUAL
  UNION ALL
  SELECT 235473 AS actionid, 994341855.1473047915 AS cookieid, TO_DATE('2016-09-05 07:02:12', 'yyyy-mm-dd HH24:MI:SS') AS intime, 'home.aspx' AS page  FROM DUAL
  UNION ALL
  SELECT 235474 AS actionid, 994341855.1473047915 AS cookieid, TO_DATE('2016-09-05 07:04:12', 'yyyy-mm-dd HH24:MI:SS') AS intime, 'index.aspx' AS page  FROM DUAL
)
SELECT MIN(actionid) AS action_id, cookieid, MIN(intime) AS intime, page
FROM (
  SELECT input_data.*, LEAD(page, 1) OVER (ORDER BY intime) AS next_page 
  FROM input_data
)
WHERE page <> NVL(next_page, 'NULL')
GROUP BY cookieid, page, next_page
ORDER BY MIN(actionid)
;

Output:

ACTION_ID  COOKIEID     INTIME            PAGE
235472     994341855.1  05/09/2016 07:02  index.aspx
235473     994341855.1  05/09/2016 07:02  home.aspx
235474     994341855.1  05/09/2016 07:04  index.aspx
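
Note that the LEAD query keeps the last row of each run of identical pages (235472 in the output above), while the expected result in the question keeps the first one (235470). A possible variant, offered as a sketch rather than a tested replacement, compares each row with the previous one using LAG instead. It assumes the data is read from action_table rather than the inline input_data CTE, and it adds PARTITION BY cookieid, which is not in the original query, so that different visitors are not compared with each other.

```sql
-- Keep the first row of every run of consecutive identical pages per cookie.
SELECT actionid, cookieid, intime, page
FROM (
  SELECT actionid, cookieid, intime, page,
         -- page of the previous row for the same cookie; NULL for the first row
         LAG(page) OVER (PARTITION BY cookieid ORDER BY intime, actionid) AS prev_page
  FROM action_table
) t
WHERE prev_page IS NULL OR prev_page <> page
ORDER BY actionid;
```

On the sample data this returns actionid 235470, 235473 and 235474. The statement avoids dialect-specific functions, so it should also run on MySQL 8.0 or later, where window functions are available.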