I have 2 MySQL tables:
PAGES:
id INT,
name VARCHAR,
view_count INT
PAGE_VIEWS:
page_id INT,
viewed_at DATETIME,
ip VARCHAR(15),
processed INT(1)
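As a rough sketch, the two tables could be declared as below. The `VARCHAR(255)` length for `name` is an assumption (no length is given above), and keys and indexes are left out because none are specified.

```sql
-- Sketch of the two tables described above.
-- Assumption: VARCHAR(255) for `name`; the original gives no length.
CREATE TABLE pages (
    id INT,
    name VARCHAR(255),
    view_count INT
);

CREATE TABLE page_views (
    page_id INT,
    viewed_at DATETIME,
    ip VARCHAR(15),
    processed INT(1)  -- 0 = not yet counted, 1 = already counted
);
```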
When someone views a page, a new record flagged with processed = 0 is added to the page_views table.
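For illustration, recording a single view might look like the statement below. The page id and IP address are made-up sample values, and using `NOW()` for the timestamp is an assumption.

```sql
-- Hypothetical insert for one page view; 2832 and 127.0.0.1 are sample values.
INSERT INTO page_views (page_id, viewed_at, ip, processed)
VALUES (2832, NOW(), '127.0.0.1', 0);  -- processed = 0 marks the row as not yet counted
```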
Every N minutes, the unprocessed records should be fetched, counted, and added to the view_count attribute of the pages table.
Without transactions and locking, it looks like this:
SELECT page_id, COUNT(ip) AS cnt FROM page_views WHERE processed = 0 GROUP BY page_id;
UPDATE page_views SET processed = 1 WHERE processed = 0;
UPDATE pages SET view_count = view_count + ...
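But in this case, new records can be added to page_views between the first two queries. Those records would be marked as processed without ever being counted, so we should lock the page_views table for writing.
There is another possible failure case, though: the third query fails for some reason after the processed flag has already been updated in the page_views table, so those records will not be counted next time.
This means we should also use a transaction.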
So the algorithm should be as follows:
-- lock `page_views` for writing
-- start transaction
SELECT page_id, COUNT(ip) AS cnt FROM page_views WHERE processed = 0 GROUP BY page_id;
UPDATE page_views SET processed = 1 WHERE processed = 0;
-- unlock `page_views`
UPDATE pages SET view_count = view_count + ...
-- commit transaction
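But according to the MySQL documentation, LOCK TABLES is not transaction-safe, so I'm looking for the right way to handle this situation.
In reply to the comments:
Here is an example of why using only a transaction does not work in this case: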
mysql_1> SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
mysql_1> START TRANSACTION;
mysql_1> SELECT * FROM page_views;
+---------+------------+-----------+-----------+
| page_id | date | ip | processed |
+---------+------------+-----------+-----------+
| 2832 | 2015-09-10 | 127.0.0.1 | 1 |
| 2832 | 2015-09-11 | 127.0.0.1 | 1 |
+---------+------------+-----------+-----------+
mysql_2> INSERT INTO page_views SET page_id = 2832, date='2015-09-12', ip='127.0.0.1', processed=0;
mysql_1> UPDATE page_views SET processed = 1 WHERE processed = 0;
mysql_1> COMMIT;
mysql_1> SELECT * FROM page_views;
+---------+------------+-----------+-----------+
| page_id | date | ip | processed |
+---------+------------+-----------+-----------+
| 2832 | 2015-09-10 | 127.0.0.1 | 1 |
| 2832 | 2015-09-11 | 127.0.0.1 | 1 |
| 2832 | 2015-09-12 | 127.0.0.1 | 1 | <-- this should be 0
+---------+------------+-----------+-----------+