MySQL: Minimizing gaps between AUTO_INCREMENT row IDs

Asked: 2014-01-14 15:23:01

Tags: mysql python-multithreading

Because of a race condition and the way MySQL increments its AUTO_INCREMENT counter, I am seeing noticeable gaps between my row ID values.

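Background: a Python script uses multiple threads to collect data that is stored in rows of a MySQL database. Every so often, several threads try to store the same unique row. It appears that every insert increments the AUTO_INCREMENT counter, even when the insert fails with a duplicate-entry exception, and a rollback does not roll the counter back.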
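The behaviour described above can be mimicked with a toy model. This is only a sketch in plain Python, not MySQL itself: the `ToyTable` class, `DuplicateEntry` and `replay` are hypothetical names, and the model simply assumes that an ID is reserved before the uniqueness check and is never handed back when the insert fails.

```python
class DuplicateEntry(Exception):
    """Stands in for MySQLdb.IntegrityError in this toy model."""


class ToyTable(object):
    """Hypothetical in-memory table; NOT how InnoDB is implemented."""

    def __init__(self):
        self.next_id = 1   # plays the role of the AUTO_INCREMENT counter
        self.rows = {}     # test_data -> test_id

    def insert(self, value):
        # Assumption: the id is reserved first and not returned on failure.
        reserved = self.next_id
        self.next_id += 1
        if value in self.rows:
            raise DuplicateEntry(value)
        self.rows[value] = reserved
        return reserved


def replay(table, attempts):
    """Apply inserts in order, ignoring duplicates; return the ids used."""
    ids = []
    for value in attempts:
        try:
            ids.append(table.insert(value))
        except DuplicateEntry:
            pass  # the row already exists, but one id has been consumed
    return ids
```

Replaying an interleaving in which some values are attempted more than once, for example `replay(ToyTable(), ['1080', '1080', '1080', '10-point'])`, leaves holes in the returned IDs wherever a duplicate was rejected.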

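Question: is there a way to avoid the race condition, keep the row IDs consecutive, and minimize the gaps? I realize that "don't worry about the gaps" is one answer. I can think of problematic solutions such as "lock the table before checking" and "spawn another thread to keep track of sequential row IDs", but I'm wondering whether there is a smarter solution.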
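For comparison, the "lock before checking" idea could be sketched at the application level roughly as follows. This is a minimal sketch, not a recommended solution: it assumes every writer runs inside the same program and shares one lock (for a multiprocessing script that would be a `multiprocessing.Lock` passed to each worker). The names `insert_if_absent`, `exists`, `insert` and `demo` are hypothetical, and the two callables stand in for the SELECT and the INSERT plus commit.

```python
import threading


def insert_if_absent(lock, exists, insert, value):
    """Run the check and the insert as one critical section.

    `exists(value)` and `insert(value)` are hypothetical callables that
    would wrap the SELECT and the INSERT (plus commit) in a real script.
    Returns True if this caller inserted the value.
    """
    with lock:
        if exists(value):
            return False
        insert(value)
        return True


def demo(words, n_workers=8):
    """Exercise the helper with threads and an in-memory stand-in table."""
    lock = threading.Lock()
    rows = []  # list position + 1 plays the role of the row id

    def worker():
        for word in words:
            insert_if_absent(lock, rows.__contains__, rows.append, word)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return rows
```

Because the check and the insert happen under the same lock, no worker attempts an insert that is bound to fail, so no ID is spent on a duplicate. The cost is that writers are serialized, and any client that writes without taking the lock can still cause duplicate attempts.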

My actual table is considerably more complex, but here is a simplified test case:

CREATE TABLE `test` (
  `test_id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  `test_thread` SMALLINT UNSIGNED NOT NULL,
  `test_data` varchar(80),
  PRIMARY KEY (`test_id`),
  UNIQUE KEY `test_data_unique` (test_data)
) ENGINE=InnoDB AUTO_INCREMENT=0;

Here is sample code that populates the table:

#!/usr/bin/env python2.7

import os,MySQLdb,multiprocessing

MYSQL_HOST = 'localhost'
MYSQL_USER = ''
MYSQL_DB   = 'packertest'
MYSQL_PASS  = ''
MYSQL_SOCKET = '/var/lib/mysql/mysql.sock'

# numerous worker processes are running simultaneously
def worker():
    # connect to the database
    db = MySQLdb.connect(host=MYSQL_HOST, user=MYSQL_USER, db=MYSQL_DB, passwd=MYSQL_PASS, unix_socket=MYSQL_SOCKET)
    cursor = db.cursor(MySQLdb.cursors.DictCursor)
    fd = open('/usr/share/dict/words')
    # for each line in the word list...
    for line in fd:
        word = line.strip()
        # check if this row has been added (parameterized to avoid quoting bugs)
        cursor.execute('SELECT 1 FROM test WHERE `test_data`=%s', (word,))
        if cursor.rowcount > 0:
            continue
        # add the row if the value isn't present
        try:
            cursor.execute('INSERT INTO test (`test_thread`,`test_data`) VALUES (%s,%s)', (os.getpid(), word))
            db.commit()
        except MySQLdb.IntegrityError:
            db.rollback()

if __name__=="__main__":
    for i in range(24):
        newproc = multiprocessing.Process(target=worker,args=())
        newproc.start()

These are the typical gaps I get:

mysql> SELECT * FROM test ORDER BY test_id ASC;
+---------+-------------+-----------+
| test_id | test_thread | test_data |
+---------+-------------+-----------+
|       1 |       29073 | 1080      |
|       6 |       29068 | 10-point  |
|      10 |       29085 | 10th      |
|      13 |       29086 | 11-point  |
|      14 |       29078 | 12-point  |
|      23 |       29067 | 16-point  |
|      24 |       29073 | 18-point  |
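To put a number on the gaps in the output above, a small helper can count the IDs skipped between consecutive rows. This is just an illustrative sketch; `gap_sizes` is a hypothetical name, and the ID list is copied by hand from the rows shown.

```python
def gap_sizes(ids):
    """Return how many ids were skipped between each pair of consecutive ids."""
    ordered = sorted(ids)
    return [b - a - 1 for a, b in zip(ordered, ordered[1:])]


ids = [1, 6, 10, 13, 14, 23, 24]  # test_id values from the output above
gaps = gap_sizes(ids)             # skipped ids between neighbouring rows
skipped = sum(gaps)               # total ids consumed without a stored row
```

For these seven rows, `gaps` is `[4, 3, 2, 0, 8, 0]` and `skipped` is 17, meaning that 17 of the first 24 IDs were consumed without producing a row.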

0 answers:

No answers