I tried using some long strings as field/line terminators, but every time some of the entries got corrupted. The BLOB is a numpy.array of integers stored as uint32 (numpy.array(intlist, dtype='uint32').tostring()); after loading it back (numpy.fromstring(blob, dtype='uint32')), numpy reports a "string size must be a multiple of element size" error.
mysql -p somedb -e "LOAD DATA LOCAL INFILE '/tmp/tmpBZRvWK' INTO TABLE `hash2protids` FIELDS TERMINATED BY '....|....' LINES TERMINATED BY '....|....\n'"
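For illustration only (not the original script), here is the intended round-trip, written with tobytes()/frombuffer(), the non-deprecated names for tostring()/fromstring(); the comments sketch why the terminator-based export breaks it:

import numpy as np

# Serialize a list of integers to raw uint32 bytes, as in the question.
blob = np.array([1, 2, 3, 4], dtype='uint32').tobytes()    # 16 raw bytes
restored = np.frombuffer(blob, dtype='uint32')              # fine: 16 % 4 == 0

# Raw uint32 data can contain any byte value, including the bytes of the
# chosen field/line terminators, so LOAD DATA INFILE can split or truncate a
# blob mid-value; the reloaded string then no longer has a length divisible
# by 4, which is exactly the "string size must be a multiple of element size"
# error numpy raises.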
EDIT
I have tried INSERT INTO hash2protids (hash, protids) VALUES ('key', LOAD_FILE('/tmp/tmpfile')); (Insert file into mysql Blob), but LOAD_FILE works only with files located on the server. On top of that I have 3M rows to insert, so it is far too slow anyway...
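As a client-side alternative to LOAD_FILE, the file's bytes can be bound as a parameter instead; a minimal sketch, assuming an open MySQLdb connection named conn:

# Sketch only: single-row insert with the blob bound as a parameter,
# so the file can live on the client machine rather than the server.
with open('/tmp/tmpfile', 'rb') as fh:
    blob = fh.read()
cur = conn.cursor()
cur.execute("INSERT INTO hash2protids (hash, protids) VALUES (%s, %s)", ('key', blob))
conn.commit()

This still issues one statement per row, though, which is what the batched executemany below addresses.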
EDIT2
For now I rely on MySQLdb's executemany; the code looks like this:
import numpy

step = 1000     # rows per executemany() batch
data = []
cmd = "INSERT INTO sometable (hash, protids) VALUES (%s, %s)"
# `cur` is an open MySQLdb cursor; GENERATOR yields (hash, protids) pairs
for (hash, protids) in GENERATOR:
    pblob = numpy.array(protids, dtype='uint32').tostring()
    data.append((hash, pblob))
    if len(data) > step:
        cur.executemany(cmd, data)
        data = []
if data:        # flush the last, partial batch
    cur.executemany(cmd, data)
Note that I split cur.executemany into batches of 1000 entries, because executing all the entries at once hit a MySQL error caused by the 'max_allowed_packet' limit.
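The batch size has to keep each packet under that server limit; a quick way to check it (a sketch, again assuming a MySQLdb connection conn) is:

# Sketch: inspect the server's packet limit before picking a batch size.
cur = conn.cursor()
cur.execute("SHOW VARIABLES LIKE 'max_allowed_packet'")
name, value = cur.fetchone()
print(name, int(value))   # commonly 4 MB on older MySQL servers

If the limit is too small, an administrator can raise it (for example with SET GLOBAL max_allowed_packet, or in my.cnf); otherwise keeping the batches small, as above, works fine.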