I need to update a million records (user balances).
Which would be best practice / optimal / faster?
Executing multiple single queries for each row:
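// $pdo is assumed to be an open PDO connection; $rows maps id => new balance
$stmt = $pdo->prepare('UPDATE users SET balance = ? WHERE id = ?');
foreach ($rows as $id => $value) {
    $stmt->execute([$value, $id]);
}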
OR
Running a multiple row query:
update users
set balance =
CASE
WHEN id = 1 THEN $value1
WHEN id = 2 THEN $value2
WHEN id = 3 THEN $value3
....
WHEN id = 999998 THEN $value999998
WHEN id = 999999 THEN $value999999
WHEN id = 1000000 THEN $value1000000
END
Also, would I need to put ELSE balance right before END?
Thank you!
Answer 0 (score: 1)
If you truly have a million rows, your second alternative, as written, won't work. There are limits on the length of SQL statements (in MySQL, a statement cannot exceed max_allowed_packet).
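The concept of an UPDATE query without a WHERE clause on a large table containing valuable data is, frankly, frightening. It will update every row, whether you intended that or not. And yes, you would need ELSE balance: without it, the CASE expression yields NULL for every id that no WHEN branch matches, so those balances would be overwritten with NULL.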
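To make that concrete, here is a small sketch of what happens to a row that no WHEN branch matches. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so the example runs anywhere; CASE without ELSE yields NULL in both), and the three-row users table is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, balance REAL)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(1, 10.0), (2, 20.0), (3, 30.0)])

# No ELSE and no WHERE: id 3 matches no WHEN branch, so its balance becomes NULL.
conn.execute("""
    UPDATE users
    SET balance = CASE
        WHEN id = 1 THEN 11.0
        WHEN id = 2 THEN 22.0
    END
""")
unsafe = conn.execute("SELECT id, balance FROM users ORDER BY id").fetchall()

# Put id 3 back, then repeat with ELSE balance and a WHERE that limits the rows touched.
conn.execute("UPDATE users SET balance = 30.0 WHERE id = 3")
conn.execute("""
    UPDATE users
    SET balance = CASE
        WHEN id = 1 THEN 12.0
        WHEN id = 2 THEN 24.0
        ELSE balance
    END
    WHERE id IN (1, 2)
""")
safe = conn.execute("SELECT id, balance FROM users ORDER BY id").fetchall()

print(unsafe)  # [(1, 11.0), (2, 22.0), (3, None)]
print(safe)    # [(1, 12.0), (2, 24.0), (3, 30.0)]
```

The WHERE id IN (...) alone is enough to protect unmatched rows; ELSE balance is a second safety net in case the two lists ever drift apart.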
Do these balance updates really arrive in a batch of a million? If so, it might make sense to upload them into a temporary table with id and value columns, then use something like this to update the original table.
UPDATE users
JOIN temptable ON users.id = temptable.id
SET users.balance = temptable.value
You can load that temporary table using LOAD DATA INFILE or some other fast method of getting the data into your MySQL server. You can then inspect it and make sure it's right before you overwrite a whole bunch of balance values in your users table.
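A rough sketch of that stage, inspect, then update flow, for anyone who wants to try it end to end. Assumptions: it uses Python's built-in sqlite3 as a stand-in for MySQL so it runs without a server, the helper name apply_staged_balances is hypothetical, and the sanity check (rejecting staged ids that don't exist in users) is just one example of an inspection. On MySQL you would bulk-load with LOAD DATA INFILE and use the UPDATE ... JOIN shown above; the correlated subquery here is the portable equivalent.

```python
import sqlite3

def apply_staged_balances(conn, new_balances):
    """Stage (id, value) pairs, sanity-check them, then update users in one statement."""
    conn.execute(
        "CREATE TEMP TABLE temptable (id INTEGER PRIMARY KEY, value REAL NOT NULL)"
    )
    try:
        conn.executemany(
            "INSERT INTO temptable (id, value) VALUES (?, ?)", new_balances
        )
        # Inspect before overwriting anything: count staged ids with no matching user.
        unknown = conn.execute(
            "SELECT COUNT(*) FROM temptable AS t "
            "LEFT JOIN users AS u ON u.id = t.id WHERE u.id IS NULL"
        ).fetchone()[0]
        if unknown:
            raise ValueError(f"{unknown} staged ids do not exist in users")
        cur = conn.execute(
            "UPDATE users "
            "SET balance = (SELECT value FROM temptable WHERE temptable.id = users.id) "
            "WHERE id IN (SELECT id FROM temptable)"
        )
        conn.commit()
        return cur.rowcount
    finally:
        conn.rollback()  # no-op after a successful commit; discards staged rows on error
        conn.execute("DROP TABLE temptable")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, balance REAL)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(1, 10.0), (2, 20.0), (3, 30.0)])
conn.commit()

updated = apply_staged_balances(conn, [(1, 15.0), (3, 35.0)])
rows = conn.execute("SELECT id, balance FROM users ORDER BY id").fetchall()
print(updated, rows)  # 2 [(1, 15.0), (2, 20.0), (3, 35.0)]
```

User 2 is not in the staged data, so its balance is left alone, and a bad batch is rejected before the users table is touched.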
If a whole lot of users get the same exact change to their balance, you can do something like this:
UPDATE users
SET balance = balance + 0.10
WHERE user_category = 'gets_a_dime_a_day'
or something like that. Obviously I don't know what your WHERE clause should contain.
Answer 1 (score: -1)
I believe running a single query is faster because it avoids the per-query round-trip overhead. However, you would have to run a benchmark to prove that.
Also, running queries inside a foreach loop is generally considered bad practice.
Answer 2 (score: -1)
If you are updating millions of records, you probably don't want to do it all in one query: that would be faster, but it would lock the table for a long time. It is probably best to use a hybrid approach and chunk your updates into smaller groups.
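One way the chunked approach might look, as a minimal sketch. Assumptions: Python's built-in sqlite3 stands in for MySQL, the helper names chunked and update_in_chunks are made up, and the chunk size of 1000 is an arbitrary starting point you would tune by measuring. Each chunk is committed in its own transaction, so locks are released between chunks instead of being held for the whole run.

```python
import sqlite3
from itertools import islice

def chunked(pairs, size):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(pairs)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

def update_in_chunks(conn, new_balances, chunk_size=1000):
    """Apply (id, value) pairs, committing after every chunk. Returns rows updated."""
    total = 0
    for chunk in chunked(new_balances, chunk_size):
        cur = conn.executemany(
            "UPDATE users SET balance = ? WHERE id = ?",
            [(value, user_id) for user_id, value in chunk],
        )
        conn.commit()  # end this chunk's transaction before starting the next
        total += cur.rowcount
    return total

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, balance REAL)")
conn.executemany("INSERT INTO users VALUES (?, 0.0)", [(i,) for i in range(1, 11)])
conn.commit()

total = update_in_chunks(conn, ((i, i * 1.5) for i in range(1, 11)), chunk_size=3)
balances = conn.execute("SELECT id, balance FROM users ORDER BY id").fetchall()
print(total)  # 10
```

Because new_balances is consumed lazily, the million pairs never have to sit in memory at once. The trade-off is that a failure partway through leaves earlier chunks committed, so the job needs to be safe to re-run.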