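I have a very large result set: nearly 2 GB of product data spread across several tables, with about 500,000 records in each table. I need to process every record in order to export it to a set of files.
The code below crashes the server while it tries to hold the result set. As a workaround I switched to a query that fetches only the primary ID of each matching record, and then run a second query per record to fetch that single product by its primary ID. All those secondary queries make this very inefficient and hard on the database.
Here is the query and code that crash it. How can I get this to work?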
$query =
"SELECT SQL_NO_CACHE SQL_BIG_RESULT
products.*,
inventory.*,
pricing.*,
markets.*
FROM
products,
categories,
markets,
pricing,
inventory
WHERE
products.catid = categories.id AND
markets.id = products.marketid AND
pricing.productid = products.id AND
inventory.productid = products.id AND
inventory.all_stock > 0 AND
products.sale = 'Y' AND
categories.active = 'Y' AND
inventory.last_update > UNIX_TIMESTAMP(NOW() - INTERVAL 1 DAY)
GROUP BY
products.id";
$Db = new DbConnector();
$r = $Db->query($query); // !Never gets past this point!
while ($product = $r->fetch(PDO::FETCH_ASSOC)) {
// Stuff gets done here.
}
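Answer 0 (score: 0)
Does the query run fine when you execute it directly on the database server? If so, the bottleneck is most likely your web server and its communication with the database server. If you are pulling that much data, or if you are forced to run a large number of queries (for example, an additional query for every ID you retrieve), I would suggest using stored procedures (MySQL calls them "routines"). You can get started here: http://net.tutsplus.com/tutorials/an-introduction-to-stored-procedures/
Answer 1 (score: 0)
Could you put just the id fields into a temporary table, and then "hydrate" and process the full rows in batches?
First, the temporary table with only the ids: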
CREATE TEMPORARY TABLE tempy
SELECT SQL_NO_CACHE SQL_BIG_RESULT
products.id AS product_id,
inventory.id AS inventory_id,
pricing.id AS pricing_id,
markets.id AS markets_id
FROM
products,
categories,
markets,
pricing,
inventory
WHERE
products.catid = categories.id AND
markets.id = products.marketid AND
pricing.productid = products.id AND
inventory.productid = products.id AND
inventory.all_stock > 0 AND
products.sale = 'Y' AND
categories.active = 'Y' AND
inventory.last_update > UNIX_TIMESTAMP(NOW() - INTERVAL 1 DAY)
GROUP BY
products.id
Repeat this query until everything has been processed, increasing the OFFSET value at each step:
SELECT SQL_NO_CACHE SQL_BIG_RESULT
products.*,
inventory.*,
pricing.*,
markets.*
FROM
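( SELECT *
FROM tempy
ORDER BY product_id -- any stable ordering will do
LIMIT 1000 -- slice size
OFFSET 123000 -- slice size * slice number; MySQL needs a literal integer here, so compute 1000 * 123 in the caller
) AS t,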
products,
inventory,
pricing,
markets
WHERE
products.id = t.product_id AND
inventory.id = t.inventory_id AND
pricing.id = t.pricing_id AND
markets.id = t.markets_id
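To make the "repeat with a growing OFFSET" step concrete, here is a minimal PHP sketch of the loop. It assumes a plain PDO connection instead of the DbConnector wrapper from the question, and the function name exportInSlices and the $handleRow callback are made up for illustration. It also assumes the tempy table above already exists on the same connection, since a MySQL temporary table is only visible to the session that created it.

```php
<?php
declare(strict_types=1);

/**
 * Walk the id-only table `tempy` in fixed-size slices and pass every
 * hydrated row to $handleRow. Returns the number of rows handled.
 * (Hypothetical helper; the name and signature are not from the thread.)
 */
function exportInSlices(PDO $pdo, int $sliceSize, callable $handleRow): int
{
    // Count the ids once so the loop knows when to stop.
    $total = (int) $pdo->query('SELECT COUNT(*) FROM tempy')->fetchColumn();

    $stmt = $pdo->prepare(
        'SELECT products.*, inventory.*, pricing.*, markets.*
         FROM (
             SELECT * FROM tempy
             ORDER BY product_id
             LIMIT :slice_size OFFSET :slice_offset
         ) AS t
         JOIN products  ON products.id  = t.product_id
         JOIN inventory ON inventory.id = t.inventory_id
         JOIN pricing   ON pricing.id   = t.pricing_id
         JOIN markets   ON markets.id   = t.markets_id'
    );

    $handled = 0;
    for ($offset = 0; $offset < $total; $offset += $sliceSize) {
        // LIMIT/OFFSET values must be bound as integers, not strings.
        $stmt->bindValue(':slice_size', $sliceSize, PDO::PARAM_INT);
        $stmt->bindValue(':slice_offset', $offset, PDO::PARAM_INT);
        $stmt->execute();

        while ($product = $stmt->fetch(PDO::FETCH_ASSOC)) {
            $handleRow($product); // e.g. write the row to an export file
            $handled++;
        }
        $stmt->closeCursor(); // release this slice before fetching the next
    }

    return $handled;
}
```

Only one slice of full rows is requested at a time, so the slice size is the knob for trading memory against the number of round trips. Note that with PDO::FETCH_ASSOC, columns that share a name across the joined tables (such as id) overwrite each other in the row array, so it may be worth listing and aliasing the columns you need instead of using the `*` selectors.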