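This script takes forever to run, and sometimes it even crashes the whole server. shop_designs has more than 3,000 entries and shop_tshirts has more than 50,000 entries.
How can I optimize it so that it executes faster?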
$query = "SELECT * FROM shop_designs";
$res = mysql_query($query) or die(mysql_error());
while($row = mysql_fetch_array($res)) {
    $design = $row["design_id"];
    $rating = $row["rating"];
    $hits = $row["hits"];
    $retours = $row["returns"];
    $sales = $row["sales"];
    $plussales = $row["sales_plus"];
    $featured = $row["featured"];
    $created_timestamp = $row["created_timestamp"];
    $returns = $row["returns"];
    $wishlisted = $row["wishlisted"];
    $vector = $row["vector"];
    $sales = $sales + $plussales - $returns;
    if ($sales > 10) {
        $bonus = $sales * 100;
    } else {
        $bonus = $sales * 50;
    }
    $unan = 60 * 60 * 24 * 365;
    $unjour = 60 * 60 * 24;
    $age_du_design = time() - $created_timestamp;
    $age_du_design_en_jours = $age_du_design / $unjour;
    $age_du_design_an = $age_du_design / $unan;
    if ($age_du_design < $unan) {
        $age_du_design = $unan;
        $age_du_design_an = "1";
    }
    if ($sales == "1") {
        $sales_per_year = $sales / $age_du_design_an;
    } else {
        $sales_per_year = $sales / $age_du_design_an;
    }
    if ($sales < 1) {
        $sales_per_year = "0";
    }
    $indicehits = ($hits / 1000) / 3;
    $calculsales = $sales + $wishlisted - $retours;
    $cote = $calculsales + $indicehits;
    $cote = round($cote, 2);
    if ($vector == "1") {
        $cote * 1.25;
    }
    echo "Update D#$design.......";
    $yquery = "UPDATE shop_designs SET rating='$cote', sales_per_year='$sales_per_year' WHERE design_id='$design'";
    mysql_query($yquery) or die(mysql_error());
    $rating_hits = $hits + $bonus;
    $rating_hits = $rating_hits / 2;
    $rating_hits = round($rating_hits);
    $zyquery = "UPDATE shop_tshirts SET wishlisted='$wishlisted', rating='$cote', hits='$hits', rating_hits='$rating_hits',featured='$featured',tshirt_sales='$sales',sales_per_year='$sales_per_year' WHERE design_id='$design'";
    mysql_query($zyquery) or die(mysql_error());
}
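Answer 0 (score: 2)
Processing RBAR (row by agonizing row) is going to be slow, for every single row. The design of this algorithm completely ignores SQL's ability to process data in sets.
How to redesign this for better performance: rewrite it so that it runs far fewer SQL UPDATE statements.
Before we get started... there is some odd logic in the code. For example, what is the result of this:
if ($vector == "1") {
    $cote * 1.25;
}
We see $cote being multiplied by 1.25, but the return value of that operation is not stored anywhere. The result is simply discarded.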
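And why do we need a conditional test here:
if ($sales == "1") {
    $sales_per_year = $sales / $age_du_design_an;
} else {
    $sales_per_year = $sales / $age_du_design_an;
}
So if some condition is true, we assign a value to $sales_per_year. Otherwise, we assign exactly the same value to $sales_per_year. Why do we need the conditional test at all?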
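The whole rigmarole with the age in seconds, the age in days, and the age in years basically boils down to taking the greater of a) one year, or b) the calculated age.
The calculation of $calculsales subtracts $retours (which holds the same value as $returns) from $sales. That is not invalid, but it is a bit curious, because $returns has already been subtracted from $sales earlier.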
Apart from those issues, I don't see anything here that couldn't be done with SQL expressions inside a single SQL statement.
For example:
SELECT n.*
  FROM ( SELECT v.design_id
              , v.wishlisted
              , v.rating
              , v.hits
              , ROUND((v.hits + v.sales * IF(v.sales > 10, 100, 50)) / 2) AS rating_hits
              , v.featured
              , v.sales
              , IF(v.sales < 1, 0, v.sales / v.age_du_design_an) AS sales_per_year
              , ROUND(v.sales + v.wishlisted - v.returns + v.hits/3000, 2)
                -- * CASE WHEN v.vector = 1 THEN 1.25 ELSE 1.00 END
                AS cote
           FROM ( SELECT d.design_id
                       , d.rating
                       , d.hits
                       , d.featured
                       , (d.sales + d.sales_plus - d.returns) AS sales
                       , GREATEST((UNIX_TIMESTAMP(NOW()) - d.created_timestamp)/(60*60*24*365), 1) AS age_du_design_an
                       , d.returns
                       , d.wishlisted
                       , d.vector
                    FROM shop_designs d
                ) v
       ) n
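But rather than fetching individual rows and then issuing a huge number of individual UPDATE statements, one per design_id, we can perform the update as a JOIN operation between the query above and the target table.
We can write it as a SELECT statement first, for testing, and then convert it into an UPDATE statement, like this: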
UPDATE shop_tshirts t
  JOIN ( SELECT v.design_id
              , v.wishlisted
              , v.rating
              , v.hits
              , ROUND((v.hits + v.sales * IF(v.sales > 10, 100, 50)) / 2) AS rating_hits
              , v.featured
              , v.sales
              , IF(v.sales < 1, 0, v.sales / v.age_du_design_an) AS sales_per_year
              , ROUND(v.sales + v.wishlisted - v.returns + v.hits/3000, 2)
                -- * CASE WHEN v.vector = 1 THEN 1.25 ELSE 1.00 END
                AS cote
           FROM ( SELECT d.design_id
                       , d.rating
                       , d.hits
                       , d.featured
                       , (d.sales + d.sales_plus - d.returns) AS sales
                       , GREATEST((UNIX_TIMESTAMP(NOW()) - d.created_timestamp)/(60*60*24*365), 1) AS age_du_design_an
                       , d.returns
                       , d.wishlisted
                       , d.vector
                    FROM shop_designs d
                ) v
       ) n
    ON n.design_id = t.design_id
   SET t.wishlisted     = n.wishlisted
     , t.rating         = n.cote
     , t.hits           = n.hits
     , t.rating_hits    = n.rating_hits
     , t.featured       = n.featured
     , t.tshirt_sales   = n.sales
     , t.sales_per_year = n.sales_per_year
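A single execution of that statement updates every row in shop_tshirts that has a matching design_id, in one fell swoop. We can do something similar for the other table.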
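For that other table, shop_designs, a set-based statement might look like the sketch below. It assumes the column names that appear in the question's PHP code (sales_plus, created_timestamp, and so on) and an already-open mysqli connection; the names $designUpdateSql and updateAllDesigns are hypothetical and only serve the illustration.

```php
<?php
// Sketch only: set-based counterpart of the per-row UPDATE on shop_designs.
// No JOIN is needed because source and target are the same table.
$designUpdateSql = <<<'SQL'
UPDATE shop_designs d
   SET d.rating = ROUND((d.sales + d.sales_plus - d.returns)
                        + d.wishlisted - d.returns + d.hits / 3000, 2)
     , d.sales_per_year = IF(d.sales + d.sales_plus - d.returns < 1
                            , 0
                            , (d.sales + d.sales_plus - d.returns)
                              / GREATEST((UNIX_TIMESTAMP(NOW()) - d.created_timestamp)
                                         / (60*60*24*365), 1))
SQL;

/**
 * Runs the statement once and returns the number of changed rows.
 * $db is an assumed, already-open mysqli connection.
 */
function updateAllDesigns(mysqli $db, string $sql): int
{
    if (!$db->query($sql)) {
        throw new RuntimeException($db->error);
    }
    return $db->affected_rows;
}
```

Neither assignment reads rating or sales_per_year, so the order in which MySQL applies them does not change the result.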
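That is how we improve the performance.
Follow-up
If you do not process the UPDATE as a set and instead keep processing RBAR (row by agonizing row), then make sure that suitable indexes are defined on the shop_tshirts and shop_designs tables, with design_id as the leading column.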
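Answer 1 (score: 0)
I mean something like this:
<?php
$query = "SELECT * FROM shop_designs";
$res = mysql_query($query) or die(mysql_error());
$updates = ""; // collects every UPDATE statement built in the loop
while($row = mysql_fetch_array($res)) {
    // ...
    echo "Update D#$design.......";
    // .= appends to a string; += would perform numeric addition instead
    $updates .= "UPDATE shop_designs SET rating='$cote', sales_per_year='$sales_per_year' WHERE design_id='$design';";
    // ...
    $updates .= "UPDATE shop_tshirts SET wishlisted='$wishlisted', rating='$cote', hits='$hits', rating_hits='$rating_hits',featured='$featured',tshirt_sales='$sales',sales_per_year='$sales_per_year' WHERE design_id='$design';";
}
// Schematic: mysql_query() sends only one statement per call, so running the
// whole batch at once needs multi-statement support, e.g. mysqli_multi_query().
mysql_query($updates) or die(mysql_error());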
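In your previous implementation, PHP goes to the database connection more than 6,000 times. With this approach it only needs to go there twice: first for the SELECT query, and then for the batched UPDATE query. This holds only when the database API accepts several statements in one call, which mysql_query() does not.
[Edit]
Also, $unan = 60 * 60 * 24 * 365;, $unjour = 60 * 60 * 24; and the call to time() can be moved outside the loop. It will not make a noticeable difference, but it is the more correct way to write it.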
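The batching idea in this answer could be sketched as follows. This is an illustration under stated assumptions rather than the answerer's code: it swaps in mysqli::multi_query because mysql_query() cannot send several statements at once, it assumes all values are numeric, and buildBatch and runBatch are hypothetical helper names.

```php
<?php
// Sketch only: assumes every value is numeric, as the question's columns appear to be.

/** Builds one UPDATE per design and joins the statements with semicolons. */
function buildBatch(array $designs): string
{
    $statements = [];
    foreach ($designs as $d) {
        // %d and %F coerce their arguments to numbers, so no raw text reaches the SQL.
        $statements[] = sprintf(
            "UPDATE shop_designs SET rating=%F, sales_per_year=%F WHERE design_id=%d",
            $d['cote'],
            $d['sales_per_year'],
            $d['design_id']
        );
    }
    return implode(';', $statements);
}

/** Sends the batch in one call; $db is an assumed, already-open mysqli connection. */
function runBatch(mysqli $db, string $batch): void
{
    if ($batch === '') {
        return; // nothing to send
    }
    if (!$db->multi_query($batch)) {
        throw new RuntimeException($db->error);
    }
    // Step through every pending result so the connection is usable afterwards.
    while ($db->more_results()) {
        if (!$db->next_result()) {
            throw new RuntimeException($db->error);
        }
    }
}
```

A batch of several thousand statements can run into the server's max_allowed_packet limit, so splitting it into chunks may be necessary. The server still executes one UPDATE per design; only the number of round trips goes down.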