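I'm building a sales-rank widget for our website. It will show products ordered by their current position in a "top charts" or "bestsellers" list, so to speak.
After some reading, it looks like a good way to implement this is a rolling sales average in which more recent sales carry more weight.
Example: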
$rolling_avg = ((4*$d1)+(3*$d2)+(2*$d3)+$d4+$d5+$d6+$d7)/13;
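Where $d1 is the number of sales completed in the most recent 24 hours, $d2 the number in the 24 hours before that, and so on back to $d7.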
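To make the weighting concrete, here is a minimal sketch of the formula as a plain PHP function, taking the seven daily counts newest first ($d1 through $d7). The function name `weighted_rolling_avg` and the sample counts are made up for illustration; only the weights 4, 3, 2, 1, 1, 1, 1 and the divisor 13 come from the formula above.

```php
<?php
// Hypothetical helper: weights seven daily sales counts (newest first)
// by 4, 3, 2, 1, 1, 1, 1 and divides by the sum of the weights (13).
function weighted_rolling_avg(array $dailyCounts): float
{
    $weights = [4, 3, 2, 1, 1, 1, 1];
    $counts = array_values($dailyCounts); // normalise keys to 0..6
    if (count($counts) !== count($weights)) {
        throw new InvalidArgumentException('Expected exactly 7 daily counts.');
    }
    $total = 0;
    foreach ($weights as $i => $weight) {
        $total += $weight * $counts[$i];
    }
    return $total / array_sum($weights);
}

// Made-up counts: (4*10 + 3*8 + 2*6 + 4 + 4 + 2 + 1) / 13 = 87 / 13
$score = weighted_rolling_avg([10, 8, 6, 4, 4, 2, 1]);
```

A product that sold mostly in the last day or two ends up with a higher score than one with the same total spread evenly over the week, which is the behaviour wanted for a "top charts" list.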
Anyway...
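Currently, I'm trying to run this over a product data set of roughly 500k records and write the calculated rank back into the products table so it can be queried later. If possible, I'd like a script that recalculates the rankings and runs from cron every 12 or 24 hours.
Current implementation:
My current implementation takes far too long to execute. I feel that more of the processing needs to happen at the SQL level (with far fewer SELECT queries), but I'm not sure how to start on that.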
$products = mysql_query("SELECT * FROM products ORDER BY id DESC"); // <-- Est 450-500k rows.
while ($product = mysql_fetch_array($products)) {
    $product_id = $product['id'];
    // One COUNT(*) query per 24-hour bucket, i.e. seven queries per product.
    $d1 = mysql_query("SELECT COUNT(*) FROM orders WHERE (product_id = '$product_id') AND (sale_completed BETWEEN (NOW()-INTERVAL 24 HOUR) AND NOW())") or die(mysql_error());
    $d1 = mysql_fetch_array($d1);
    $d2 = mysql_query("SELECT COUNT(*) FROM orders WHERE (product_id = '$product_id') AND (sale_completed BETWEEN (NOW()-INTERVAL 48 HOUR) AND (NOW()-INTERVAL 24 HOUR))");
    $d2 = mysql_fetch_array($d2);
    $d3 = mysql_query("SELECT COUNT(*) FROM orders WHERE (product_id = '$product_id') AND (sale_completed BETWEEN (NOW()-INTERVAL 72 HOUR) AND (NOW()-INTERVAL 48 HOUR))");
    $d3 = mysql_fetch_array($d3);
    $d4 = mysql_query("SELECT COUNT(*) FROM orders WHERE (product_id = '$product_id') AND (sale_completed BETWEEN (NOW()-INTERVAL 96 HOUR) AND (NOW()-INTERVAL 72 HOUR))");
    $d4 = mysql_fetch_array($d4);
    $d5 = mysql_query("SELECT COUNT(*) FROM orders WHERE (product_id = '$product_id') AND (sale_completed BETWEEN (NOW()-INTERVAL 120 HOUR) AND (NOW()-INTERVAL 96 HOUR))");
    $d5 = mysql_fetch_array($d5);
    $d6 = mysql_query("SELECT COUNT(*) FROM orders WHERE (product_id = '$product_id') AND (sale_completed BETWEEN (NOW()-INTERVAL 144 HOUR) AND (NOW()-INTERVAL 120 HOUR))");
    $d6 = mysql_fetch_array($d6);
    $d7 = mysql_query("SELECT COUNT(*) FROM orders WHERE (product_id = '$product_id') AND (sale_completed BETWEEN (NOW()-INTERVAL 168 HOUR) AND (NOW()-INTERVAL 144 HOUR))");
    $d7 = mysql_fetch_array($d7);
    // Pull the single COUNT(*) value out of each result row.
    $d1 = $d1[0];
    $d2 = $d2[0];
    $d3 = $d3[0];
    $d4 = $d4[0];
    $d5 = $d5[0];
    $d6 = $d6[0];
    $d7 = $d7[0];
    $rolling_avg = ((4*$d1)+(3*$d2)+(2*$d3)+$d4+$d5+$d6+$d7)/13;
    mysql_query("UPDATE products SET rolling_sales = '$rolling_avg' WHERE id = '$product_id'");
}
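I'm not sure how to optimize this or where to go from here, but it clearly needs a lot of work.
Before anyone mentions it: I understand that the mysql_* functions are deprecated, and I will move to PDO before this goes to production.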
Answer 0 (score: 0)
Here is a function that calculates the rolling sales figure for one product using a single query.
function get_rolling_sales($product_id) {
    // Cast to int so the value is safe to interpolate into the SQL below.
    $product_id = (int) $product_id;
    $query = <<<EOF
SELECT (
    4 * (
        SELECT COUNT(*) FROM orders WHERE (product_id = $product_id)
        AND (sale_completed BETWEEN (NOW()-INTERVAL 24 HOUR) AND NOW())
    ) + 3 * (
        SELECT COUNT(*) FROM orders WHERE (product_id = $product_id)
        AND (sale_completed BETWEEN (NOW()-INTERVAL 48 HOUR) AND (NOW()-INTERVAL 24 HOUR))
    ) + 2 * (
        SELECT COUNT(*) FROM orders WHERE (product_id = $product_id)
        AND (sale_completed BETWEEN (NOW()-INTERVAL 72 HOUR) AND (NOW()-INTERVAL 48 HOUR))
    ) + (
        SELECT COUNT(*) FROM orders WHERE (product_id = $product_id)
        AND (sale_completed BETWEEN (NOW()-INTERVAL 96 HOUR) AND (NOW()-INTERVAL 72 HOUR))
    ) + (
        SELECT COUNT(*) FROM orders WHERE (product_id = $product_id)
        AND (sale_completed BETWEEN (NOW()-INTERVAL 120 HOUR) AND (NOW()-INTERVAL 96 HOUR))
    ) + (
        SELECT COUNT(*) FROM orders WHERE (product_id = $product_id)
        AND (sale_completed BETWEEN (NOW()-INTERVAL 144 HOUR) AND (NOW()-INTERVAL 120 HOUR))
    ) + (
        SELECT COUNT(*) FROM orders WHERE (product_id = $product_id)
        AND (sale_completed BETWEEN (NOW()-INTERVAL 168 HOUR) AND (NOW()-INTERVAL 144 HOUR))
    )
) / 13 AS rolling_sales
EOF;
    $result = mysql_query($query) or die(mysql_error());
    $row = mysql_fetch_assoc($result);
    return $row['rolling_sales'];
}
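However, iterating over all 500,000 product records will still take a long time. Do you really need all of this information at once (for calculations, say), or do you plan to show it in a paginated table view? If you only want to display the data, you can calculate rolling_sales on demand.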
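If the stored rolling_sales column does need refreshing for every product, one option is to let the database do the whole job in a single statement instead of looping in PHP. The following is a minimal sketch rather than a tested drop-in. It assumes the products and orders columns used in the question, a PDO connection, and that PHP and the database agree on the time zone; the function name `update_rolling_sales` is made up here. The CASE checks the newest bucket first, so a sale that falls exactly on a bucket boundary is counted once, whereas back-to-back BETWEEN ranges would count it twice.

```php
<?php
// Sketch only. Assumes products(id, rolling_sales) and
// orders(product_id, sale_completed) as in the question, and that
// PHP and the database use the same time zone.
function update_rolling_sales(PDO $pdo, DateTimeImmutable $now): void
{
    // Each order contributes its bucket weight (4, 3, 2 or 1); products
    // without sales in the last 7 days fall back to 0 via COALESCE.
    $sql = <<<SQL
UPDATE products
SET rolling_sales = COALESCE((
    SELECT SUM(CASE
               WHEN o.sale_completed >= :d1 THEN 4
               WHEN o.sale_completed >= :d2 THEN 3
               WHEN o.sale_completed >= :d3 THEN 2
               ELSE 1
           END) / 13.0
    FROM orders o
    WHERE o.product_id = products.id
      AND o.sale_completed >= :d7
      AND o.sale_completed <= :now
), 0)
SQL;

    $fmt = 'Y-m-d H:i:s';
    $stmt = $pdo->prepare($sql);
    $stmt->execute([
        ':d1'  => $now->modify('-24 hours')->format($fmt),
        ':d2'  => $now->modify('-48 hours')->format($fmt),
        ':d3'  => $now->modify('-72 hours')->format($fmt),
        ':d7'  => $now->modify('-168 hours')->format($fmt),
        ':now' => $now->format($fmt),
    ]);
}
```

A cron script would then only need to open the connection and call `update_rolling_sales($pdo, new DateTimeImmutable())`. An index on orders (product_id, sale_completed) is likely to matter a great deal here. On MySQL, a JOIN against a subquery grouped by product_id may outperform the correlated subquery, so it is worth measuring both on real data.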