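I am trying to filter out outliers in MySQL. However, the outliers are still included when the average is calculated.
For example, if I receive 6 orders with shipping costs of 45, 50, 180, 10, 55, and 52, I would expect 180 and 10 to be dropped from the average... but they are not.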
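To make that expectation concrete, here is a quick check of the arithmetic in Python rather than SQL. It is only a sketch: it assumes the cutoff is 1 standard deviation from the mean (the `> 1` test in my query) and uses the population standard deviation, which is what MySQL's `STDDEV()` computes. The variable names are mine, for illustration only.

```python
from statistics import mean, pstdev

# Shipping costs from the six example orders.
ship_costs = [45, 50, 180, 10, 55, 52]

# Assumed cutoff: anything more than 1 standard deviation from the mean is an outlier.
threshold = 1

mu = mean(ship_costs)
sigma = pstdev(ship_costs)  # population standard deviation, like MySQL STDDEV()

# Split the values by their distance from the mean, measured in standard deviations.
kept = [c for c in ship_costs if abs(c - mu) / sigma <= threshold]
dropped = [c for c in ship_costs if abs(c - mu) / sigma > threshold]

print("kept:", kept)
print("dropped:", dropped)
print("filtered average:", mean(kept))
```

With these numbers the mean is about 65.33 and the standard deviation about 53.45, so 180 and 10 are the only values more than one deviation away. The average I expect after filtering is 50.5.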
Here is my current query:
SELECT
    GROUP_CONCAT(o.orderid),
    oi.productid,
    AVG(o.actual_cost - (oi.wholesale_cost * oi.amount)) AS avg_ship_cost,
    STDDEV(o.actual_cost - (oi.wholesale_cost * oi.amount)) AS std_dev
FROM
    orders AS o,
    order_items AS oi,
    products AS p /* Needed to filter any deleted from products table since order placed. */
    LEFT JOIN (
        SELECT
            GROUP_CONCAT(o.orderid),
            oi.productid,
            o.invoice_total - (oi.wholesale_cost * oi.amount) AS order_ship_cost,
            AVG(o.invoice_total - (oi.wholesale_cost * oi.amount)) AS avg_ship_cost,
            STDDEV(o.invoice_total - (oi.wholesale_cost * oi.amount)) AS std_dev
        FROM
            orders AS o_lj,
            order_items AS oi_lj
            CROSS JOIN (
                SELECT
                    AVG(o_a.invoice_total - (oi_a.wholesale_cost * oi_a.amount)) AS mean,
                    STDDEV(o_a.invoice_total - (oi_a.wholesale_cost * oi_a.amount)) AS dev
                FROM
                    orders AS o_a,
                    order_items AS oi_a
                WHERE
                    o_a.orderid = oi_a.orderid AND
                    oi_a.productid = p_a.productid AND
                    o_a.date > UNIX_TIMESTAMP(NOW() - INTERVAL 90 DAY) AND
                    oi_a.amount = 1 AND
                    o_a.invoice_total > 0
            ) a
        WHERE
            o_lj.orderid = oi_lj.orderid AND
            o_lj.date > UNIX_TIMESTAMP(NOW() - INTERVAL 90 DAY) AND
            oi_lj.amount = 1 AND
            o_lj.invoice_total > 0 AND
            ABS(o_lj.invoice_total - (oi_lj.wholesale_cost * oi_lj.amount) - a.mean) / a.dev > 1
        GROUP BY
            oi_lj.productid
    ) lj
        ON lj.productid = oi.productid
WHERE
    o.orderid = oi.orderid AND
    p.productid = oi.productid AND
    o.date > UNIX_TIMESTAMP(NOW() - INTERVAL 90 DAY) AND
    oi.amount = 1 AND
    o.invoice_total > 0 AND
    lj.productid IS NULL
GROUP BY
    oi.productid
Changing the number of deviations from the mean that are allowed does not fix the problem.
Why does his query work while mine does not?