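I need to build a query that calculates an average and a count while ignoring outliers, based on the standard deviation.
I have two tables in MySQL (P and A) with the following attributes: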
P = Payments:
Value_gbp
Paymentid
Account
rfx_ref
A = Accounts:
Accountid
Entity_type
Settlement_model
rfx_ref
So far I have got this:
SELECT
Account,
COUNT(value_GBP) AS '# Of Payments',
TRUNCATE(AVG(value_GBP),2) As 'Avg Value'
FROM payments,
LEFT JOIN(
SELECT STDDEV(value_gbp) as std_gbp
FROM payments, accounts
WHERE payments.paymentid = accounts.acountid
AND Entity_type = 'company'
AND settlement_model = 'payment agent'
GROUP BY account
) outlier
On payments.paymentid = accounts.acountid
WHERE payments.value_gbp<=outlier.std_gbp*2
AND Entity_type = 'company'
AND settlement_model = 'payment agent'
GROUP BY account
But it reports an error at:
On payments.paymentid = accounts.acountid
Can anyone help me?
Answer 0 (score: 0)
The subquery needs to select accounts.accountid, and then you need to use it in the JOIN condition.
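I also think your definition of an outlier is wrong. An outlier is not a value greater than 2 standard deviations; it is a value that lies more than 2 standard deviations away from the mean. So the subquery needs to return both the mean and the standard deviation, and the outer query then compares each payment's distance from the mean against that threshold.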
-- The join key (paymentid = accountid) is kept from the question;
-- if payments.account holds the account ID, join on that column instead.
SELECT
    account,
    COUNT(value_gbp) AS '# Of Payments',
    TRUNCATE(AVG(value_gbp), 2) AS 'Avg Value'
FROM payments
JOIN (
    SELECT accountid, AVG(value_gbp) AS avg_gbp, STDDEV(value_gbp) AS std_gbp
    FROM payments, accounts
    WHERE payments.paymentid = accounts.accountid
    AND Entity_type = 'company'
    AND settlement_model = 'payment agent'
    GROUP BY accountid
) outlier
ON payments.paymentid = outlier.accountid
JOIN accounts ON payments.paymentid = accounts.accountid
WHERE ABS(payments.value_gbp - outlier.avg_gbp) <= outlier.std_gbp * 2
AND Entity_type = 'company'
AND settlement_model = 'payment agent'
GROUP BY account
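To try the "within 2 standard deviations of the mean" filter on something small, here is a minimal, self-contained sketch for MySQL. Several things in it are assumptions rather than facts from the question: the column types, the sample rows, the `stats` alias, and above all the idea that `payments.account` holds the `accountid` of the owning account (the question does not say how the tables relate). The `rfx_ref` columns are left out because the query does not use them.

```sql
-- Hypothetical schema: column types are guesses, rfx_ref omitted.
CREATE TABLE accounts (
    accountid        INT PRIMARY KEY,
    entity_type      VARCHAR(20),
    settlement_model VARCHAR(20)
);

CREATE TABLE payments (
    paymentid INT PRIMARY KEY,
    account   INT,              -- assumed to reference accounts.accountid
    value_gbp DECIMAL(10, 2)
);

-- Made-up sample data: account 1 has five small payments and one large one.
INSERT INTO accounts VALUES
    (1, 'company',    'payment agent'),
    (2, 'individual', 'payment agent');

INSERT INTO payments VALUES
    (1, 1,  10.00), (2, 1, 12.00), (3, 1, 11.00),
    (4, 1,   9.00), (5, 1,  8.00), (6, 1, 100.00),
    (7, 2,  50.00);

SELECT
    p.account,
    COUNT(p.value_gbp)            AS `# Of Payments`,
    TRUNCATE(AVG(p.value_gbp), 2) AS `Avg Value`
FROM payments AS p
JOIN accounts AS a
    ON p.account = a.accountid
JOIN (
    -- Per-account mean and (population) standard deviation.
    SELECT account,
           AVG(value_gbp)    AS avg_gbp,
           STDDEV(value_gbp) AS std_gbp
    FROM payments
    GROUP BY account
) AS stats
    ON stats.account = p.account
WHERE a.entity_type = 'company'
  AND a.settlement_model = 'payment agent'
  -- Keep only payments within 2 standard deviations of the account mean.
  AND ABS(p.value_gbp - stats.avg_gbp) <= 2 * stats.std_gbp
GROUP BY p.account;
```

With these six made-up payments for account 1, the mean is 25 and the population standard deviation (which is what MySQL's `STDDEV` returns) is about 33.57, so the 100.00 payment sits 75 away from the mean, beyond the cutoff of roughly 67.13, and is dropped. The query should then report 5 payments with an average of 10.00 for account 1, and nothing for account 2 because it is not a company.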