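I'm working with a large, heavily queried table (> 500 million rows), so avoiding expensive queries is very important.
Right now I need to fetch some values that meet a condition (explained in more detail below) and then check whether those values appear in another set of values (all of them taken from the same column). I'm using WITH to create a view over the table.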
So, this is the table structure (table employee):
+--------+-------------+-----------+--------+---------+-----------+
| period | employee_id | operation | sub_op | payment | work_zone |
+--------+-------------+-----------+--------+---------+-----------+
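The period is in the format 'YYMM'; one period refers to one month.
Of course, the real table is much wider than this sample, but these are the only fields I need for the query. First a brief explanation of what I need, then the query itself.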
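I need to get all employee_id values in the current period that have a payment of at least $250 and a specific operation (I first group operations by their sub_op values). The operation value I'm asking about is 97; in the query you'll see how I group it.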
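Then, based on those values, I group the result by work_zone and by the grouped operation value. This is where the subqueries start: for each group I need several counts, which are the subqueries in the query below.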
So, this is my query so far (I'm using period '1109'):
CREATE OR REPLACE VIEW hired_fired AS
WITH query_hired_fired AS (
SELECT work_zone, operation, sub_op, employee_id,
CASE
WHEN operation = 97 THEN
CASE
WHEN sub_op IN (1,3,5) THEN 'Cookers'
WHEN sub_op IN (2,6) THEN 'Waitress'
WHEN sub_op IN (4,7,8,9,10) THEN 'Cashier'
WHEN sub_op = 11 THEN 'Security'
WHEN sub_op IN (12,13) THEN 'Cleaners'
ELSE 'Others'
END
END AS opgroup
FROM employee
WHERE period = 1109 AND payment >= 250 AND operation = 97
)
SELECT 201109 AS periodo, opgroup, work_zone,
(SELECT COUNT(DISTINCT employee_id) FROM query_hired_fired WHERE employee_id NOT IN (SELECT employee_id FROM employee WHERE period = 1108 AND payment >= 250 AND operation = 97)) AS total,
(SELECT COUNT(DISTINCT employee_id) FROM query_hired_fired WHERE employee_id NOT IN (SELECT employee_id FROM employee WHERE period BETWEEN 0808 AND 1108 AND payment >= 250 AND operation = 97)) AS absolut,
(SELECT COUNT(DISTINCT employee_id) FROM query_hired_fired WHERE employee_id IN (SELECT employee_id FROM employee WHERE period BETWEEN 0808 AND 1108 AND payment >= 250 AND operation = 97)) AS reincorporated,
(SELECT COUNT(DISTINCT employee_id) FROM query_hired_fired WHERE employee_id IN (SELECT employee_id FROM employee WHERE period BETWEEN 0808 AND 1108 AND payment >= 250 AND operation != 97)) AS operation_change,
(SELECT COUNT(DISTINCT employee_id) FROM query_hired_fired WHERE employee_id IN (SELECT employee_id FROM employee WHERE period BETWEEN 0808 AND 1108 AND payment < 250 AND operation = 97)) AS raised
FROM query_hired_fired
GROUP BY work_zone, opgroup
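So, my question is: is there any way to run this query without all the subqueries? I think it would take a couple of hours to run, and that is not compatible with how heavily this table is used.
Sorry if I've been unclear about anything; I'll answer any questions and doubts as soon as I can. Thanks.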
Answer 0 (score: 1)
Try this query:
WITH query_hired_fired AS (
SELECT period, payment, work_zone, operation, sub_op, employee_id,
CASE
WHEN operation = 97 THEN
CASE
WHEN sub_op IN (1,3,5) THEN 'Cookers'
WHEN sub_op IN (2,6) THEN 'Waitress'
WHEN sub_op IN (4,7,8,9,10) THEN 'Cashier'
WHEN sub_op = 11 THEN 'Security'
WHEN sub_op IN (12,13) THEN 'Cleaners'
ELSE 'Others'
END
END AS opgroup
FROM employee
)
SELECT opgroup, work_zone,
SUM( x_period_1109 * x_total ) As total,
SUM( x_period_1109 * x_absolut ) As absolut,
SUM( x_period_1109 * x_reincorporated ) As reincorporated,
SUM( x_period_1109 * x_operation_change ) As operation_change,
SUM( x_period_1109 * x_raised ) As raised
FROM (
SELECT opgroup, work_zone, employee_id,
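-- total and absolut are NOT IN checks in the question, so each flag is 1 only when no matching earlier row exists
1 - MAX( CASE WHEN period = 1108 AND payment >= 250 AND operation = 97 THEN 1 ELSE 0 END ) as x_total,
1 - MAX( CASE WHEN period BETWEEN 0808 AND 1108 AND payment >= 250 AND operation = 97 THEN 1 ELSE 0 END ) as x_absolut,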
MAX( CASE WHEN period BETWEEN 0808 AND 1108 AND payment >= 250 AND operation = 97 THEN 1 ELSE 0 END ) as x_reincorporated,
MAX( CASE WHEN period BETWEEN 0808 AND 1108 AND payment >= 250 AND operation != 97 THEN 1 ELSE 0 END ) as x_operation_change,
MAX( CASE WHEN period BETWEEN 0808 AND 1108 AND payment < 250 AND operation = 97 THEN 1 ELSE 0 END ) as x_raised,
MAX( CASE WHEN period = 1109 AND payment >= 250 AND operation = 97 THEN 1 ELSE 0 END ) As x_period_1109
FROM query_hired_fired
WHERE period BETWEEN 0808 AND 1109
GROUP BY opgroup, work_zone, employee_id
) x
GROUP BY work_zone, opgroup
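One note on this condition from your query: BETWEEN 1108 AND 0808
It always evaluates to false, because BETWEEN low AND high means >= low AND <= high, and no value can satisfy that when low is greater than high.
I think it should be: BETWEEN 0808 AND 1108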
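The effect of reversed BETWEEN bounds is easy to check on a few rows. The following is a minimal sketch using Python's built-in sqlite3 module with an in-memory database; the engine choice, the cut-down two-column table, and the sample rows are assumptions made for illustration, not the asker's actual setup.

```python
import sqlite3

# In-memory database with a cut-down, hypothetical version of the employee table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (period INTEGER, employee_id INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [(808, 1), (1001, 2), (1108, 3), (1109, 4)],
)


def count_between(low, high):
    """Count rows whose period satisfies BETWEEN low AND high."""
    row = conn.execute(
        "SELECT COUNT(*) FROM employee WHERE period BETWEEN ? AND ?",
        (low, high),
    ).fetchone()
    return row[0]


reversed_count = count_between(1108, 808)  # lower bound greater than upper bound
ordered_count = count_between(808, 1108)   # bounds in ascending order

print(reversed_count, ordered_count)  # prints: 0 3
```

With the bounds reversed no row can match, while the ascending form matches the three rows from 0808 through 1108.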
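Answer 1 (score: 1)
I have something similar to Kordirko's, but blended into one. The premise of the inner "PreCalc" query is to compute one row per employee, with a flag of 1 or 0 depending on whether each condition is met. Since all of your conditions are based either on exactly 1108 or on the range between 0808 and 1108, that subquery only fetches records between 0808 and 1108, which keeps the CASE/WHEN conditions readable. The only flag where I apply a period condition is your first one, which specifically looks at the exact prior period. The remaining flags only qualify the payment amount and whether the operation is (or is not) 97. So for any employee, each flag gets set to 1 or 0 accordingly.
Now that is rolled up into the outer query, which does a SUM/CASE. To account for your "NOT IN", I look for flag = 0 (the employee did NOT qualify in the underlying data) versus flag = 1 (the employee DID qualify in the underlying data).
Since the pre-query also computes "opgroup", it is all wrapped up in one pass.
I would make sure your table has an index on (period, employee_id, work_zone) to help optimize this. You could add more columns to the index key to make it a covering index, but see how this works first.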
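One way to "see how this works" is to create the index and ask the query planner whether it picks it up. The sketch below does this with Python's built-in sqlite3 module; SQLite, the index name idx_employee_period_emp_zone, and the probe query are assumptions for illustration only, and other engines have their own plan-inspection commands and may choose differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "period INTEGER, employee_id INTEGER, operation INTEGER, "
    "sub_op INTEGER, payment REAL, work_zone TEXT)"
)

# Hypothetical index name; the column order follows the suggestion above.
index_name = "idx_employee_period_emp_zone"
conn.execute(
    f"CREATE INDEX {index_name} ON employee (period, employee_id, work_zone)"
)

# Ask SQLite how it would run a range query on period that only reads
# columns present in the index.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT employee_id, work_zone FROM employee "
    "WHERE period BETWEEN 808 AND 1108"
).fetchall()

# The last column of each plan row is a human-readable description.
plan_details = [row[3] for row in plan]
index_used = any(index_name in detail for detail in plan_details)

print(plan_details)
print(index_used)
```

Because the probe query touches only indexed columns, the index already covers it. A query that also reads payment, operation, and sub_op would need those columns in the key to be fully covered.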
SELECT
201109 AS periodo,
work_zone,
opgroup,
SUM( CASE WHEN PreCalc.LPOver250 = 0 THEN 1 ELSE 0 END ) as EmpsNotInLastPeriodOver250,
SUM( CASE WHEN PreCalc.Over250Op97 = 0 THEN 1 ELSE 0 END ) as EmpsNotInOver250Per97,
SUM( CASE WHEN PreCalc.Over250Op97 = 1 THEN 1 ELSE 0 END ) as EmpsInOver250Per97,
SUM( CASE WHEN PreCalc.Over250NotOp97 = 1 THEN 1 ELSE 0 END ) as EmpsOver250NotInOp97,
SUM( CASE WHEN PreCalc.Under250 = 1 THEN 1 ELSE 0 END ) as EmpsUnder250
from
( SELECT
Employee_ID,
work_zone,
CASE WHEN operation = 97 THEN
CASE WHEN sub_op IN (1,3,5) THEN 'Cookers'
WHEN sub_op IN (2,6) THEN 'Waitress'
WHEN sub_op IN (4,7,8,9,10) THEN 'Cashier'
WHEN sub_op = 11 THEN 'Security'
WHEN sub_op IN (12,13) THEN 'Cleaners'
ELSE 'Others'
END
END AS opgroup,
MAX( case when period = 1108
and payment >= 250
and operation = 97 then 1 else 0 end ) as LPOver250,
MAX( case when payment >= 250
and operation = 97 then 1 else 0 end ) as Over250Op97,
MAX( case when payment >= 250
and operation != 97 then 1 else 0 end ) as Over250NotOp97,
MAX( case when payment < 250
and operation = 97 then 1 else 0 end ) as Under250
from
employee
where
period between 0808 and 1108
group by
Employee_ID,
work_zone,
opgroup ) PreCalc
group by
work_zone,
opgroup
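To sanity-check the flag-then-sum idea on a handful of rows, here is a minimal sketch in Python with an in-memory SQLite database. It is not the exact query from either answer. It is a simplified variant that groups by employee_id only (no work_zone or opgroup), includes the current period 1109 in the inner query, and uses made-up sample rows and flag names (cur, last_period, any_prior, prior_under); all of these are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "period INTEGER, employee_id INTEGER, operation INTEGER, "
    "payment REAL, work_zone TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?, ?)",
    [
        (1108, 1, 97, 300, "A"), (1109, 1, 97, 300, "A"),  # also present last period
        (1109, 2, 97, 400, "A"),                           # no earlier rows at all
        (1005, 3, 97, 260, "A"), (1109, 3, 97, 260, "A"),  # seen earlier, absent in 1108
        (1108, 4, 97, 100, "A"), (1109, 4, 97, 500, "A"),  # under 250 before, over now
    ],
)

# Inner query: one row per employee with 0/1 flags (single pass over the table).
# Outer query: count employees that qualify in 1109, split by those flags.
query = """
SELECT
    SUM(CASE WHEN cur = 1 AND last_period = 0 THEN 1 ELSE 0 END) AS total,
    SUM(CASE WHEN cur = 1 AND any_prior   = 0 THEN 1 ELSE 0 END) AS absolut,
    SUM(CASE WHEN cur = 1 AND any_prior   = 1 THEN 1 ELSE 0 END) AS reincorporated,
    SUM(CASE WHEN cur = 1 AND prior_under = 1 THEN 1 ELSE 0 END) AS raised
FROM (
    SELECT employee_id,
        MAX(CASE WHEN period = 1109 AND payment >= 250 AND operation = 97
                 THEN 1 ELSE 0 END) AS cur,
        MAX(CASE WHEN period = 1108 AND payment >= 250 AND operation = 97
                 THEN 1 ELSE 0 END) AS last_period,
        MAX(CASE WHEN period BETWEEN 808 AND 1108 AND payment >= 250 AND operation = 97
                 THEN 1 ELSE 0 END) AS any_prior,
        MAX(CASE WHEN period BETWEEN 808 AND 1108 AND payment < 250 AND operation = 97
                 THEN 1 ELSE 0 END) AS prior_under
    FROM employee
    WHERE period BETWEEN 808 AND 1109
    GROUP BY employee_id
) flags
"""
result = conn.execute(query).fetchone()
print(result)  # prints: (3, 2, 2, 1)
```

On this sample, employees 2, 3 and 4 count toward total (not qualifying in 1108), employees 2 and 4 toward absolut, employees 1 and 3 toward reincorporated, and employee 4 toward raised. Extending the inner GROUP BY with work_zone and opgroup would bring it closer to the queries above, at the cost of having to decide how rows with a different operation or work zone should be matched to the current-period row.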