I have the following SQL query.
SELECT em.employeeid, tsi.timestamp
FROM timesheet_temp_import tsi
JOIN employee emp ON emp.employeeid = tsi.credentialnumber
WHERE
tsi.masterentity = 'MASTER' AND
tsi.timestamp NOT IN
(
SELECT ea.timestamp
FROM employee_attendance ea
WHERE
ea.employeeid = em.employeeid
AND ea.timestamp = tsi.timestamp
AND ea.ismanual = 0
)
GROUP BY em.employeeid, tsi.timestamp
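This query checks an import table (which holds employees and their attendance timestamps) against the attendance records.
Sometimes timesheet_temp_import has more than 95,000 rows, and my query must show only the timestamps that are new for each employee. If a timestamp already exists for the employee, I have to exclude it.
The query works, but it takes more than 4 minutes, so I would like to know whether I can improve the NOT IN clause to bring that time down.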
Answer 0 (score: 6)
Using NOT EXISTS may help you.
SELECT
emp.employeeid,
tsi.timestamp
FROM timesheet_temp_import tsi
JOIN employee emp ON emp.employeeid = tsi.credentialnumber
WHERE
tsi.masterentity = 'MASTER' AND
NOT EXISTS
(
-- The select list is irrelevant to EXISTS; only the presence of a row matters.
SELECT NULL
FROM employee_attendance ea
WHERE
ea.employeeid = emp.employeeid
AND ea.timestamp = tsi.timestamp
AND ea.ismanual = 0
)
GROUP BY
emp.employeeid,
tsi.timestamp
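As a quick sanity check, the sketch below runs the original NOT IN form and the NOT EXISTS form against an in-memory SQLite database and confirms that they return the same rows. The rows are made-up sample data, the column types are guesses, and SQLite only stands in for whatever engine the question uses, so this illustrates the logic and says nothing about timing. The alias is written as emp throughout.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (employeeid INTEGER PRIMARY KEY);
CREATE TABLE timesheet_temp_import (credentialnumber INTEGER, masterentity TEXT, timestamp TEXT);
CREATE TABLE employee_attendance (employeeid INTEGER, timestamp TEXT, ismanual INTEGER);
INSERT INTO employee VALUES (1), (2);
INSERT INTO timesheet_temp_import VALUES
  (1, 'MASTER', '2024-01-01 08:00'),
  (1, 'MASTER', '2024-01-01 17:00'),
  (2, 'MASTER', '2024-01-01 08:00'),
  (2, 'OTHER',  '2024-01-01 09:00');
INSERT INTO employee_attendance VALUES
  (1, '2024-01-01 08:00', 0),   -- already recorded, so it must be excluded
  (2, '2024-01-01 08:00', 1);   -- manual entry, ignored by the ismanual = 0 filter
""")

NOT_IN = """
SELECT emp.employeeid, tsi.timestamp
FROM timesheet_temp_import tsi
JOIN employee emp ON emp.employeeid = tsi.credentialnumber
WHERE tsi.masterentity = 'MASTER'
  AND tsi.timestamp NOT IN (
      SELECT ea.timestamp
      FROM employee_attendance ea
      WHERE ea.employeeid = emp.employeeid
        AND ea.timestamp = tsi.timestamp
        AND ea.ismanual = 0)
GROUP BY emp.employeeid, tsi.timestamp
ORDER BY emp.employeeid, tsi.timestamp
"""

NOT_EXISTS = """
SELECT emp.employeeid, tsi.timestamp
FROM timesheet_temp_import tsi
JOIN employee emp ON emp.employeeid = tsi.credentialnumber
WHERE tsi.masterentity = 'MASTER'
  AND NOT EXISTS (
      SELECT NULL
      FROM employee_attendance ea
      WHERE ea.employeeid = emp.employeeid
        AND ea.timestamp = tsi.timestamp
        AND ea.ismanual = 0)
GROUP BY emp.employeeid, tsi.timestamp
ORDER BY emp.employeeid, tsi.timestamp
"""

rows_not_in = conn.execute(NOT_IN).fetchall()
rows_not_exists = conn.execute(NOT_EXISTS).fetchall()
print(rows_not_exists)
```

Because the subquery is correlated on ea.timestamp = tsi.timestamp, it can never return a NULL timestamp, which is why the two forms agree here. With an uncorrelated NOT IN over a nullable column they would not.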
Answer 1 (score: 3)
You have this query:
SELECT em.employeeid, tsi.timestamp
FROM timesheet_temp_import tsi JOIN
employee emp
ON emp.employeeid = tsi.credentialnumber
WHERE tsi.masterentity = 'MASTER' AND
tsi.timestamp NOT IN (SELECT ea.timestamp
FROM employee_attendance ea
WHERE ea.employeeid = em.employeeid AND
ea.timestamp = tsi.timestamp AND
ea.ismanual = 0
)
GROUP BY em.employeeid, tsi.timestamp;
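Before rewriting the query (as opposed to just reformatting it), I would check the indexes and the logic. Is the GROUP BY necessary? That is, does the outer query produce duplicates? My guess is no, but I don't know your data.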
Second, you want indexes. I would suggest the following: timesheet_temp_import(masterentity, credentialnumber, timestamp), employee(employeeid), and employee_attendance(employeeid, timestamp, ismanual).
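The index suggestions can be sketched as plain CREATE INDEX statements. The example below creates them in an in-memory SQLite database and prints the query plan for the anti-join; the index names and column types are hypothetical, and the exact syntax and plan output on the question's actual engine may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Column types are assumptions; only the column names come from the question.
CREATE TABLE employee (employeeid INTEGER);
CREATE TABLE timesheet_temp_import (credentialnumber INTEGER, masterentity TEXT, timestamp TEXT);
CREATE TABLE employee_attendance (employeeid INTEGER, timestamp TEXT, ismanual INTEGER);

-- Index names are hypothetical; the column lists follow the suggestion above.
CREATE INDEX ix_tsi_entity_cred_ts ON timesheet_temp_import (masterentity, credentialnumber, timestamp);
CREATE INDEX ix_emp_id ON employee (employeeid);
CREATE INDEX ix_ea_emp_ts_manual ON employee_attendance (employeeid, timestamp, ismanual);
""")

# List the indexes that now exist in the database.
index_names = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
)

# Ask SQLite how it would run the anti-join; plan text varies by engine and version.
plan = conn.execute("""
EXPLAIN QUERY PLAN
SELECT tsi.credentialnumber, tsi.timestamp
FROM timesheet_temp_import tsi
WHERE tsi.masterentity = 'MASTER'
  AND NOT EXISTS (
      SELECT 1
      FROM employee_attendance ea
      WHERE ea.employeeid = tsi.credentialnumber
        AND ea.timestamp = tsi.timestamp
        AND ea.ismanual = 0)
""").fetchall()

for row in plan:
    print(row[-1])  # the last column holds the human-readable plan step
```

The attendance index lists the equality columns used by the subquery, so each probe can be answered from the index alone. If employeeid is already the primary key of employee, the separate index on it is usually redundant.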
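Third, I would ask whether you have timesheets for non-employees. If not, I think you can get rid of the join in the outer query. So, this might be the query you want: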
SELECT tsi.credentialnumber as employeeid, tsi.timestamp
FROM timesheet_temp_import tsi
WHERE tsi.masterentity = 'MASTER' AND
tsi.timestamp NOT IN (SELECT ea.timestamp
FROM employee_attendance ea
WHERE ea.employeeid = tsi.credentialnumber AND
ea.timestamp = tsi.timestamp AND
ea.ismanual = 0
);
Replacing NOT IN with NOT EXISTS might also give you a marginal improvement.
Answer 2 (score: 2)
Another approach is to use except:
select whatever
from wherever
where somefield in
(select all potential values of that field
except
select the values you want to exclude)
For non-NULL values this returns the same rows as not in, and it can be faster.
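To make the template concrete, here is one way it might look for the tables in the question, run against an in-memory SQLite database. This is a variation on the template rather than a literal instance of it: EXCEPT is applied directly to (employee, timestamp) pairs instead of inside an IN list, and the join to employee is dropped on the assumption that every credentialnumber belongs to an employee. The rows are made-up sample data.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE timesheet_temp_import (credentialnumber INTEGER, masterentity TEXT, timestamp TEXT);
CREATE TABLE employee_attendance (employeeid INTEGER, timestamp TEXT, ismanual INTEGER);
INSERT INTO timesheet_temp_import VALUES
  (1, 'MASTER', '2024-01-01 08:00'),
  (1, 'MASTER', '2024-01-01 17:00'),
  (2, 'MASTER', '2024-01-01 08:00');
INSERT INTO employee_attendance VALUES
  (1, '2024-01-01 08:00', 0);   -- already recorded, so EXCEPT removes it
""")

EXCEPT_QUERY = """
SELECT tsi.credentialnumber AS employeeid, tsi.timestamp
FROM timesheet_temp_import tsi
WHERE tsi.masterentity = 'MASTER'
EXCEPT
SELECT ea.employeeid, ea.timestamp
FROM employee_attendance ea
WHERE ea.ismanual = 0
ORDER BY 1, 2
"""

new_rows = conn.execute(EXCEPT_QUERY).fetchall()
print(new_rows)
```

EXCEPT is a set operation, so it also removes duplicate rows from the result, which makes the GROUP BY of the original query unnecessary in this form.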
Answer 3 (score: 2)
Try this. I think you meant emp (the question writes em, but the alias is emp).
SELECT distinct tsi.credentialnumber, tsi.timestamp
FROM timesheet_temp_import tsi
JOIN employee emp
ON emp.employeeid = tsi.credentialnumber
and tsi.masterentity = 'MASTER'
left join employee_attendance ea
on ea.employeeid = emp.employeeid
AND ea.timestamp = tsi.timestamp
AND ea.ismanual = 0
where ea.employeeid is null
Depending on the indexes, this might be faster:
SELECT distinct tsi.credentialnumber, tsi.timestamp
FROM timesheet_temp_import tsi
JOIN employee emp
ON emp.employeeid = tsi.credentialnumber
and tsi.masterentity = 'MASTER'
left join employee_attendance ea
on ea.employeeid = tsi.credentialnumber
AND ea.timestamp = tsi.timestamp
AND ea.ismanual = 0
where ea.employeeid is null
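The role of DISTINCT in the LEFT JOIN form is easy to miss, so here is a small sketch in an in-memory SQLite database. It assumes, hypothetically, that the import table can contain the same employee and timestamp more than once; the rows are made-up sample data.

```python
import sqlite3

# Hypothetical sample data with one duplicated import row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (employeeid INTEGER PRIMARY KEY);
CREATE TABLE timesheet_temp_import (credentialnumber INTEGER, masterentity TEXT, timestamp TEXT);
CREATE TABLE employee_attendance (employeeid INTEGER, timestamp TEXT, ismanual INTEGER);
INSERT INTO employee VALUES (1);
INSERT INTO timesheet_temp_import VALUES
  (1, 'MASTER', '2024-01-01 08:00'),
  (1, 'MASTER', '2024-01-01 17:00'),
  (1, 'MASTER', '2024-01-01 17:00');   -- same row imported twice
INSERT INTO employee_attendance VALUES
  (1, '2024-01-01 08:00', 0);
""")

# Anti-join: keep import rows for which the LEFT JOIN found no attendance row.
ANTI_JOIN = """
SELECT {distinct} tsi.credentialnumber, tsi.timestamp
FROM timesheet_temp_import tsi
JOIN employee emp
  ON emp.employeeid = tsi.credentialnumber
 AND tsi.masterentity = 'MASTER'
LEFT JOIN employee_attendance ea
  ON ea.employeeid = tsi.credentialnumber
 AND ea.timestamp = tsi.timestamp
 AND ea.ismanual = 0
WHERE ea.employeeid IS NULL
ORDER BY tsi.credentialnumber, tsi.timestamp
"""

distinct_rows = conn.execute(ANTI_JOIN.format(distinct="DISTINCT")).fetchall()
all_rows = conn.execute(ANTI_JOIN.format(distinct="")).fetchall()
print(distinct_rows)
print(len(all_rows))
```

Without DISTINCT the duplicated import row comes back twice. If the import table is known to be free of duplicates, DISTINCT (like the GROUP BY in the original query) can probably be dropped.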
Answer 4 (score: 1)
Use a LEFT JOIN with a WHERE clause instead of NOT IN to filter:
SELECT
emp.employeeid,
tsi.timestamp
FROM timesheet_temp_import tsi
JOIN employee emp ON emp.employeeid = tsi.credentialnumber
-- Join the attendance table directly: a derived table cannot reference
-- emp or tsi from the outer query.
LEFT JOIN employee_attendance ea
ON ea.employeeid = emp.employeeid
AND ea.timestamp = tsi.timestamp
AND ea.ismanual = 0
WHERE
tsi.masterentity = 'MASTER' AND
ea.timestamp IS NULL
GROUP BY
emp.employeeid,
tsi.timestamp