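Using MySQL, how can I best query a table of ~100,000 rows to find the missing values in each model's sequence of serial numbers? For example, find the missing serial numbers for the clock radio (model #123 below) and the calculator (model #345 below).
The data looks like this: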
MODEL#,SERIAL#
123,1
123,2
123,4
123,5
345,101
345,104
345,105
345,106
The desired output would be:
MODEL#,SERIAL#
123,3
345,102
345,103
Note that each model's serial numbers start at a different value.
Thanks!
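Answer 0: (score: 0)
Finding each individual missing value is hard. Finding the gaps is much easier. So this returns, for each gap, the first missing serial number and how many numbers are missing: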
-- One row per gap: the model, the first missing serial, and the gap length
select t.model,
       t.serial + 1 as FirstMissing,
       (t.next_serial - t.serial) - 1 as numMissing
from (select t.*,
             -- the next existing serial for the same model
             (select t2.serial
              from data t2
              where t2.model = t.model and t2.serial > t.serial
              order by t2.serial asc
              limit 1
             ) as next_serial
      from data t
     ) t
where next_serial <> serial + 1;
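The gap query reports only the start and length of each gap, while the question asks for every missing serial number. A minimal sketch of one way to close that distance is to expand each gap in application code. The version below is an illustration under assumptions: it uses Python with an in-memory SQLite database standing in for MySQL, it adds the model column to the gap query so each gap can be attributed to a model, and the helper name `missing_serials` is hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (model INTEGER, serial INTEGER)")
conn.executemany(
    "INSERT INTO data (model, serial) VALUES (?, ?)",
    [(123, 1), (123, 2), (123, 4), (123, 5),
     (345, 101), (345, 104), (345, 105), (345, 106)],
)

# The gap query from the answer, with the model column and an ORDER BY added.
GAP_QUERY = """
select t.model,
       t.serial + 1 as FirstMissing,
       (t.next_serial - t.serial) - 1 as numMissing
from (select t.*,
             (select t2.serial
              from data t2
              where t2.model = t.model and t2.serial > t.serial
              order by t2.serial asc
              limit 1
             ) as next_serial
      from data t
     ) t
where next_serial <> serial + 1
order by t.model, t.serial
"""


def missing_serials(connection):
    """Expand each (model, first_missing, num_missing) gap into single rows."""
    missing = []
    for model, first_missing, num_missing in connection.execute(GAP_QUERY):
        for serial in range(first_missing, first_missing + num_missing):
            missing.append((model, serial))
    return missing


print(missing_serials(conn))  # [(123, 3), (345, 102), (345, 103)]
```

Expanding in the client keeps the SQL simple, and the number of rows transferred is proportional to the number of gaps rather than to the number of missing serials.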
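Answer 1: (score: 0)
To determine that something does not exist in a table, you can LEFT JOIN to that table and test the result for IS NULL. For example, given any (model, serial) pair, you can construct the (model, serial) pair expected to follow it: the same model, with the serial number incremented by 1. Then LEFT JOIN the table to itself on the condition that the (model, serial) pair equals the expected pair. For any missing serial number, every column of the self-join alias IS NULL, so add a WHERE condition that keeps only those rows.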
The following should get you started. It identifies the start of each gap, but not all of the numbers within the gap.
-- List all rows in table serials
-- 1. which aren't the highest for a given SKU, and
-- 2. for which the following serial number doesn't exist for the same SKU
SELECT ser1.sku, ser1.serial_num + 1 AS serial_num
FROM serials AS ser1
INNER JOIN (
    SELECT sku, MAX(serial_num) AS serial_num
    FROM serials
    GROUP BY sku
) AS sermax
    ON (ser1.sku = sermax.sku
        AND ser1.serial_num < sermax.serial_num)
LEFT JOIN serials AS ser2
    ON (ser1.sku = ser2.sku
        AND ser1.serial_num + 1 = ser2.serial_num)
WHERE ser2.serial_num IS NULL;
Output:
sku|serial_num
123| 3
345| 102
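The output above lists only the start of each gap, so 345/103 is absent. One possible way to list every missing number with the same LEFT JOIN ... IS NULL idea is to generate the full expected range per SKU and join the real table against it. The sketch below is not part of the original answer: it assumes recursive common table expressions are available (MySQL 8.0 or later; here it is run against in-memory SQLite through Python as a stand-in), and the CTE name `expected` and the helper `find_missing` are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL 8.0+ here (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE serials (sku INTEGER, serial_num INTEGER)")
conn.executemany(
    "INSERT INTO serials (sku, serial_num) VALUES (?, ?)",
    [(123, 1), (123, 2), (123, 4), (123, 5),
     (345, 101), (345, 104), (345, 105), (345, 106)],
)

# Generate every serial from MIN to MAX per SKU, then keep the ones
# that have no matching row in the real table.
ALL_MISSING = """
WITH RECURSIVE expected (sku, serial_num, max_serial) AS (
    SELECT sku, MIN(serial_num), MAX(serial_num)
    FROM serials
    GROUP BY sku
    UNION ALL
    SELECT sku, serial_num + 1, max_serial
    FROM expected
    WHERE serial_num < max_serial
)
SELECT e.sku, e.serial_num
FROM expected AS e
LEFT JOIN serials AS s
       ON s.sku = e.sku AND s.serial_num = e.serial_num
WHERE s.serial_num IS NULL
ORDER BY e.sku, e.serial_num
"""


def find_missing(connection):
    """Return every (sku, serial_num) absent between each SKU's min and max."""
    return connection.execute(ALL_MISSING).fetchall()


print(find_missing(conn))  # [(123, 3), (345, 102), (345, 103)]
```

On MySQL the recursion depth is capped by the `cte_max_recursion_depth` system variable (1000 by default), so a SKU whose serial numbers span a wider range would need that limit raised for the session. An index on (sku, serial_num) is likely to matter for the join on a table of ~100,000 rows.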
The code in this answer is dual-licensed: CC BY-SA 3.0 or the MIT License as published by OSI.