Question

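We have a database with a table whose values were imported from another system. There is an auto-increment column with no duplicate values, but some values are missing. For example, running this query: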

select count(id) from arrc_vouchers where id between 1 and 100

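should return 100, but it returns 87 instead. Is there a query I can run that will return the missing numbers? For example, records may exist for ids 1-70 and 83-100, but there are no records with ids 71-82. I want to return 71, 72, 73, etc.

Is this possible?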

Answer 1

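Update

ConfexianMJS provided a much better answer in terms of performance.

The (not as fast as possible) answer

Here is a version that works on a table of any size (not just 100 rows):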

SELECT (t1.id + 1) as gap_starts_at, 
       (SELECT MIN(t3.id) -1 FROM arrc_vouchers t3 WHERE t3.id > t1.id) as gap_ends_at
FROM arrc_vouchers t1
WHERE NOT EXISTS (SELECT t2.id FROM arrc_vouchers t2 WHERE t2.id = t1.id + 1)
HAVING gap_ends_at IS NOT NULL

gap_starts_at - first id in the current gap
gap_ends_at - last id in the current gap
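
To try the query above without a MySQL server, here is a minimal sketch that runs the same logic through Python's built-in sqlite3 module. It is not part of the original answer: the function name find_gaps and the in-memory table are assumptions for illustration, and the HAVING filter is rewritten as an outer WHERE because selecting on a column alias in HAVING without GROUP BY is a MySQL extension.

```python
import sqlite3

def find_gaps(ids):
    """Return (gap_starts_at, gap_ends_at) pairs for the given ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE arrc_vouchers (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO arrc_vouchers (id) VALUES (?)",
                     [(i,) for i in ids])
    rows = conn.execute("""
        SELECT gap_starts_at, gap_ends_at FROM (
            SELECT t1.id + 1 AS gap_starts_at,
                   (SELECT MIN(t3.id) - 1 FROM arrc_vouchers t3
                     WHERE t3.id > t1.id) AS gap_ends_at
            FROM arrc_vouchers t1
            -- keep only ids whose direct successor is absent
            WHERE NOT EXISTS (SELECT 1 FROM arrc_vouchers t2
                               WHERE t2.id = t1.id + 1)
        )
        -- the highest id has no successor at all; drop that open-ended row
        WHERE gap_ends_at IS NOT NULL
        ORDER BY gap_starts_at
    """).fetchall()
    conn.close()
    return rows

# ids 1-70 and 83-100 exist, as in the question's example
sample = list(range(1, 71)) + list(range(83, 101))
gaps = find_gaps(sample)
# expand each gap into the individual missing ids
missing = [n for start, end in gaps for n in range(start, end + 1)]
```

Note that a gap before the lowest existing id is not reported, since every gap is discovered from the row just below it.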

Answer 2

This just worked for me to find the gaps in a table with more than 80k rows:

SELECT
 CONCAT(z.expected, IF(z.got-1>z.expected, CONCAT(' thru ',z.got-1), '')) AS missing
FROM (
 SELECT
  @rownum:=@rownum+1 AS expected,
  IF(@rownum=YourCol, 0, @rownum:=YourCol) AS got
 FROM
  (SELECT @rownum:=0) AS a
  JOIN YourTable
  ORDER BY YourCol
 ) AS z
WHERE z.got!=0;

Result:

+------------------+
| missing          |
+------------------+
| 1 thru 99        |
| 666 thru 667     |
| 50000            |
| 66419 thru 66456 |
+------------------+
4 rows in set (0.06 sec)

Note that the order of the columns expected and got is critical.

If you know that YourCol doesn't start at 1 and that doesn't matter, you can replace

(SELECT @rownum:=0) AS a

with

(SELECT @rownum:=(SELECT MIN(YourCol)-1 FROM YourTable)) AS a

New result:

+------------------+
| missing          |
+------------------+
| 666 thru 667     |
| 50000            |
| 66419 thru 66456 |
+------------------+
3 rows in set (0.06 sec)
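
The counter logic behind the user-variable query can be hard to follow in SQL. The following is a plain-Python sketch of the same idea, not the answer author's code: the function name missing_ranges and the start parameter are assumptions, with start playing the role of the initial @rownum value plus one.

```python
def missing_ranges(sorted_ids, start=1):
    """Walk ids in ascending order and report each run of skipped numbers."""
    expected = start              # mirrors @rownum + 1 in the query
    report = []
    for value in sorted_ids:
        if value > expected:      # the counter fell behind: a gap was skipped
            last = value - 1
            if last == expected:
                report.append(str(expected))
            else:
                report.append(f"{expected} thru {last}")
        expected = value + 1      # resynchronise, like @rownum := YourCol
    return report

# hypothetical data shaped like the result shown above
ids = (list(range(100, 666)) + list(range(668, 50000))
       + list(range(50001, 66419)) + list(range(66457, 66501)))
from_one = missing_ranges(ids)
from_min = missing_ranges(ids, start=min(ids))
```

Passing start=min(ids) corresponds to the MIN(YourCol)-1 replacement and drops the leading range before the first id.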

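If you need to perform some kind of shell script task on the missing IDs, you can also use this variant to directly produce an expression you can iterate over in bash.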

SELECT GROUP_CONCAT(IF(z.got-1>z.expected, CONCAT('$(seq ',z.expected,' ',z.got-1,')'), z.expected) SEPARATOR " ") AS missing
FROM (  SELECT   @rownum:=@rownum+1 AS expected,   IF(@rownum=height, 0, @rownum:=height) AS got  FROM   (SELECT @rownum:=0) AS a   JOIN block   ORDER BY height  ) AS z WHERE z.got!=0;

This produces output like

$(seq 1 99) $(seq 666 667) 50000 $(seq 66419 66456)

You can then copy and paste it into a for loop in a bash terminal to execute a command for every ID

for ID in $(seq 1 99) $(seq 666 667) 50000 $(seq 66419 66456); do
  echo $ID
  # fill the gaps
done

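It is the same thing as above, only that it is both readable and executable. By changing the "CONCAT" command above, syntax can be generated for other programming languages. Or maybe even SQL.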

Answer 3

A quick and dirty query that should do the trick:

SELECT a AS id, b AS next_id, (b - a) -1 AS missing_inbetween
FROM 
 (
SELECT a1.id AS a , MIN(a2.id) AS b 
FROM arrc_vouchers  AS a1
LEFT JOIN arrc_vouchers AS a2 ON a2.id > a1.id
WHERE a1.id <= 100
GROUP BY a1.id
) AS tab

WHERE 
b > a + 1

This will give you a table showing each id that has ids missing above it, the next_id that exists, and how many ids are missing in between, e.g.:

 
id  next_id  missing_inbetween
 1        4                  2
68       70                  1
75       87                 11
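
As a rough cross-check of what those three columns mean, here is a small Python sketch (not from the original answer) that pairs each id with the next existing id. The function name gaps_with_counts and the sample data are assumptions chosen to reproduce the table above.

```python
def gaps_with_counts(ids):
    """Return (id, next_id, missing_inbetween) for every id followed by a gap."""
    ordered = sorted(ids)
    # pair each id with its successor; keep pairs that are not adjacent
    return [(a, b, b - a - 1)
            for a, b in zip(ordered, ordered[1:])
            if b > a + 1]

# hypothetical ids: 1, 4-68, 70-75 and 87-100
sample = [1] + list(range(4, 69)) + list(range(70, 76)) + list(range(87, 101))
rows = gaps_with_counts(sample)
```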

Answer 4

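Create a temporary table with 100 rows and a single column containing the values 1-100.

Outer join this table to your arrc_vouchers table and select the single column values where the arrc_vouchers id is null.

Coding this blind, but it should work.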

select tempid from temptable 
left join arrc_vouchers on temptable.tempid = arrc_vouchers.id 
where arrc_vouchers.id is null
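
The answer does not show how the temporary table gets filled. A minimal sketch of the whole procedure, using Python's sqlite3 module as a stand-in for MySQL, might look like the following; the function name missing_ids and its low/high parameters are assumptions for illustration.

```python
import sqlite3

def missing_ids(ids, low=1, high=100):
    """List every number in [low, high] that has no row in arrc_vouchers."""
    conn = sqlite3.connect(":memory:")
    # helper table holding the full expected range, one number per row
    conn.execute("CREATE TEMP TABLE temptable (tempid INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO temptable (tempid) VALUES (?)",
                     [(n,) for n in range(low, high + 1)])
    conn.execute("CREATE TABLE arrc_vouchers (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO arrc_vouchers (id) VALUES (?)",
                     [(i,) for i in ids])
    rows = conn.execute("""
        SELECT tempid FROM temptable
        LEFT JOIN arrc_vouchers ON temptable.tempid = arrc_vouchers.id
        WHERE arrc_vouchers.id IS NULL
        ORDER BY tempid
    """).fetchall()
    conn.close()
    return [row[0] for row in rows]

# ids 1-70 and 83-100 exist, as in the question's example
result = missing_ids(list(range(1, 71)) + list(range(83, 101)))
```

Unlike the gap-range queries, this returns every missing id individually, which is what the question asked for, at the cost of materialising the full range.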

Answer 5

An alternative solution that requires a query plus some code doing some processing would be:

select l.id lValue, c.id cValue, r.id rValue 
  from 
  arrc_vouchers l 
  right join arrc_vouchers c on l.id=IF(c.id > 0, c.id-1, null)
  left  join arrc_vouchers r on r.id=c.id+1
where 1=1
  and c.id > 0 
  and (l.id is null or r.id is null)
order by c.id asc;

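Note that the query does not contain any subselect, which we know MySQL's planner does not handle efficiently.

It returns one entry per central value (cValue) that lacks a smaller neighbour (lValue) or a greater neighbour (rValue), i.e.: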

lValue |cValue|rValue
-------+------+-------
{null} | 2    | 3      
8      | 9    | {null} 
{null} | 22   | 23     
23     | 24   | {null} 
{null} | 29   | {null} 
{null} | 33   | {null}

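Without going into further details (we'll see them in the next paragraphs), this output means that:

No values between 0 and 2
No values between 9 and 22
No values between 24 and 29
No values between 29 and 33
No values between 33 and MAX VALUE

So the basic idea is to do a RIGHT and a LEFT join against the same table to see whether each value has adjacent values (i.e. if the central value is '3', we check for 3-1=2 on the left and 3+1 on the right). When a row has a NULL on the RIGHT or the LEFT, we know there is no adjacent value on that side.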

The complete raw output of my table is:

select * from arrc_vouchers order by id asc;

0  
2  
3  
4  
5  
6  
7  
8  
9  
22 
23 
24 
29 
33

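Some notes:

The SQL IF statement in the join condition is needed if you define the 'id' field as UNSIGNED, because then it will not let you decrease it below zero. This is not strictly necessary if you keep the c.id > 0 condition, as stated in the next note, but I'm including it as documentation.
I'm filtering out the zero central value because we are not interested in any previous value, and we can derive the post value from the next row.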
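
The answer mentions that some code is needed to process the query result but does not show it. One possible sketch of that post-processing step in Python follows; it is an assumption about what the author had in mind, not their code. The function name gaps_from_neighbors is hypothetical, rows are assumed to arrive ordered by cValue, and floor stands for the lowest existing id (0 in the sample table).

```python
def gaps_from_neighbors(rows, floor=0):
    """Turn (lValue, cValue, rValue) rows into (lower, upper) gap bounds.

    Both bounds are existing ids; the missing ids lie strictly between them.
    The final pair has upper=None, meaning "from the highest id onwards".
    """
    gaps = []
    lower = floor                  # last id known to start a gap
    for l_value, c_value, r_value in rows:
        if l_value is None:        # nothing directly below: a gap ends here
            gaps.append((lower, c_value))
        if r_value is None:        # nothing directly above: a gap starts here
            lower = c_value
    gaps.append((lower, None))     # open-ended gap after the highest id
    return gaps

# the rows from the sample output above
rows = [(None, 2, 3), (8, 9, None), (None, 22, 23),
        (23, 24, None), (None, 29, None), (None, 33, None)]
bounds = gaps_from_neighbors(rows)
```

The result matches the reading given in the answer: nothing between 0 and 2, 9 and 22, 24 and 29, 29 and 33, and nothing after 33.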

Answer 6

If you are using MariaDB, there is a faster (800%) option using the sequence storage engine:

SELECT * FROM seq_1_to_50000 WHERE SEQ NOT IN (SELECT COL FROM TABLE);

Answer 7

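Based on the answer given above by Lucek, this stored procedure lets you specify the table and column names you want to test for non-contiguous records, thus answering the original question and also demonstrating how @var can be used to represent tables and/or columns in a stored procedure.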

create definer=`root`@`localhost` procedure `spfindnoncontiguous`(in `param_tbl` varchar(64), in `param_col` varchar(64))
language sql
not deterministic
contains sql
sql security definer
comment ''
begin
declare strsql varchar(1000);
declare tbl varchar(64);
declare col varchar(64);

set @tbl=cast(param_tbl as char character set utf8);
set @col=cast(param_col as char character set utf8);

set @strsql=concat("select 
    ( t1.",@col," + 1 ) as starts_at, 
  ( select min(t3.",@col,") -1 from ",@tbl," t3 where t3.",@col," > t1.",@col," ) as ends_at
    from ",@tbl," t1
        where not exists ( select t2.",@col," from ",@tbl," t2 where t2.",@col," = t1.",@col," + 1 )
        having ends_at is not null");

prepare stmt from @strsql;
execute stmt;
deallocate prepare stmt;
end

Answer 8

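Although these all seem to work, the result set takes a very long time to return when there are 50,000 records.

I used this instead. It finds the gap or the next available id (last used + 1), and the query returns much faster.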

SELECT a.id as beforegap, a.id+1 as avail
FROM table_name a
where (select b.id from table_name b where b.id=a.id+1) is null
limit 1;

Answer 9

If the sequence has a gap of at most one between two numbers (like 1,3,5,6), then the query that can be used is:

select s.id+1 from source1 s where s.id+1 not in(select id from source1) and s.id+1<(select max(id) from source1);

table_name - source1
column_name - id

Answer 10

I tried it in different ways, and the best performance I found was with this simple query:

select a.id+1 gapIni
    ,(select x.id-1 from arrc_vouchers x where x.id>a.id+1 limit 1) gapEnd
    from arrc_vouchers a
    left join arrc_vouchers b on b.id=a.id+1
    where b.id is null
    order by 1
;

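... one left join to check whether the next id exists; only if the next id is not found does the subquery look up the next existing id to find the end of the gap. I did it this way because a query using equals (=) performs better than one using the greater than (>) operator.

On sqlfiddle the performance difference from the other queries does not show, but on a real database this query is 3 times faster than the others.

The schema: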

CREATE TABLE arrc_vouchers (id int primary key)
;
INSERT INTO `arrc_vouchers` (`id`) VALUES (1),(4),(5),(7),(8),(9),(10),(11),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25),(26),(27),(28),(29)
;

Below are all the queries I wrote to compare the performance:

select a.id+1 gapIni
    ,(select x.id-1 from arrc_vouchers x where x.id>a.id+1 limit 1) gapEnd
    from arrc_vouchers a
    left join arrc_vouchers b on b.id=a.id+1
    where b.id is null
    order by 1
;
select *, (gapEnd-gapIni) qt
    from (
        select id+1 gapIni
        ,(select x.id from arrc_vouchers x where x.id>a.id limit 1) gapEnd
        from arrc_vouchers a
        order by id
    ) a where gapEnd <> gapIni
;
select id+1 gapIni
    ,(select x.id from arrc_vouchers x where x.id>a.id limit 1) gapEnd
    #,coalesce((select id from arrc_vouchers x where x.id=a.id+1),(select x.id from arrc_vouchers x where x.id>a.id limit 1)) gapEnd
    from arrc_vouchers a
    where id+1 <> (select x.id from arrc_vouchers x where x.id>a.id limit 1)
    order by id
;
select id+1 gapIni
    ,coalesce((select id from arrc_vouchers x where x.id=a.id+1),(select x.id from arrc_vouchers x where x.id>a.id limit 1)) gapEnd
    from arrc_vouchers a
    order by id
;
select id+1 gapIni
    ,coalesce((select id from arrc_vouchers x where x.id=a.id+1),concat('*** GAT *** ',(select x.id from arrc_vouchers x where x.id>a.id limit 1))) gapEnd
    from arrc_vouchers a
    order by id
;

Maybe it helps someone and is useful.

You can see and test my query using this sqlfiddle:

http://sqlfiddle.com/#!9/6bdca7/1

Answer 11

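Probably not relevant, but I was looking for something like this to list the gaps in a sequence of numbers and found this post, which has several different solutions depending on exactly what you are looking for. I was looking for the first available gap in the sequence (i.e. the next available number), and this seems to work fine.

SELECT MIN(l.number_sequence + 1) AS next_available FROM patients AS l LEFT OUTER JOIN patients AS r ON l.number_sequence + 1 = r.number_sequence WHERE r.number_sequence IS NULL

Several other scenarios and solutions are discussed there, from 2005!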

How to Find Missing Values in a Sequence With SQL

Answer 12

A simple yet effective way to find missing auto-increment values

SELECT `id`+1 
FROM `table_name` 
WHERE `id`+1 NOT IN (SELECT id FROM table_name)
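
It may help to see exactly which values this kind of NOT IN query reports. The Python sketch below mirrors its logic with a set lookup; the function name first_missing_after_each_run is an assumption for illustration. It shows that only the first missing id of each gap is returned, together with one value past the highest id.

```python
def first_missing_after_each_run(ids):
    """Mirror of: SELECT id+1 FROM t WHERE id+1 NOT IN (SELECT id FROM t)."""
    present = set(ids)
    # an id qualifies when its direct successor is absent
    return sorted(i + 1 for i in present if i + 1 not in present)

# gaps are 3-4 and 7-8; 10 is simply max(id) + 1
reported = first_missing_after_each_run([1, 2, 5, 6, 9])
```

To get every missing id from this starting point, each reported value would still need to be extended up to the next existing id.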

How to find gaps in sequential numbering in MySQL?
