How can I find out which rows were deleted?

Asked: 2011-03-30 06:09:15

Tags: sql informix auto-increment

Using Informix-SQL 7.32 with the SE engine:

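I have a customer who deleted multiple rows from an SE table. (I am not using transaction logging or audit trails.) The table has a SERIAL column. I would like to create an ACE report that prints the missing serial numbers. I tried the following quick-and-dirty report, but it didn't work! Can you suggest a better way?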

define
variable next_id integer
end

  select tbl_id 
    from tbl
order by tbl_id {I'm ordering tbl_id because all the rows are periodically re-clustered}
     end        {by an fk_id in order to group all rows belonging to the same customer}

format
on every row
let next_id = tbl_id + 1  

after group of tbl_id
if tbl_id + 1 <> next_id then
print column 1, tbl_id + 1 using "######"

end

Or create a temporary table with an INT column containing the consecutive numbers 1 to 5000, and run the following SELECT statement:

   SELECT tbl_id 
     FROM tbl
    WHERE tbl_id NOT IN
                 (SELECT tmp_int
                    FROM tmp);

Or a SELECT statement using HAVING, OUTER, etc.

2 Answers:

Answer 0: (score: 1)

Since this is SE, we have to use the old-style OUTER notation rather than the SQL-92 JOIN notation.

The following four queries are the common basis for two possible answers:

SELECT t1.tbl_id AS tbl_id, t2.tbl_id AS ind
  FROM tbl AS t1, OUTER tbl AS t2
 WHERE t1.tbl_id + 1 = t2.tbl_id
  INTO TEMP x1;

SELECT t1.tbl_id AS tbl_id, t2.tbl_id AS ind
  FROM tbl AS t1, OUTER tbl AS t2
 WHERE t1.tbl_id - 1 = t2.tbl_id
  INTO TEMP x2;

SELECT tbl_id AS hi_range
  FROM x1
 WHERE ind IS NULL
  INTO TEMP x3;

SELECT tbl_id AS lo_range
  FROM x2
 WHERE ind IS NULL
  INTO TEMP x4;

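Tables x3 and x4 now contain, respectively, the values of tbl_id that have no immediate successor and the values that have no immediate predecessor. Each value is therefore either the end (x3, hi_range) or the start (x4, lo_range) of a contiguous range of tbl_id values. In IDS rather than SE, you could use the standard SQL OUTER JOIN notation and filter the results in two queries instead of four; in SE, you don't have that luxury.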
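To sanity-check what x3 and x4 should contain, the same logic can be sketched outside the database. This is a minimal Python sketch, not Informix code; the function name `range_ends` and the sample IDs are made up for illustration, and it assumes the IDs are plain integers. The set membership tests stand in for the two outer self-joins:

```python
def range_ends(ids):
    """Return (hi_range, lo_range) lists, mirroring temp tables x3 and x4."""
    present = set(ids)
    # x3: an ID whose successor (id + 1) is absent ends a contiguous range.
    hi_range = sorted(i for i in present if i + 1 not in present)
    # x4: an ID whose predecessor (id - 1) is absent starts a contiguous range.
    lo_range = sorted(i for i in present if i - 1 not in present)
    return hi_range, lo_range


# Hypothetical sample data: rows 4, 5 and 8 have been deleted.
sample_ids = [1, 2, 3, 6, 7, 9]
hi_range, lo_range = range_ends(sample_ids)
print(hi_range)  # [3, 7, 9]
print(lo_range)  # [1, 6, 9]
```

Note that 9 appears in both lists: it is a one-value range with a missing neighbour on each side.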

A non-solution with quadratic (or worse) behaviour

Now you just need to work out how to combine the two tables:

SELECT t1.lo_range, t2.hi_range
  FROM x4 AS t1, x3 AS t2
 WHERE t1.lo_range <= t2.hi_range
   AND NOT EXISTS
       (SELECT t3.lo_range, t4.hi_range
          FROM x4 AS t3, x3 AS t4
         WHERE t3.lo_range <= t4.hi_range
           AND t1.lo_range =  t3.lo_range
           AND t2.hi_range >  t4.hi_range
       );

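The main part of this query appears twice and generates all pairs of rows where the start of the range is less than or equal to the end of the range (equality allows for a 'range' consisting of a single value, with the rows on either side of it deleted). The NOT EXISTS clause ensures that there is no other pair with the same start value and a smaller end value.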

The queries on the temp tables may not be very fast if there are many gaps in the data; if there are only a few gaps, they should be fine.

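The last query exhibits quadratic behaviour in the number of ranges. When I had only a dozen or so ranges, it was fine (sub-second response time); when I had 1200 ranges, it was not acceptable: it did not complete in a reasonable time.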

Avoiding quadratic behaviour

Since quadratic behaviour is bad, how can we rewrite the query...

For each low end of a range, find the smallest high end that is greater than or equal to that low end, or, in SQL:

SELECT t1.lo_range, MIN(t2.hi_range) AS hi_range
  FROM x4 AS t1, x3 AS t2
 WHERE t2.hi_range >= t1.lo_range
 GROUP BY t1.lo_range;

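Note that this can easily be incorporated into an ACE report. It gives you the ranges of numbers that are present, not the ranges that are missing. You can work out how to generate the other.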
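As a hint for the missing ranges: each gap runs from one past the end of a present range to one before the start of the next present range. The following is a minimal Python sketch of that idea, not Informix code; the function name `missing_ranges` and the sample IDs are hypothetical, and integer IDs are assumed.

```python
def missing_ranges(ids):
    """Return (first_missing, last_missing) pairs for gaps between surviving IDs."""
    ordered = sorted(set(ids))
    gaps = []
    # Walk adjacent surviving IDs; a jump larger than 1 means rows are missing.
    for prev, curr in zip(ordered, ordered[1:]):
        if curr - prev > 1:
            gaps.append((prev + 1, curr - 1))
    return gaps


# Hypothetical sample data: rows 4, 5 and 8 have been deleted.
print(missing_ranges([1, 2, 3, 6, 7, 9]))  # [(4, 5), (8, 8)]
```

Like the queries above, this only sees gaps between surviving rows: rows deleted before the lowest or after the highest remaining tbl_id cannot be detected from the table alone.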

Timing

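This performs reasonably well on a table of 22100 rows with 1200 gaps in the data. Using (my) SQLCMD program in its benchmark mode (-B), with the SELECT output sent to /dev/null, and using IDS 11.70.FC1 running on MacOS X 10.6.7 (MacBook Pro, Intel Core 2 Duo at 3 GHz, 4 GB RAM), the results were: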

$ sqlcmd -d stores -B -f gaps.sql
+ CLOCK START;
2011-03-31 18:44:39
+ BEGIN;
Time: 0.000588
2011-03-31 18:44:39
+ SELECT t1.tbl_id AS tbl_id, t2.tbl_id AS ind
  FROM tbl AS t1, OUTER tbl AS t2
 WHERE t1.tbl_id + 1 = t2.tbl_id
  INTO TEMP x1;
Time: 0.437521
2011-03-31 18:44:39
+ SELECT t1.tbl_id AS tbl_id, t2.tbl_id AS ind
   FROM tbl AS t1, OUTER tbl AS t2
  WHERE t1.tbl_id - 1 = t2.tbl_id
   INTO TEMP x2;
Time: 0.315050
2011-03-31 18:44:39
+ SELECT tbl_id AS hi_range
  FROM x1
 WHERE ind IS NULL
  INTO TEMP x3;
Time: 0.012510
2011-03-31 18:44:39
+ SELECT tbl_id AS lo_range
  FROM x2
 WHERE ind IS NULL
  INTO TEMP x4;
Time: 0.008754
+ output "/dev/null";
2011-03-31 18:44:39
+ SELECT t1.lo_range, MIN(t2.hi_range) AS hi_range
  FROM x4 AS t1, x3 AS t2
 WHERE t2.hi_range >= t1.lo_range
 GROUP BY t1.lo_range;
Time: 0.561935
+ output "/dev/stdout";
2011-03-31 18:44:40
+ SELECT COUNT(*) FROM x1;
22100
Time: 0.001171
2011-03-31 18:44:40
+ SELECT COUNT(*) FROM x2;
22100
Time: 0.000685
2011-03-31 18:44:40
+ SELECT COUNT(*) FROM x3;
1200
Time: 0.000590
2011-03-31 18:44:40
+ SELECT COUNT(*) FROM x4;
1200
Time: 0.000768
2011-03-31 18:44:40
+ SELECT t1.lo_range, MIN(t2.hi_range) AS hi_range
  FROM x4 AS t1, x3 AS t2
 WHERE t2.hi_range >= t1.lo_range
 GROUP BY t1.lo_range
 INTO TEMP x5;
Time: 0.529420
2011-03-31 18:44:40
+ SELECT COUNT(*) FROM x5;
1200
Time: 0.001155
2011-03-31 18:44:40
+ ROLLBACK;
Time: 0.329379
+ CLOCK STOP;
Time: 2.202523
$ 

That will do; the whole run takes only a couple of seconds (about 2.2 seconds in total).

Answer 1: (score: 0)

See: Is there an SQL function which generates a given range of sequential numbers? for a simpler and more engine-efficient solution.