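Informix-SQL 7.32 with the SE engine:
A customer of mine deleted several rows from an SE table. (I'm not using transaction logging or audit.) The table has a serial column. I want to create an ACE report that prints the missing serial values. I tried the following quick and dirty report, but it didn't work! Can you suggest a better way?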
define
variable next_id integer
end
select tbl_id
from tbl
order by tbl_id {I'm ordering tbl_id because all the rows are periodically re-clustered}
end {by an fk_id in order to group all rows belonging to the same customer}
format
on every row
let next_id = tbl_id + 1
after group of tbl_id
if tbl_id + 1 <> next_id then
print column 1, tbl_id + 1 using "######"
end
Or create a temp table with an INT column holding the sequential numbers 1 to 5000, and run the following select statement:
SELECT tbl_id
FROM tbl
WHERE tbl_id NOT IN
(SELECT tmp_int
FROM tmp);
Or a select statement with HAVING, OUTER, etc.
Answer 0 (score: 1)
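Since this is SE, we have to use the old-style notation rather than the SQL-92 JOIN notation.
The following four queries are the common basis for two possible answers: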
SELECT t1.tbl_id AS tbl_id, t2.tbl_id AS ind
FROM tbl AS t1, OUTER tbl AS t2
WHERE t1.tbl_id + 1 = t2.tbl_id
INTO TEMP x1;
SELECT t1.tbl_id AS tbl_id, t2.tbl_id AS ind
FROM tbl AS t1, OUTER tbl AS t2
WHERE t1.tbl_id - 1 = t2.tbl_id
INTO TEMP x2;
SELECT tbl_id AS hi_range
FROM x1
WHERE ind IS NULL
INTO TEMP x3;
SELECT tbl_id AS lo_range
FROM x2
WHERE ind IS NULL
INTO TEMP x4;
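Tables x3 and x4 now contain the values of tbl_id that have no immediate successor and no immediate predecessor, respectively. Each value in x3 is the end of a contiguous range of tbl_id values, and each value in x4 is the start of one. In IDS rather than SE, you could use the standard SQL OUTER JOIN notation and filter the results in two queries instead of four; in SE, you don't have that luxury.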
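To make the roles of x3 and x4 concrete, here is a minimal sketch of the same idea in plain Python rather than Informix SQL, so it can be run anywhere. The function name `range_boundaries` and the sample IDs are hypothetical and only there for illustration.

```python
def range_boundaries(ids):
    """Return (lo_range, hi_range): values with no predecessor / no successor."""
    present = set(ids)
    # Mirrors x4: tbl_id values for which tbl_id - 1 is absent (range starts).
    lo_range = sorted(i for i in present if i - 1 not in present)
    # Mirrors x3: tbl_id values for which tbl_id + 1 is absent (range ends).
    hi_range = sorted(i for i in present if i + 1 not in present)
    return lo_range, hi_range


# Hypothetical sample: rows 4, 5 and 8 have been deleted.
lo, hi = range_boundaries([1, 2, 3, 6, 7, 9])
print(lo)  # [1, 6, 9]
print(hi)  # [3, 7, 9]
```

Note that 9 appears in both lists: it is a single-value range, with the rows on either side of it missing.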
Now you just need to work out how to combine the two tables:
SELECT t1.lo_range, t2.hi_range
FROM x4 AS t1, x3 AS t2
WHERE t1.lo_range <= t2.hi_range
AND NOT EXISTS
(SELECT t3.lo_range, t4.hi_range
FROM x4 AS t3, x3 AS t4
WHERE t3.lo_range <= t4.hi_range
AND t1.lo_range = t3.lo_range
AND t2.hi_range > t4.hi_range
);
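The main part of this query occurs twice and generates all pairs of rows where the start of the range is less than or equal to the end of the range (equality allows for a 'range' that consists of a single value on its own, with the rows on either side deleted). The NOT EXISTS clause ensures that there is no other pair with the same start value and a smaller end value.
The queries on the temp tables might not be very fast if there are many gaps in the data; if there are few gaps, they should be fine.
The last query exhibits quadratic behaviour in terms of the number of ranges. When I had only a dozen or so ranges, it was fine (sub-second response time); when I had 1200 ranges, it was not: it did not complete in a reasonable time.
Since quadratic behaviour is bad, how do we rephrase the query...
For each low end of a range, find the minimum high end of a range that is greater than or equal to that low end, or in SQL: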
SELECT t1.lo_range, MIN(t2.hi_range) AS hi_range
FROM x4 AS t1, x3 AS t2
WHERE t2.hi_range >= t1.lo_range
GROUP BY t1.lo_range;
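The pairing rule in the GROUP BY query can be sketched in plain Python as follows. This is only an illustration of the logic, not of how the Informix engine evaluates the join; the function name `pair_ranges` and the sample values are hypothetical, and the sketch uses a binary search where the SQL uses MIN over a join.

```python
from bisect import bisect_left


def pair_ranges(lo_range, hi_range):
    """For each range start, pick the smallest range end that is >= the start."""
    ends = sorted(hi_range)
    pairs = []
    for lo in sorted(lo_range):
        # bisect_left finds the first end >= lo, i.e. MIN(hi_range) for this lo.
        k = bisect_left(ends, lo)
        if k < len(ends):
            pairs.append((lo, ends[k]))
    return pairs


# Hypothetical contents of x4 (starts) and x3 (ends).
print(pair_ranges([1, 6, 9], [3, 7, 9]))  # [(1, 3), (6, 7), (9, 9)]
```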
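Note that this query can easily be incorporated into an ACE report. It gives you the ranges of numbers that are present, not the ranges that are absent. You can work out how to generate the other.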
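One possible way to turn the present ranges into the missing ones is to look at each pair of neighbouring ranges: the gap starts one past the end of the first and stops one before the start of the second. The sketch below shows that step in plain Python rather than in SQL or ACE; the function name `missing_ranges` and the sample data are hypothetical.

```python
def missing_ranges(present_ranges):
    """Given sorted (lo, hi) ranges of existing IDs, return the gaps between them."""
    gaps = []
    for (_, prev_hi), (next_lo, _) in zip(present_ranges, present_ranges[1:]):
        # The gap runs from just after one range's end to just before the next start.
        gaps.append((prev_hi + 1, next_lo - 1))
    return gaps


# Hypothetical output of the range-pairing query.
print(missing_ranges([(1, 3), (6, 7), (9, 9)]))  # [(4, 5), (8, 8)]
```

This only finds gaps between surviving rows. Rows deleted below the lowest or above the highest remaining tbl_id leave no boundary to detect, so they are not reported.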
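The GROUP BY query performs reasonably well on a table of 22100 rows with 1200 gaps in the data. Using (my) SQLCMD program in its benchmark mode (-B), with the SELECT output sent to /dev/null, and running IDS 11.70.FC1 on MacOS X 10.6.7 (MacBook Pro, Intel Core 2 Duo at 3 GHz, 4 GB RAM), the results were: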
$ sqlcmd -d stores -B -f gaps.sql
+ CLOCK START;
2011-03-31 18:44:39
+ BEGIN;
Time: 0.000588
2011-03-31 18:44:39
+ SELECT t1.tbl_id AS tbl_id, t2.tbl_id AS ind
FROM tbl AS t1, OUTER tbl AS t2
WHERE t1.tbl_id + 1 = t2.tbl_id
INTO TEMP x1;
Time: 0.437521
2011-03-31 18:44:39
+ SELECT t1.tbl_id AS tbl_id, t2.tbl_id AS ind
FROM tbl AS t1, OUTER tbl AS t2
WHERE t1.tbl_id - 1 = t2.tbl_id
INTO TEMP x2;
Time: 0.315050
2011-03-31 18:44:39
+ SELECT tbl_id AS hi_range
FROM x1
WHERE ind IS NULL
INTO TEMP x3;
Time: 0.012510
2011-03-31 18:44:39
+ SELECT tbl_id AS lo_range
FROM x2
WHERE ind IS NULL
INTO TEMP x4;
Time: 0.008754
+ output "/dev/null";
2011-03-31 18:44:39
+ SELECT t1.lo_range, MIN(t2.hi_range) AS hi_range
FROM x4 AS t1, x3 AS t2
WHERE t2.hi_range >= t1.lo_range
GROUP BY t1.lo_range;
Time: 0.561935
+ output "/dev/stdout";
2011-03-31 18:44:40
+ SELECT COUNT(*) FROM x1;
22100
Time: 0.001171
2011-03-31 18:44:40
+ SELECT COUNT(*) FROM x2;
22100
Time: 0.000685
2011-03-31 18:44:40
+ SELECT COUNT(*) FROM x3;
1200
Time: 0.000590
2011-03-31 18:44:40
+ SELECT COUNT(*) FROM x4;
1200
Time: 0.000768
2011-03-31 18:44:40
+ SELECT t1.lo_range, MIN(t2.hi_range) AS hi_range
FROM x4 AS t1, x3 AS t2
WHERE t2.hi_range >= t1.lo_range
GROUP BY t1.lo_range
INTO TEMP x5;
Time: 0.529420
2011-03-31 18:44:40
+ SELECT COUNT(*) FROM x5;
1200
Time: 0.001155
2011-03-31 18:44:40
+ ROLLBACK;
Time: 0.329379
+ CLOCK STOP;
Time: 2.202523
$
It'll do; the processing takes no more than a few seconds.
Answer 1 (score: 0)
See "Is there an SQL function which generates a given range of sequential numbers?" for a simpler and more engine-efficient solution.