Find missing numbers in a list of file names in one query

Date: 2015-10-19 19:34:32

Tags: sql-server numbers sequence

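In a document management database there is a list of existing files. New files should get a number, e.g. SW_01234.xxx. There is a numbering mechanism that serves new numbers. The problem is to find the missing elements, for example when a file has been deleted.

Existing file names may be completely different and need not follow the scheme above.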

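My attempt is to do it this way:

  1. Split the existing file names at the "dot". I don't care about the .xxx extension, e.g. .doc, .xlsx.

  2. Generate a temporary list of SW_00000 to SW_99999.

  3. Return those elements that are present in 2) but not in 1).

Sample values: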

    ..  
    SW_00015.PRT  
    SW_00016.DRW  
    SW_00020.DRW  
    SW_00020.PDF  
    XBC115.DOC  
    ..  
    

I need to get SW_00017, SW_00018, SW_00019 (XBC can be ignored).

In the end, a single query is needed.

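2 Answers:

Answer 0: (score: 0)

This should get you 99% of the way there. Adjust as needed. You will notice that records 1, 3, 4 and 10 are absent from the output, because those files exist in the sample data.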

-- Sample data standing in for the real file list
DECLARE @dataset TABLE (name VARCHAR(100));
INSERT INTO @dataset (name)
VALUES ('SW_00001.PRT'), ('SW_00003.DRW'), ('SW_00004.DRW'), ('SW_00010.PDF');

-- Candidate names SW_00000 .. SW_00099; raise the loop limit to cover more
DECLARE @allFiles TABLE (Name VARCHAR(100));
DECLARE @i INT = 0;

WHILE @i < 100
BEGIN
    INSERT INTO @allFiles (Name)
    VALUES ('SW_' + REPLICATE('0', 5 - LEN(@i)) + CAST(@i AS VARCHAR(10)));

    SET @i = @i + 1;
END;

-- Candidates that match no existing file name once the extension is cut off.
-- SUBSTRING with start 0 and length CHARINDEX(...) returns the text before the first dot.
SELECT *
FROM @allFiles af
WHERE NOT EXISTS (SELECT 1 FROM @dataset ds WHERE af.Name = SUBSTRING(ds.name, 0, CHARINDEX('.', ds.name)));
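
The loop can also be swapped for a set-based number generator, so that everything runs as a single statement over the full SW_00000 to SW_99999 range. The following is only a sketch under that assumption: the CTE names `digits` and `candidates` are made up for illustration, and the prefix comparison uses `LEFT` rather than `LIKE` because `_` is a `LIKE` wildcard.

```sql
-- Sample data standing in for the real file list
DECLARE @dataset TABLE (name VARCHAR(100));
INSERT INTO @dataset (name)
VALUES ('SW_00001.PRT'), ('SW_00003.DRW'), ('SW_00004.DRW'), ('SW_00010.PDF');

WITH digits AS (
    -- ten rows: 0 .. 9
    SELECT n FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS v(n)
),
candidates AS (
    -- five cross-joined digit sets give 10^5 rows: 0 .. 99999
    SELECT 'SW_' + RIGHT('00000' + CAST(d1.n + 10 * d2.n + 100 * d3.n
                                      + 1000 * d4.n + 10000 * d5.n AS VARCHAR(5)), 5) AS Name
    FROM digits d1
    CROSS JOIN digits d2
    CROSS JOIN digits d3
    CROSS JOIN digits d4
    CROSS JOIN digits d5
)
SELECT c.Name
FROM candidates c
WHERE NOT EXISTS (
    -- an existing file "uses" a candidate when it starts with SW_nnnnn followed by a dot
    SELECT 1
    FROM @dataset ds
    WHERE LEFT(ds.name, 9) = c.Name + '.'
)
ORDER BY c.Name;
```

On the four sample rows this returns every name in the range except SW_00001, SW_00003, SW_00004 and SW_00010.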

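Answer 1: (score: 0)

I tried to implement your approach in a single query. To do everything in one query, I used CTEs to isolate the document numbers and to get the range of numbers to use in the "not exists" part. If you need a larger range from the numbers table, you can generate the range with a different query. See Generate Sequential Set of Numbers.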

declare @t as table (DocName varchar(50));

insert @t (DocName)
values 
('SW_00015.PRT')
,('SW_00016.DRW')
,('SW_00020.DRW')
,('SW_00020.PDF');


/*doing with CTE so the split and substring is more readable, plus needed it anyway for getting the numbers table*/
with isolatedFileNames
as (
    /*might be dots in filename, reversing it to isolate the last set (file ext)*/
    select DocName
        ,left(DocName, len(DocName) - charindex('.', reverse(DocName), 0)) as IsolatedDocName 
    from @t
    )
    ,isolatedNumbers
as (
    /*substring to get the number without the prefix*/
    select DocName
        ,IsolatedDocName
        ,cast(substring(IsolatedDocName, charindex('_', IsolatedDocName, 0) + 1, len(IsolatedDocName)) as int) as IsolatedDocNumber
    from isolatedFileNames
    )
    ,numbers
as (
    /*use row_number on a large set to get the range*/
    select ROW_NUMBER() over (
            order by object_id
            ) + (
            /*start at the first document number, change this to 0 if you want to start at 0*/
            select min(IsolatedDocNumber) - 1
            from isolatedNumbers
            ) as num
    from sys.all_objects
    )
    ,numbersLessThanDocNumbers
as (
    select num
    from numbers
    where num < (
            /*limit to max document number in the set*/
            select max(IsolatedDocNumber)
            from isolatedNumbers
            )
    )
select num as MissingFromDocumentSet
from numbersLessThanDocNumbers n
where not exists (
        select 1
        from isolatedNumbers iso
        where iso.IsolatedDocNumber = n.num
        )
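
One caveat with the query above: a name such as XBC115.DOC from the question would make the cast to int fail, and the result is a bare number rather than a file name. A possible variation, assuming SQL Server 2012 or later for `TRY_CAST`, first filters to names shaped like SW_ plus five digits and then formats the gaps back into names. The CTE names `swNumbers` and `bounds` are illustrative and not part of the original approach.

```sql
DECLARE @t TABLE (DocName VARCHAR(50));
INSERT @t (DocName)
VALUES ('SW_00015.PRT'), ('SW_00016.DRW'), ('SW_00020.DRW'),
       ('SW_00020.PDF'), ('XBC115.DOC');

WITH swNumbers AS (
    -- keep only names shaped like SW_ + five digits + extension;
    -- [_] matches a literal underscore, which is otherwise a LIKE wildcard
    SELECT DISTINCT TRY_CAST(SUBSTRING(DocName, 4, 5) AS INT) AS DocNumber
    FROM @t
    WHERE DocName LIKE 'SW[_][0-9][0-9][0-9][0-9][0-9].%'
),
bounds AS (
    -- lowest and highest document number actually in use
    SELECT MIN(DocNumber) AS lo, MAX(DocNumber) AS hi
    FROM swNumbers
),
numbers AS (
    -- row_number over a large system view, offset to start at the lowest number
    SELECT b.lo - 1 + ROW_NUMBER() OVER (ORDER BY o.object_id) AS num
    FROM sys.all_objects o
    CROSS JOIN bounds b
)
SELECT 'SW_' + RIGHT('00000' + CAST(n.num AS VARCHAR(10)), 5) AS MissingName
FROM numbers n
CROSS JOIN bounds b
WHERE n.num < b.hi
  AND NOT EXISTS (SELECT 1 FROM swNumbers s WHERE s.DocNumber = n.num)
ORDER BY n.num;
```

With the sample rows this returns SW_00017, SW_00018 and SW_00019. The reachable range is still capped by the number of rows in sys.all_objects, so a wider gap needs a larger numbers source.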