Segregating data between two boundaries based on an identifier flag

Time: 2012-03-07 18:33:09

Tags: sql sql-server tsql

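A friend of mine gave me this question, and it has me quite puzzled.

His team is loading a DW, and the data keeps arriving as a mix of incremental and full loads on an ad hoc basis. There is an identifier flag that says when a full load starts and when it stops. We now need to collect and segregate all the full loads.

For example: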

create table #tmp (
  id int identity(1,1) not null,
  name varchar(30) null,
  val int null
)

insert into #tmp (name, val) select 'detroit', 3
insert into #tmp (name, val) select 'california', 9
insert into #tmp (name, val) select 'houston', 1
insert into #tmp (name, val) select 'los angeles', 4
insert into #tmp (name, val) select 'newyork', 8
insert into #tmp (name, val) select 'chicago', 1
insert into #tmp (name, val) select 'seattle', 9
insert into #tmp (name, val) select 'michigan', 6
insert into #tmp (name, val) select 'atlanta', 9
insert into #tmp (name, val) select 'philly', 6
insert into #tmp (name, val) select 'brooklyn', 8

drop table #tmp

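The rule is:

A full load starts when val is 9, and it stops when val is 8 (that is, the full load stops at the next val of 8).

In this case, for the full loads, I should collect only these records:

id name        val
3  houston     1
4  los angeles 4
10 philly      6

My approach so far: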

;with mycte as (
    select id, name, val, row_number() over (order by id) as rnkst 
    from #tmp
    where val in (8,9))
SELECT *
FROM mycte y
WHERE val = 9
    AND Exists (
        SELECT * 
        FROM mycte x 
        WHERE x.id = 
                      ----> this gives start 9 record but not stop record of 8
                      (SELECT MIN(id)    
                      FROM mycte z 
                      WHERE z.id > y.id)
            AND val = 8)

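I don't want to venture into a cursor approach; I would rather stay with a CTE. Please advise!

Update:
As one of the answerers suggested, I am restating the rules.
-> A full load's records start after a 9 (the 9 record itself is not included).
-> The full load continues until the very next 8 is seen.
-> So, effectively, all records between a 9 and an 8 form one small chunk of full load.
-> A lone 9 record by itself is not considered, because it has no 8 as a partner.
-> The expected result set shown in this question satisfies these conditions.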

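2 answers:

Answer 0: (score: 1)

I'm not sure my English will let me explain my approach fully, but I'll try, in case it helps.

  1. Rank all the rows, and separately rank the bounds (val IN (8, 9)).

  2. Join the val = 8 subset to the val = 9 subset on the condition that the bound rank of the former is greater than that of the latter by exactly 1 (one).

  3. Join the subset of non-(8, 9) rows to the result set of step 2 on the condition that the (general) rank lies between the rank of the val = 9 row and that of the val = 8 row.

Here is a query to illustrate my attempt at a verbal description: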

    WITH ranked AS (
      SELECT
        *,
        rnk       = ROW_NUMBER() OVER (ORDER BY id),
        bound_rnk = ROW_NUMBER() OVER (
          PARTITION BY CASE WHEN val IN (8, 9) THEN 1 ELSE 2 END
          ORDER BY id
        )
      FROM #tmp
    )
    SELECT
      load.id,
      load.name,
      load.val
    FROM       ranked AS eight
    INNER JOIN ranked AS nine ON eight.bound_rnk = nine.bound_rnk + 1
    INNER JOIN ranked AS load ON load.rnk BETWEEN nine.rnk AND eight.rnk
    WHERE eight.val = 8
      AND nine .val = 9
      AND load .val NOT IN (8, 9)
    ;
    

You may not believe me, but when I tested it, it did return the following:

    id name        val 
    -- ----------- --- 
    3  houston     1   
    4  los angeles 4   
    10 philly      6   
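
To make steps 1 and 2 easier to follow, the start/stop pairing can be inspected on its own. This is only a sketch, not part of the tested query: it assumes the question's `#tmp` table is still populated (run it before the `drop table #tmp`), and the CTE name `bounds` and the aliases `start_id` and `stop_id` are made up here for illustration.

```sql
-- Sketch: list each 9 whose next bound row (by id) is an 8.
-- Assumes the question's #tmp table exists; bounds, start_id and stop_id
-- are illustrative names, not part of the original answer.
WITH bounds AS (
  SELECT
    id,
    val,
    bound_rnk = ROW_NUMBER() OVER (ORDER BY id)  -- rank among bound rows only
  FROM #tmp
  WHERE val IN (8, 9)
)
SELECT
  nine.id  AS start_id,
  eight.id AS stop_id
FROM       bounds AS nine
INNER JOIN bounds AS eight ON eight.bound_rnk = nine.bound_rnk + 1
WHERE nine.val  = 9
  AND eight.val = 8
ORDER BY nine.id;
```

With the sample rows this should pair id 2 with id 5 and id 9 with id 11. The 9 at id 7 drops out because the next bound row after it is another 9, which is the "lone 9" case from the restated rules.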
    

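Answer 1: (score: 0)

I don't believe there is a way to do this without a while loop or possibly a complicated recursive CTE. So my question is whether this could be done in application code at all. SQL is not as strong as a procedural language for this kind of work, so application code would handle it better. If that is not an option, then I would use a while loop (better than a cursor). I will create the SQL for this shortly.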

/*
drop table #tmp
drop table #finalTmp
drop table #startStop
*/

create table #tmp (
  id int identity(1,1) not null,
  name varchar(30) null,
  val int null
)

insert into #tmp (name, val) select 'detroit', 3
insert into #tmp (name, val) select 'california', 9
insert into #tmp (name, val) select 'houston', 1
insert into #tmp (name, val) select 'los angeles', 4
insert into #tmp (name, val) select 'newyork', 8
insert into #tmp (name, val) select 'chicago', 1
insert into #tmp (name, val) select 'seattle', 9
insert into #tmp (name, val) select 'michigan', 6
insert into #tmp (name, val) select 'atlanta', 9
insert into #tmp (name, val) select 'philly', 6
insert into #tmp (name, val) select 'brooklyn', 8

CREATE TABLE #Finaltmp
    (
        id INT,
        name VARCHAR(30),
        val INT
    )

    SELECT id, val, 0 AS Checked
    INTO #StartStop
    FROM #tmp
    WHERE val IN (8,9)

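    DECLARE @StartId INT, @StopId INT
    WHILE EXISTS (SELECT 1 FROM #StartStop WHERE Checked = 0)
    BEGIN
        --Reset both ids on every pass: a SELECT that matches no rows leaves the
        --variable unchanged, so stale values would defeat the NULL checks below
        SELECT @StartId = NULL, @StopId = NULL

        SELECT TOP 1 @StopId = id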
        FROM #StartStop
        WHERE EXISTS 
            --This makes sure we grab a stop that has a start before it
            (
                SELECT 1
                FROM #StartStop AS TestCheck
                WHERE TestCheck.id < #StartStop.id AND val = 9
            )
        AND Checked = 0 AND val = 8
        ORDER BY id

        --If no remaining stop has a start before it, we are done
        IF @StopId IS NULL
            BREAK

        SELECT TOP 1 @StartId = id
        FROM #StartStop
        WHERE Checked = 0 AND val = 9 
            --Make sure we only pick up the 9 that matches
            AND Id < @StopId
        ORDER BY Id DESC

        IF @StartId IS NULL
            BREAK

        INSERT INTO #Finaltmp
        SELECT * 
        FROM #tmp
        WHERE id BETWEEN @StartId AND @StopId
            AND val NOT IN (8,9)

        --Make sure to "check" any values that fell in the middle (double 9's)
        --If not, then you would start picking up overlap data
        UPDATE #StartStop
        SET Checked = 1
        WHERE id <= @StopId
    END

    SELECT * FROM #Finaltmp

I noticed that the data looks a bit erratic, so I tried to build in some edge-case checks and commented on them.
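
One possible way to cross-check results on awkward data (double 9s, a trailing 9, an 8 with no 9 before it) is a set-based query that states the rule directly: keep a row only if the nearest bound before it is a 9 and the nearest bound after it is an 8. This is a sketch rather than part of the original answer; it assumes `#tmp` is populated as above, and the aliases `prev_bound` and `next_bound` are illustrative.

```sql
-- Sketch: set-based cross-check of the full-load rule.
-- Assumes #tmp exists and is populated; prev_bound / next_bound are
-- illustrative aliases, not names from the original answer.
SELECT t.id, t.name, t.val
FROM #tmp AS t
CROSS APPLY (
  -- nearest bound row before the current row
  SELECT TOP 1 b.val
  FROM #tmp AS b
  WHERE b.id < t.id AND b.val IN (8, 9)
  ORDER BY b.id DESC
) AS prev_bound
CROSS APPLY (
  -- nearest bound row after the current row
  SELECT TOP 1 b.val
  FROM #tmp AS b
  WHERE b.id > t.id AND b.val IN (8, 9)
  ORDER BY b.id
) AS next_bound
WHERE t.val NOT IN (8, 9)
  AND prev_bound.val = 9   -- the load was opened by a 9 ...
  AND next_bound.val = 8   -- ... and closed by the very next bound, an 8
ORDER BY t.id;
```

For the sample data this should return ids 3, 4 and 10, matching the expected result in the question. Rows with no bound on one side (such as id 1) are dropped by `CROSS APPLY` itself, since the corresponding subquery returns no row.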