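Filtering out bad data in SQL Server 2005

Asked: 2010-06-24 17:07:03

Tags: sql sql-server sql-server-2005

Using SQL Server 2005

I have a table with the following columns:

ID  Name  Date  Value

I want to select every row from the table except those that belong to a run of four consecutive zero values when the rows are ordered by date. How can I do this? Here is an example of what I mean.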

id      name     date         value
1       a        1/1/2010     5
2       a        1/2/2010     3
3       a        1/3/2010     5
4       a        1/4/2010     0
5       a        1/7/2010     0
6       a        1/8/2010     0
7       a        1/9/2010     2
8       a        1/10/2010    3
9       a        1/11/2010    0
10      a        1/15/2010    0
11      a        1/16/2010    0
12      a        1/17/2010    0
13      a        1/20/2010    4
14      a        1/21/2010    4

I want the query result to contain every row except ids 9-12.
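
For anyone who wants to try the queries against the sample data, a setup script along these lines should reproduce it. This is only a sketch: the table name My_Table and the column types are assumptions (the question does not state them), and the dates are written as unambiguous yyyymmdd literals because SQL Server 2005 has no multi-row VALUES clause or DATE type.

```sql
-- Hypothetical schema: the column types are guesses, not taken from the question.
CREATE TABLE My_Table (
    id     INT         NOT NULL PRIMARY KEY,
    name   VARCHAR(10) NOT NULL,
    [date] DATETIME    NOT NULL,
    value  INT         NOT NULL
);

-- SQL Server 2005 has no multi-row VALUES, so the rows are chained with UNION ALL.
INSERT INTO My_Table (id, name, [date], value)
SELECT  1, 'a', '20100101', 5 UNION ALL
SELECT  2, 'a', '20100102', 3 UNION ALL
SELECT  3, 'a', '20100103', 5 UNION ALL
SELECT  4, 'a', '20100104', 0 UNION ALL
SELECT  5, 'a', '20100107', 0 UNION ALL
SELECT  6, 'a', '20100108', 0 UNION ALL
SELECT  7, 'a', '20100109', 2 UNION ALL
SELECT  8, 'a', '20100110', 3 UNION ALL
SELECT  9, 'a', '20100111', 0 UNION ALL
SELECT 10, 'a', '20100115', 0 UNION ALL
SELECT 11, 'a', '20100116', 0 UNION ALL
SELECT 12, 'a', '20100117', 0 UNION ALL
SELECT 13, 'a', '20100120', 4 UNION ALL
SELECT 14, 'a', '20100121', 4;
```

Note that ids 4-6 form a run of only three zeros, so they should stay in the result, while ids 9-12 form the run of four that should be removed.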

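3 Answers:

Answer 0: (score: 2)

This assumes the rows are ordered by id, but you can simply change ORDER BY id to something else and it should still work.

Using the T-SQL CTE from the Kodyaz Development Resources site, I was able to create the following code. I have it working so that it removes rows with two consecutive zeros rather than four, because I tested it on my own code and only changed the table/row names.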

WITH CTE as (
  SELECT
    RN = ROW_NUMBER() OVER (ORDER BY id),
    *
  FROM tablename
)
SELECT
  [Current Row].*
FROM CTE [Current Row]
LEFT JOIN CTE [Previous Row] ON
  [Previous Row].RN = [Current Row].RN - 1
LEFT JOIN CTE [Next Row] ON
  [Next Row].RN = [Current Row].RN + 1
WHERE
  -- drop a zero row whose next row is also zero
  -- (ISNULL treats a missing neighbour at the table edge as non-zero)
  NOT ([Current Row].value = 0 AND ISNULL([Next Row].value, 1) = 0) AND
  -- drop a zero row whose previous row is also zero
  NOT (ISNULL([Previous Row].value, 1) = 0 AND [Current Row].value = 0)

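To make this work for your case, all you need to do is join to the rows up to three positions before and after the current one and put every possible combination in the WHERE clause: for example, this row and the next three rows are all zero, or this row, the previous row, and the next two rows are all zero.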
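
As a rough sketch of what "every possible combination" could look like for runs of four, the query below joins each row to its three predecessors and three successors and checks the four windows of length four that can contain the current row. The aliases P1..P3 and N1..N3 and the is_zero helper column are illustrative names, not part of the original answer, and the column list assumes the id/name/date/value layout from the question.

```sql
WITH CTE AS (
    SELECT
        RN = ROW_NUMBER() OVER (ORDER BY id),
        -- 1 for a zero value, 0 otherwise; avoids NULL comparisons later
        is_zero = CASE WHEN value = 0 THEN 1 ELSE 0 END,
        *
    FROM tablename
)
SELECT C.id, C.name, C.date, C.value
FROM CTE C
LEFT JOIN CTE P3 ON P3.RN = C.RN - 3
LEFT JOIN CTE P2 ON P2.RN = C.RN - 2
LEFT JOIN CTE P1 ON P1.RN = C.RN - 1
LEFT JOIN CTE N1 ON N1.RN = C.RN + 1
LEFT JOIN CTE N2 ON N2.RN = C.RN + 2
LEFT JOIN CTE N3 ON N3.RN = C.RN + 3
WHERE
    -- non-zero rows are always kept
    C.is_zero = 0
    -- a zero row is kept only if none of the four windows around it is all zeros;
    -- ISNULL(..., 0) treats a missing neighbour at the table edge as non-zero
    OR NOT (
           (ISNULL(P3.is_zero, 0) = 1 AND ISNULL(P2.is_zero, 0) = 1 AND ISNULL(P1.is_zero, 0) = 1)
        OR (ISNULL(P2.is_zero, 0) = 1 AND ISNULL(P1.is_zero, 0) = 1 AND ISNULL(N1.is_zero, 0) = 1)
        OR (ISNULL(P1.is_zero, 0) = 1 AND ISNULL(N1.is_zero, 0) = 1 AND ISNULL(N2.is_zero, 0) = 1)
        OR (ISNULL(N1.is_zero, 0) = 1 AND ISNULL(N2.is_zero, 0) = 1 AND ISNULL(N3.is_zero, 0) = 1)
    )
ORDER BY C.RN;
```

Because any run of five or more zeros contains a window of four around each of its rows, this version also removes longer runs. The number of joins grows with the run length, so it is probably only practical for small thresholds.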

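Answer 1: (score: 1)

You didn't mention how name is involved, so I'm assuming you want to do this per name. I'm further assuming that by "consecutive" you mean consecutive in date order, not in id order. Finally, I'm also assuming that you want to exclude runs of 5 zeros, 6 zeros, and so on as well.

There may be a simpler way, but this should work: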

;WITH Transitions_To_CTE AS
(
    SELECT
        T1.id,
        T1.name,
        T1.date,
        T1.value
    FROM
        My_Table T1
    LEFT OUTER JOIN My_Table T2 ON
        T2.name = T1.name AND
        T2.date < T1.date AND
        T2.value <> 0
    LEFT OUTER JOIN My_Table T3 ON
        T3.name = T1.name AND
        T3.date > COALESCE(T2.date, '1900-01-01') AND
        T3.date < T1.date
    WHERE
        T1.value = 0 AND
        T3.id IS NULL
),
Transitions_From_CTE AS
(
    SELECT
        T1.id,
        T1.name,
        T1.date,
        T1.value
    FROM
        My_Table T1
    LEFT OUTER JOIN My_Table T2 ON
        T2.name = T1.name AND
        T2.date > T1.date AND
        T2.value <> 0
    LEFT OUTER JOIN My_Table T3 ON
        T3.name = T1.name AND
        T3.date < COALESCE(T2.date, '9999-12-31') AND
        T3.date > T1.date
    WHERE
        T1.value = 0 AND
        T3.id IS NULL
),
Range_Exclusions AS
(
    SELECT
        S.name,
        S.date AS start_date,
        E.date AS end_date
    FROM
        Transitions_To_CTE S
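    -- >= (not >) so that a single-row run of zeros pairs with itself
    -- instead of with the end of a later run
    INNER JOIN Transitions_From_CTE E ON
        E.name = S.name AND
        E.date >= S.date
    LEFT OUTER JOIN Transitions_From_CTE E2 ON
        E2.name = S.name AND
        E2.date >= S.date AND
        E2.date < E.date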
    WHERE
        E2.id IS NULL AND
        (SELECT COUNT(*) FROM dbo.My_Table T WHERE T.name = S.name AND T.date BETWEEN S.date AND E.date) >= 4
)
SELECT
    T.id,
    T.name,
    T.date,
    T.value
FROM
    dbo.My_Table T
WHERE
    NOT EXISTS (SELECT * FROM Range_Exclusions RE WHERE RE.name = T.name AND T.date BETWEEN RE.start_date AND RE.end_date)
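
As one possible "simpler way", here is a sketch that uses a different technique from the query above: the row-number difference trick often called "gaps and islands". It keeps the same assumptions (runs are counted per name, in date order, and runs longer than four are excluded too) and additionally assumes that id can break ties between equal dates. The CTE names and the is_zero, run_id, and run_length columns are made up for illustration. Only ROW_NUMBER() and COUNT(*) OVER (PARTITION BY ...) are used, both of which are available in SQL Server 2005.

```sql
WITH Flagged AS (
    SELECT id, name, date, value,
           -- 1 for a zero value, 0 for anything else
           CASE WHEN value = 0 THEN 1 ELSE 0 END AS is_zero
    FROM My_Table
),
Numbered AS (
    SELECT id, name, date, value, is_zero,
           -- Within one name, consecutive rows with the same is_zero flag
           -- share the same difference between the two row numbers.
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY date, id)
         - ROW_NUMBER() OVER (PARTITION BY name, is_zero ORDER BY date, id) AS run_id
    FROM Flagged
),
Runs AS (
    SELECT id, name, date, value, is_zero,
           -- number of rows in the run this row belongs to
           COUNT(*) OVER (PARTITION BY name, is_zero, run_id) AS run_length
    FROM Numbered
)
SELECT id, name, date, value
FROM Runs
WHERE is_zero = 0        -- non-zero rows are always kept
   OR run_length < 4     -- zero rows are kept only in runs shorter than four
ORDER BY name, date;
```

On the sample data from the question, ids 4-6 form a zero run of length 3 and are kept, while ids 9-12 form a run of length 4 and are filtered out.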

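Answer 2: (score: 0)

Here is my attempt. It uses a recursive CTE to count the number of consecutive zeros, then uses level >= 4 to build a sequence of ids, and finally simply applies a NOT IN clause on id. Note that it joins on id + 1, so it relies on the ids being consecutive integers.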

with trend --work out number of consecutive zeros using level
as
(Select 1 as level, id, value, id as startid
    from IdsAndValues
    Union All
    Select [Level]+1, P.ID, p.value, t.startid
    From IdsAndValues as p
        Inner Join trend as t on p.id = t.id+1
    Where t.value =0 and p.value=0
)
,IDs --create sequence of ids using startid and  id, this allows us to do the not in
as
(   
    Select  startid as ExcludeID ,id 
    from trend as t
    Where level>=4
    Union All
    Select ExcludeID +1, id
    From ids 
    where ExcludeID <id 
)

Select *
from IdsAndValues
Where id Not in
    (Select ExcludeID from IDs)