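Detecting the latest run of consecutive non-NULL rows

Time: 2017-11-06 07:56:16

Tags: sql sql-server sql-server-2016

I have a table like this, where the primary key is the combination (id, date):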

id  date    value
1   2002-02-28  NULL
1   2002-03-31  NULL
1   2002-04-30  10
1   2002-05-31  5
1   2002-10-31  4
1   2002-11-30  NULL
1   2002-12-31  0.7
1   2003-01-31  9
1   2003-02-28  NULL
2   2002-12-31  0.7
2   2003-11-30  0.10

I need to select the latest sequence of values that contains no NULLs (if there is one). Expected output:

id  date    value
1   2002-12-31  0.7
1   2003-01-31  9
2   2002-12-31  0.7
2   2003-11-30  0.10

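Explanation:

  1. For id = 1, the latest NULL value occurs on 2003-02-28. The latest sequence consists of 2 rows, with values 0.7 and 9, because another NULL is found on 2002-11-30, so all earlier rows are ignored.
  2. For id = 2, there are no NULL values, so we take both rows.

I have a working solution that uses 3-4 queries and some additional calculations, but I think this can be done in 2 queries (or subqueries). Keep in mind that the dataset is large, containing 30 to 40 million rows.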
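The rule in the explanation above can be sketched as a single statement with two nested window aggregates. This is only a sketch under stated assumptions, not a tuned solution for tens of millions of rows: the table variable @t stands in for the real table, and the column aliases last_value_date and boundary_null_date are hypothetical names introduced for illustration.

```sql
-- Sample data from the question; @t is a stand-in for the real table.
DECLARE @t TABLE (
    id     int          NOT NULL,
    [date] date         NOT NULL,
    value  decimal(5,2) NULL,
    PRIMARY KEY (id, [date])
);

INSERT INTO @t (id, [date], value) VALUES
(1, '20020228', NULL), (1, '20020331', NULL), (1, '20020430', 10),
(1, '20020531', 5),    (1, '20021031', 4),    (1, '20021130', NULL),
(1, '20021231', 0.7),  (1, '20030131', 9),    (1, '20030228', NULL),
(2, '20021231', 0.7),  (2, '20031130', 0.10);

SELECT id, [date], value
FROM (
    SELECT id, [date], value,
           -- Latest NULL row strictly before the latest non-NULL row
           -- (stays NULL when the id has no such row, e.g. id = 2).
           MAX(CASE WHEN value IS NULL AND [date] < last_value_date
                    THEN [date] END)
               OVER (PARTITION BY id) AS boundary_null_date
    FROM (
        SELECT id, [date], value,
               -- Date of the most recent non-NULL row of this id.
               MAX(CASE WHEN value IS NOT NULL THEN [date] END)
                   OVER (PARTITION BY id) AS last_value_date
        FROM @t
    ) AS s1
) AS s2
WHERE value IS NOT NULL                      -- drop NULL rows, including trailing ones
  AND (boundary_null_date IS NULL            -- no NULL before the latest run: keep everything
       OR [date] > boundary_null_date)       -- otherwise keep only rows after the boundary
ORDER BY id, [date];
```

On the sample data this should return the same four rows as the expected output. An id whose values are all NULL produces no rows, because last_value_date is NULL for it and every row is removed by the final filter.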

4 answers:

Answer 0 (score: 0)

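;WITH CTE AS
(
    -- Rank the NULL rows of each id from newest (Rnk = 1) to oldest.
    SELECT id
          ,[date]
          ,Value
          ,ROW_NUMBER() OVER (PARTITION BY id ORDER BY [date] DESC) AS Rnk
    FROM TableA
    WHERE Value IS NULL
)
-- Keep the non-NULL rows between the two most recent NULL rows of each id.
-- Assumes that, for an id that has NULLs, the newest row is itself NULL (as in the sample data).
-- The COALESCE fallbacks keep ids with fewer than two NULL rows (e.g. id = 2) in the result.
SELECT Res1.*
FROM TableA Res1
WHERE Res1.[date] >= COALESCE((SELECT [date] FROM CTE WHERE id = Res1.id AND Rnk = 2), '00010101')
  AND Res1.[date] <= COALESCE((SELECT [date] FROM CTE WHERE id = Res1.id AND Rnk = 1), '99991231')
  AND Res1.Value IS NOT NULL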

Answer 1 (score: 0)

I'm sure there must be a cleaner way to do this, but I can't come up with one right now. On the plus side, I think it's reasonably clear that the query follows your spec, so it should be correct:

declare @t table (id int not null, date date not null, value decimal(5,2) null)
insert into @t(id,date,value) values
(1,'20020228',NULL),    (1,'20020331',NULL),    (1,'20020430',10  ),
(1,'20020531',5   ),    (1,'20021031',4   ),    (1,'20021130',NULL),
(1,'20021231',0.7 ),    (1,'20030131',9   ),    (1,'20030228',NULL),
(2,'20021231',0.7 ),    (2,'20031130',0.10)

; With Latest as (
    select id,MAX(date) as date from @t where value is not null
    group by id
), LatestNull as (
    select t.id,MAX(t.date) as date
    from @t t
        inner join
        Latest l
            on
                t.id = l.id and
                t.date < l.date
    where
        t.value is null
    group by t.id
)
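select
    t.id, t.date, t.value
from
    @t t
        left join
    LatestNull ln
        on
            t.id = ln.id and
            t.date > ln.date
where
    -- never return NULL rows, including trailing NULLs of an id with no earlier NULL
    t.value is not null and
    (ln.id is not null or
     not exists(select * from LatestNull ln2 where ln2.id = t.id))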

Answer 2 (score: 0)

You can try a recursive CTE:

with rownum_cte as
(
  select *, row_number() over (partition by id order by date desc) rn
  from @t
),
recursive_cte as
(
  select t1.*
  from rownum_cte t1
  where value is not null and date >= all
  (
    select date
    from rownum_cte t2
    where value is not null and t1.id= t2.id
  )
  union all 
  select rownum_cte.*
  from recursive_cte
  join rownum_cte on recursive_cte.id = rownum_cte.id and rownum_cte.value is not null and recursive_cte.rn + 1 = rownum_cte.rn
)
select recursive_cte.id, recursive_cte.date, recursive_cte.value
from recursive_cte
order by recursive_cte.id, recursive_cte.date

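demo I - I used @Damien_The_Unbeliever's script to set up the data (thanks).

I also created another recursive version that uses the maximum date per id. The ALL construct is therefore replaced by a subquery that computes those maxima: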

With rownum_cte as
(
  select t.*, row_number() over (partition by t.id order by t.date desc) rn
  from @t t
  join (
    select id, max(date) as maxdate 
    from @t 
    where value is not null
    group by id  
  ) lt on t.id= lt.id and t.date <= lt.maxdate 
),
recursive_cte as
(
  select t1.*
  from rownum_cte t1
  where value is not null and t1.rn = 1
  union all 
  select rownum_cte.*
  from recursive_cte
  join rownum_cte on recursive_cte.id = rownum_cte.id and rownum_cte.value is not null and recursive_cte.rn + 1 = rownum_cte.rn
)
select recursive_cte.id, recursive_cte.date, recursive_cte.value
from recursive_cte
order by recursive_cte.id, recursive_cte.date

demo II

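Answer 3 (score: 0)

After several attempts at an algorithm, I think this can be reduced to the following. Here the CTE identifies, for each id, the latest NULL date (MaxNULLDate) and the NULL date before it (PreviousNullDate). Using the id and dates returned by the CTE, I INNER JOIN back to the temp table to fetch the rows whose dates fall within that range.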

DECLARE @temp TABLE (id int, date date, value decimal(12,2))

INSERT INTO @temp
VALUES
(1, '2002-02-28', NULL), (1, '2002-03-31', NULL), (1, '2002-04-30', 10), (1, '2002-05-31', 5),
(1, '2002-10-31', 4), (1, '2002-11-30', NULL), (1, '2002-12-31', 0.7), (1, '2003-01-31', 9),
(1, '2003-02-28', NULL), (2, '2002-12-31', 0.7), (2, '2003-11-30', 0.10);

WITH cte AS
(
    SELECT dT.*
          ,(SELECT MAX(date) FROM @temp T3 WHERE T3.id = dT.id AND T3.value IS NULL AND T3.date < dT.MaxNULLDate) [PreviousNullDate]              
      FROM (
            SELECT DISTINCT id
                  ,(SELECT MAX(date) FROM @temp T2 WHERE T2.id = T1.id AND T2.value IS NULL) [MaxNULLDate]    
              FROM @temp T1 
           ) AS dT
) 

SELECT T.*
  FROM cte INNER JOIN @temp T ON cte.id = T.id
                             AND T.value IS NOT NULL
                             AND T.date > COALESCE(cte.PreviousNullDate, '')
                             AND T.date < COALESCE(cte.MaxNULLDate, '9999-12-31')
ORDER BY cte.id, T.date

This gives the output:

id  date        value
1   2002-12-31  0.70
1   2003-01-31  9.00
2   2002-12-31  0.70
2   2003-11-30  0.10