Suppose I have this table:
+----+-------+
| id | value |
+----+-------+
| 1 | 5 |
| 2 | 4 |
| 3 | 1 |
| 4 | NULL |
| 5 | NULL |
| 6 | 14 |
| 7 | NULL |
| 8 | 0 |
| 9 | 3 |
| 10 | NULL |
+----+-------+
I want to write a query that replaces any NULL value with the last non-NULL value that precedes it in the table.
I want this result:
+----+-------+
| id | value |
+----+-------+
| 1 | 5 |
| 2 | 4 |
| 3 | 1 |
| 4 | 1 |
| 5 | 1 |
| 6 | 14 |
| 7 | 14 |
| 8 | 0 |
| 9 | 3 |
| 10 | 3 |
+----+-------+
If there is no previous value, NULL is OK. Ideally this should also work with an ORDER BY. For example, if I ORDER BY [id] DESC:
+----+-------+
| id | value |
+----+-------+
| 10 | NULL |
| 9 | 3 |
| 8 | 0 |
| 7 | 0 |
| 6 | 14 |
| 5 | 14 |
| 4 | 14 |
| 3 | 1 |
| 2 | 4 |
| 1 | 5 |
+----+-------+
Or, even better, if I ORDER BY [value] DESC:
+----+-------+
| id | value |
+----+-------+
| 6 | 14 |
| 1 | 5 |
| 2 | 4 |
| 9 | 3 |
| 3 | 1 |
| 8 | 0 |
| 4 | 0 |
| 5 | 0 |
| 7 | 0 |
| 10 | 0 |
+----+-------+
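I think this might involve some kind of analytic function, somehow partitioning on the value column, but I'm not sure where to look.
Answer 0 (score: 2)
You can use a running sum to set up the groups, then use max to fill in the NULL values within each group.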
select id,max(value) over(partition by grp) as value
from (select id,value,sum(case when value is not null then 1 else 0 end) over(order by id) as grp
from tbl
) t
Change the over() clause to order by value desc to get the second result in the question.
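To make it easier to see why the running sum works, here is a rough trace of the same idea in plain Python. It is only a sketch: the function name `fill_by_running_count` and the `ROWS` list are made up for illustration, and None stands in for NULL. The point is that every non-NULL row starts a new group, so each group contains at most one non-NULL value and max simply picks it.

```python
from itertools import accumulate

# Sample data from the question; None stands in for NULL.
ROWS = [(1, 5), (2, 4), (3, 1), (4, None), (5, None),
        (6, 14), (7, None), (8, 0), (9, 3), (10, None)]


def fill_by_running_count(rows):
    """Mimic sum(case ...) over(order by id), then max(value) over(partition by grp)."""
    ordered = sorted(rows, key=lambda r: r[0])
    # grp = running count of non-NULL values up to and including each row
    grps = list(accumulate(0 if v is None else 1 for _, v in ordered))
    # Each group holds at most one non-NULL value, so "max" just picks that value.
    group_value = {}
    for (_, v), g in zip(ordered, grps):
        if v is not None:
            group_value[g] = v
    # Rows before the first non-NULL value fall in group 0 and stay None.
    return [(i, g, group_value.get(g)) for (i, _), g in zip(ordered, grps)]


if __name__ == "__main__":
    for row in fill_by_running_count(ROWS):
        print(row)  # (id, grp, filled value)
```

For the sample data the grp column comes out as 1, 2, 3, 3, 3, 4, 4, 5, 6, 6, which is why ids 4 and 5 pick up the value from id 3.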
Answer 1 (score: 2)
Itzik Ben-Gan has covered the best approach here: The Last non NULL Puzzle
Below is the solution; run against 10 million rows, it completes within 20 seconds on my system.
SELECT
id,
value1,
CAST(
SUBSTRING(
MAX(CAST(id AS binary(4)) + CAST(value1 AS binary(4)))
OVER (ORDER BY id
ROWS UNBOUNDED PRECEDING),
5, 4)
AS int) AS lastval
FROM dbo.T1;
This solution assumes that your id column is indexed.
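The concatenation trick can look opaque, so here is a small Python sketch of what I understand it to be doing. The names `last_non_null` and `ROWS` are hypothetical, and I am assuming ids are non-negative and fit in 4 bytes. The id bytes come first so that the running MAX always lands on the most recent non-NULL row, and the value rides along in the trailing 4 bytes.

```python
import struct

# Sample data from the question; None stands in for NULL.
ROWS = [(1, 5), (2, 4), (3, 1), (4, None), (5, None),
        (6, 14), (7, None), (8, 0), (9, 3), (10, None)]


def last_non_null(rows):
    """Running MAX over (id bytes + value bytes), then decode the last 4 bytes."""
    out = []
    running_max = None  # MAX skips NULLs, so it is NULL until a non-NULL value appears
    for row_id, value in sorted(rows, key=lambda r: r[0]):
        if value is not None:
            # Big-endian bytes compare in the same order as the non-negative ids.
            key = struct.pack(">I", row_id) + struct.pack(">i", value)
            if running_max is None or key > running_max:
                running_max = key
        # Equivalent of SUBSTRING(..., 5, 4) followed by CAST(... AS int)
        lastval = None if running_max is None else struct.unpack(">i", running_max[4:8])[0]
        out.append((row_id, value, lastval))
    return out


if __name__ == "__main__":
    for row in last_non_null(ROWS):
        print(row)  # (id, value, lastval)
```

A row with a NULL value produces a NULL key, which the running MAX ignores, so the previous key (and the value embedded in it) carries forward.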
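Answer 2 (score: 0)
If the NULLs are scattered, I use a WHILE loop to fill them in,
but if the NULLs come in longer consecutive runs, there are faster ways.
So here is one way:
First find the records we want to update: those where this record is NULL and the previous record is not NULL.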
SELECT C.VALUE, N.ID
FROM TABLE C
INNER JOIN TABLE N
ON C.ID + 1 = N.ID
WHERE C.VALUE IS NOT NULL
AND N.VALUE IS NULL;
Use that to do the update (I'm a bit hazy on this syntax, but you get the idea):
UPDATE N
SET VALUE = C.Value
FROM TABLE C
INNER JOIN TABLE N
ON C.ID + 1 = N.ID
WHERE C.VALUE IS NOT NULL
AND N.VALUE IS NULL;
...now just keep doing this until no more rows are updated:
-- This is needed to set @@ROWCOUNT to non zero
SELECT 1;
WHILE @@ROWCOUNT <> 0
BEGIN
UPDATE N
SET VALUE = C.Value
FROM TABLE C
INNER JOIN TABLE N
ON C.ID + 1 = N.ID
WHERE C.VALUE IS NOT NULL
AND N.VALUE IS NULL;
END
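As a rough illustration of how the loop converges, here is the same idea simulated in Python. This is a sketch only: `fill_by_repeated_passes` is a made-up name, and like the join on C.ID + 1 = N.ID it assumes the ids are consecutive. Each pass pushes values one row further into a run of NULLs, so the number of passes that change anything equals the length of the longest NULL run.

```python
# Sample data from the question; None stands in for NULL.
ROWS = [(1, 5), (2, 4), (3, 1), (4, None), (5, None),
        (6, 14), (7, None), (8, 0), (9, 3), (10, None)]


def fill_by_repeated_passes(rows):
    """Repeat the 'copy from id - 1' update until a pass changes no rows."""
    values = dict(rows)
    passes = 0
    while True:
        # One UPDATE statement: all matches are computed from the state
        # before the update, as in a set-based UPDATE.
        updates = {
            i: values[i - 1]
            for i, v in values.items()
            if v is None and values.get(i - 1) is not None
        }
        if not updates:  # the equivalent of @@ROWCOUNT = 0
            break
        values.update(updates)
        passes += 1
    return sorted(values.items()), passes


if __name__ == "__main__":
    filled, passes = fill_by_repeated_passes(ROWS)
    print(filled)
    print("passes that changed rows:", passes)
```

On the sample data the first pass fills ids 4, 7 and 10, and the second fills id 5, which is why long runs of NULLs make this approach slow.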
Another way is to use a similar query to get the ranges of ids to update. If your NULLs usually fall on consecutive ids, that will work much faster.
Answer 3 (score: 0)
Here is a solution using OUTER APPLY:
CREATE TABLE #table(id INT, value INT)
INSERT INTO #table VALUES
(1,5),
(2,4),
(3,1),
(4,NULL),
(5,NULL),
(6,14),
(7,NULL),
(8,0),
(9,3),
(10,NULL)
SELECT t.id, ISNULL(t.value, t3.value) value
FROM #table t
OUTER APPLY(SELECT id FROM #table WHERE id = t.id AND VALUE IS NULL) t2
OUTER APPLY(SELECT TOP 1 value
FROM #table WHERE id <= t2.id AND VALUE IS NOT NULL ORDER BY id DESC) t3
Output:
id VALUE
---------
1 5
2 4
3 1
4 1
5 1
6 14
7 14
8 0
9 3
10 3
Answer 4 (score: 0)
Using this sample data:
if object_id('tempdb..#t1') is not null drop table #t1;
create table #t1 (id int primary key, [value] int null);
insert #t1 values(1,5),(2,4),(3,1),(4,NULL),(5,NULL),(6,14),(7,NULL),(8,0),(9,3),(10,NULL);
I came up with:
with x(id, [value], grouper) as (
select *, row_number() over (order by id)-sum(iif([value] is null,1,0)) over (order by id)
from #t1)
select id, min([value]) over (partition by grouper)
from x;
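However, I noticed that Vamsi Prabhala beat me to it... my solution is the same as what he posted (arghhhh!). So I thought I'd try a recursive solution. Here is a pretty efficient use of a recursive CTE (provided that id is indexed):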
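-- Note: the join below assumes ids are contiguous and start at 1;
-- the default MAXRECURSION limit of 100 applies unless it is overridden.
with sorted as (select *, seqid = row_number() over (order by id) from #t1),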
firstRecord as (select top(1) * from #t1 order by id),
prev as
(
select t.id, t.[value], lastid = 1, lastvalue = null
from sorted t
where t.id = 1
union all
select t2.id, t2.[value], lastid+1, isnull(prev.[value],lastvalue)
from sorted t2
join prev on t2.id = prev.lastid+1
)
select id, [value]=isnull([value],lastvalue)--, *
from prev;
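Normally I don't like recursive CTEs (rCTE for short), but in this case one offers an elegant solution and is faster than using window aggregate functions (sum, min, ...). Look at the execution plans, with the rCTE at the bottom: the rCTE gets the job done with two index seeks, one of which is for just one row. Unlike the window aggregate solution, the rCTE does not require a sort. Run this with statistics io on; the rCTE produces much less IO.
All that said, I wouldn't use either of these solutions; what TheGameiswar posted will perform best. His solution, on a properly indexed id column, will be lightning fast.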
Answer 5 (score: 0)
Don't worry... here is the answer for you :)
SELECT *
INTO #TempIsNOtNull
FROM YourTable
WHERE value IS NOT NULL
SELECT *
INTO #TempIsNull
FROM YourTable
WHERE value IS NULL
UPDATE YourTable
SET YourTable.value = UpdateDtls.value
FROM YourTable
JOIN (
SELECT OuterTab1.id,
#TempIsNOtNull.value
FROM #TempIsNull OuterTab1
CROSS JOIN #TempIsNOtNull
WHERE OuterTab1.id - #TempIsNOtNull.id > 0
AND (OuterTab1.id - #TempIsNOtNull.id) = ( SELECT TOP 1
OuterTab1.id - #TempIsNOtNull.id
FROM #TempIsNull InnerTab
CROSS JOIN #TempIsNOtNull
WHERE OuterTab1.id - #TempIsNOtNull.id > 0
AND OuterTab1.id = InnerTab.id
ORDER BY (OuterTab1.id - #TempIsNOtNull.id) ASC) ) AS UpdateDtls
ON (YourTable.id = UpdateDtls.id)
Answer 6 (score: 0)
You could also try using a correlated subquery:
select id,
case when value is not null then value else
(select top 1 value from table
where id < t.id and value is not null order by id desc) end value
from table t
Result:
id value
1 5
2 4
3 1
4 1
5 1
6 14
7 14
8 0
9 3
10 3
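If you want to try the correlated-subquery idea without a SQL Server instance, a minimal sketch using Python's built-in sqlite3 module might look like this. Note the assumptions: this is SQLite rather than T-SQL, so LIMIT 1 replaces top 1, and the table name `tbl` and the function `fill_with_subquery` are placeholders of my own.

```python
import sqlite3

# Sample data from the question; None is stored as NULL.
ROWS = [(1, 5), (2, 4), (3, 1), (4, None), (5, None),
        (6, 14), (7, None), (8, 0), (9, 3), (10, None)]

QUERY = """
SELECT t.id,
       CASE WHEN t.value IS NOT NULL THEN t.value
            ELSE (SELECT p.value              -- nearest earlier non-NULL row
                  FROM tbl AS p
                  WHERE p.id < t.id AND p.value IS NOT NULL
                  ORDER BY p.id DESC
                  LIMIT 1)                    -- SQLite spelling of TOP 1
       END AS value
FROM tbl AS t
ORDER BY t.id
"""


def fill_with_subquery(rows):
    """Load the rows into an in-memory SQLite table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, value INTEGER)")
        conn.executemany("INSERT INTO tbl VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in fill_with_subquery(ROWS):
        print(row)
```

The subquery runs once per NULL row, so an index on id (here the primary key) matters for anything larger than a toy table.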
Answer 7 (score: 0)
You can use an UPDATE statement; please test it before using it:
update #table
set value = newvalue
from (
select
s.id, s.value,
(select top 1 t.value from #table t where t.id <= s.id and t.value is not null order by t.id desc) as newvalue
from #table S
) u
where #table.id = u.id and #table.value is null