Given a set of rows in which one field is sometimes null and sometimes not:
SELECT
Date, TheThing
FROM MyData
ORDER BY Date
Date TheThing
----------------------- --------
2016-03-09 08:17:29.867 a
2016-03-09 08:18:33.327 a
2016-03-09 14:32:01.240 NULL
2016-10-21 19:53:49.983 NULL
2016-11-12 03:25:21.753 b
2016-11-24 07:43:24.483 NULL
2016-11-28 16:06:23.090 b
2016-11-28 16:09:07.200 c
2016-12-10 11:21:55.807 c
I want a ranking column that counts only the non-null values:
Date TheThing DesiredTotal
----------------------- -------- ------------
2016-03-09 08:17:29.867 a 1
2016-03-09 08:18:33.327 a 2
2016-03-09 14:32:01.240 NULL 2 <---notice it's still 2 (good)
2016-10-21 19:53:49.983 NULL 2 <---notice it's still 2 (good)
2016-11-12 03:25:21.753 b 3
2016-11-24 07:43:24.483 NULL 3 <---notice it's still 3 (good)
2016-11-28 16:06:23.090 b 4
2016-11-28 16:09:07.200 c 5
2016-12-10 11:21:55.807 c 6
I tried the obvious approach:
SELECT
Date, TheThing,
RANK() OVER(ORDER BY Date) AS Total
FROM MyData
ORDER BY Date
But RANK() counts the null rows as well:
Date TheThing Total
----------------------- -------- -----
2016-03-09 08:17:29.867 a 1
2016-03-09 08:18:33.327 a 2
2016-03-09 14:32:01.240 NULL 3 <--- notice it is 3 (bad)
2016-10-21 19:53:49.983 NULL 4 <--- notice it is 4 (bad)
2016-11-12 03:25:21.753 b 5 <--- and all the rest are wrong (bad)
2016-11-24 07:43:24.483 NULL 6
2016-11-28 16:06:23.090 b 7
2016-11-28 16:09:07.200 c 8
2016-12-10 11:21:55.807 c 9
How do I instruct RANK() (or DENSE_RANK()) not to count nulls?
Have I tried partitioning? Why yes! It is even worse:
SELECT
Date, TheThing,
RANK() OVER(PARTITION BY(CASE WHEN TheThing IS NOT NULL THEN 1 ELSE 0 END) ORDER BY Date) AS Total
FROM MyData
ORDER BY Date
This produces:
Date TheThing Total
----------------------- -------- -----
2016-03-09 08:17:29.867 a 1
2016-03-09 08:18:33.327 a 2
2016-03-09 14:32:01.240 NULL 1 <--- reset to 1?
2016-10-21 19:53:49.983 NULL 2 <--- why go up?
2016-11-12 03:25:21.753 b 3
2016-11-24 07:43:24.483 NULL 3 <--- didn't reset?
2016-11-28 16:06:23.090 b 4
2016-11-28 16:09:07.200 c 5
2016-12-10 11:21:55.807 c 6
Now I am just typing things at random, flailing wildly.
SELECT
Date, TheThing,
RANK() OVER(PARTITION BY(CASE WHEN TheThing IS NOT NULL THEN 1 ELSE NULL END) ORDER BY Date) AS Total
FROM MyData
ORDER BY Date
SELECT
Date, TheThing,
DENSE_RANK() OVER(PARTITION BY(CASE WHEN TheThing IS NOT NULL THEN 1 ELSE NULL END) ORDER BY Date) AS Total
FROM MyData
ORDER BY Date
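Edit: Even with all the answers, it took many iterations to find all the edge cases I did not want. In the end, what I conceptually wanted was OVER() applied to a count. I did not know that OVER works with anything other than RANK (and DENSE_RANK).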
http://sqlfiddle.com/#!18/c6d87/1
Answer 0: (score: 4)
I think you are looking for a cumulative count:
SELECT Date, TheThing,
COUNT(theThing) OVER (ORDER BY Date) AS Total
FROM MyData
ORDER BY Date;
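The query above works because COUNT(column) skips NULLs while COUNT(*) counts every row, and adding ORDER BY inside OVER turns the aggregate into a running total. A minimal sketch of that difference follows; the table variable @MyData and its column types are assumptions made only so the example is self-contained.

```sql
-- Hypothetical stand-in for MyData, loaded with the question's sample rows.
DECLARE @MyData TABLE ([Date] datetime, TheThing varchar(10));

INSERT INTO @MyData ([Date], TheThing) VALUES
    ('2016-03-09 08:17:29.867', 'a'),
    ('2016-03-09 08:18:33.327', 'a'),
    ('2016-03-09 14:32:01.240', NULL),
    ('2016-10-21 19:53:49.983', NULL),
    ('2016-11-12 03:25:21.753', 'b'),
    ('2016-11-24 07:43:24.483', NULL),
    ('2016-11-28 16:06:23.090', 'b'),
    ('2016-11-28 16:09:07.200', 'c'),
    ('2016-12-10 11:21:55.807', 'c');

SELECT [Date], TheThing,
       -- COUNT(column) ignores NULLs, so the total stalls on NULL rows.
       COUNT(TheThing) OVER (ORDER BY [Date]) AS Total,
       -- COUNT(*) counts every row, which reproduces the unwanted RANK() numbering.
       COUNT(*)        OVER (ORDER BY [Date]) AS AllRows
FROM @MyData
ORDER BY [Date];
```

With distinct dates, Total should match the DesiredTotal column from the question (1, 2, 2, 2, 3, 3, 4, 5, 6), while AllRows runs from 1 to 9.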
Answer 1: (score: 3)
Try this:
declare @tbl table (dt datetime, col int);
insert into @tbl values
('2016-03-09 08:17:29.867', 1),
('2016-03-09 08:18:33.327', 1),
('2016-03-09 14:32:01.240', NULL),
('2016-10-21 19:53:49.983', NULL),
('2016-11-12 03:25:21.753', 1),
('2016-11-24 07:43:24.483', NULL),
('2016-11-28 16:06:23.090', 1),
('2016-11-28 16:09:07.200', 1),
('2016-12-10 11:21:55.807', 1);
select dt,
col,
sum(case when col is null then 0 else 1 end) over (order by dt) rnk
from @tbl
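The idea is really simple: if you assign 1 to non-null values and 0 to rows where the column is null, the cumulative sum ordered by date behaves exactly like a rank that skips nulls.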
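One detail worth checking is what happens when two rows share the same timestamp. When OVER has an ORDER BY but no explicit frame, SQL Server uses a RANGE frame, so tied rows are summed together; a ROWS frame counts strictly row by row. The sketch below shows the 1/0 flag next to both variants; the table variable @ties and its rows are made up for illustration.

```sql
-- Hypothetical data with two rows sharing one timestamp.
DECLARE @ties TABLE (dt datetime, col int);

INSERT INTO @ties (dt, col) VALUES
    ('2016-03-09 08:17:29.867', 1),
    ('2016-03-09 08:17:29.867', 1),     -- tie with the previous row
    ('2016-03-09 14:32:01.240', NULL),
    ('2016-11-12 03:25:21.753', 1);

SELECT dt, col,
       -- The flag being summed: 1 for non-null, 0 for null.
       CASE WHEN col IS NULL THEN 0 ELSE 1 END AS flag,
       -- Default frame (RANGE): tied rows receive the same running total.
       SUM(CASE WHEN col IS NULL THEN 0 ELSE 1 END)
           OVER (ORDER BY dt) AS rnk_range,
       -- Explicit ROWS frame: the total advances one row at a time.
       SUM(CASE WHEN col IS NULL THEN 0 ELSE 1 END)
           OVER (ORDER BY dt ROWS UNBOUNDED PRECEDING) AS rnk_rows
FROM @ties
ORDER BY dt;
```

Here rnk_range comes out as 2, 2, 2, 3 and rnk_rows as 1, 2, 2, 3. Which one is right depends on how tied dates should be treated.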
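Another approach is to combine RANK with ROW_NUMBER. It respects ties in the Date column and behaves exactly like RANK, except that NULL rows do not receive a new rank: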
select dt,
col,
case when col is not null then
rank() over (order by dt)
else
rank() over (order by dt) - row_number() over (partition by rnDiff order by dt)
end rnk
from (
select dt,
col,
row_number() over (order by dt) -
row_number() over (partition by coalesce(col, 0) order by dt) rnDiff
from @tbl
) a
order by dt
Answer 2: (score: 1)
My lizard brain took me here... sum() vs rank()
Select *
,NewCol = sum(sign(TheThing)) over (Order by Date)
,OrEven = sum(TheThing/TheThing) over (Order by Date)
From MyData
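Note that both expressions assume TheThing is numeric, and SIGN() only contributes 1 for positive values. A small sketch of that caveat, using a made-up table variable @nums with a zero and a negative value:

```sql
-- Hypothetical numeric data: a positive value, a NULL, a zero, and a negative value.
DECLARE @nums TABLE (dt datetime, TheThing int);

INSERT INTO @nums (dt, TheThing) VALUES
    ('2016-03-09 08:17:29.867', 5),
    ('2016-03-09 14:32:01.240', NULL),
    ('2016-11-12 03:25:21.753', 0),
    ('2016-11-28 16:06:23.090', -3);

SELECT dt, TheThing,
       -- SIGN() yields 1, NULL, 0, -1 here, so zero and negative values distort the total.
       SUM(SIGN(TheThing)) OVER (ORDER BY dt) AS BySign,
       -- COUNT(column) only cares whether the value is NULL.
       COUNT(TheThing)     OVER (ORDER BY dt) AS ByCount
FROM @nums
ORDER BY dt;
-- TheThing / TheThing is left out on purpose: with a zero in the data it would
-- typically raise a divide-by-zero error.
```

BySign comes out as 1, 1, 1, 0 while ByCount gives 1, 1, 2, 3, so the SIGN() trick is only safe when every non-null value is known to be positive.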
Answer 3: (score: 1)
How about subtracting the running count of NULLs from rank()?
SELECT date,
thething,
rank() OVER (ORDER BY date)
-
sum(CASE
WHEN thething IS NULL THEN
1
ELSE
0
END) OVER (ORDER BY date) desiredtotal
FROM mydata;
This should also preserve the duplicates and gaps that rank() produces, and it does not need a subquery.
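The difference from a plain running count only shows up when dates tie. As a rough illustration, the sketch below runs both expressions over a made-up table variable @mydata in which the first two rows share a timestamp.

```sql
-- Hypothetical data: the first two rows have the same date.
DECLARE @mydata TABLE ([date] datetime, thething varchar(10));

INSERT INTO @mydata ([date], thething) VALUES
    ('2016-03-09 08:17:29.867', 'a'),
    ('2016-03-09 08:17:29.867', 'a'),   -- tie
    ('2016-03-09 14:32:01.240', NULL),
    ('2016-11-12 03:25:21.753', 'b');

SELECT [date], thething,
       -- rank() minus the running number of NULL rows: ties share a rank, gaps remain.
       rank() OVER (ORDER BY [date])
       - sum(CASE WHEN thething IS NULL THEN 1 ELSE 0 END)
             OVER (ORDER BY [date]) AS rank_minus_nulls,
       -- Running count of non-null values, for comparison.
       count(thething) OVER (ORDER BY [date]) AS running_count
FROM @mydata
ORDER BY [date];
```

rank_minus_nulls yields 1, 1, 2, 3 (the tied rows share rank 1 and 'b' lands on 3), whereas running_count yields 2, 2, 2, 3.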
Answer 4: (score: 0)
I would use a subquery:
SELECT [Date], TheThing,
(SELECT COUNT(*)
FROM MyData m
WHERE m.[Date] <= m1.[Date] AND m.TheThing IS NOT NULL
) AS DesiredTotal
FROM MyData m1;
In a similar way, you can also try apply:
SELECT *
FROM MyData m1 CROSS APPLY
(SELECT COUNT(*) AS DesiredTotal
FROM MyData m
WHERE m.[Date] <= m1.[Date] AND m.TheThing IS NOT NULL
) m2;
Answer 5: (score: 0)
I used a CTE to first get the correct dates, and then applied the ranking to the modified dates:
CREATE TABLE #tmp(dt datetime, TheThing int)
INSERT INTO #tmp VALUES('2016-03-09 08:17:29.867', 1)
INSERT INTO #tmp VALUES('2016-03-09 08:18:33.327', 1)
INSERT INTO #tmp VALUES('2016-03-09 14:32:01.240', NULL)
INSERT INTO #tmp VALUES('2016-10-21 19:53:49.983', NULL)
INSERT INTO #tmp VALUES('2016-11-12 03:25:21.753', 1)
INSERT INTO #tmp VALUES('2016-11-24 07:43:24.483', NULL)
INSERT INTO #tmp VALUES('2016-11-28 16:06:23.090', 1)
INSERT INTO #tmp VALUES('2016-11-28 16:09:07.200', 1)
INSERT INTO #tmp VALUES('2016-12-10 11:21:55.807', 1)
;WITH CTE as (
SELECT
CASE WHEN TheThing IS NULL THEN (SELECT MAX(dt) from #tmp OrigTbl where OrigTbl.dt < SubTbl.dt and OrigTbl.TheThing IS NOT NULL) ELSE dt end dtMod,
SubTbl.dt,SubTbl.TheThing
from #tmp SubTbl)
SELECT dt, TheThing, DENSE_RANK() over(ORDER BY dtMod) from CTE
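One edge case to watch with this approach is a NULL row that comes before any non-null row: its dtMod is NULL, and SQL Server sorts NULLs first, so it still receives rank 1. The sketch below compares the CTE with a running COUNT on such data; the table variable @tmp and its leading NULL row are assumptions added for illustration.

```sql
-- Hypothetical data that starts with a NULL row.
DECLARE @tmp TABLE (dt datetime, TheThing int);

INSERT INTO @tmp (dt, TheThing) VALUES
    ('2016-03-01 00:00:00.000', NULL),  -- leading NULL, no earlier non-null row
    ('2016-03-09 08:17:29.867', 1),
    ('2016-03-09 14:32:01.240', NULL),
    ('2016-11-12 03:25:21.753', 1);

WITH CTE AS (
    SELECT CASE WHEN s.TheThing IS NULL
                -- Borrow the date of the latest earlier non-null row (NULL if none exists).
                THEN (SELECT MAX(o.dt)
                      FROM @tmp o
                      WHERE o.dt < s.dt AND o.TheThing IS NOT NULL)
                ELSE s.dt
           END AS dtMod,
           s.dt, s.TheThing
    FROM @tmp s
)
SELECT dt, TheThing,
       DENSE_RANK() OVER (ORDER BY dtMod) AS ByDenseRank,
       COUNT(TheThing) OVER (ORDER BY dt) AS ByCount
FROM CTE
ORDER BY dt;
```

ByDenseRank gives 1, 2, 2, 3 while ByCount gives 0, 1, 1, 2, so the two only agree when the data starts with a non-null row.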