Suppose I want to compute the overall (grand) median of a table with a continuous column X. I could use the following snippet:
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY X)
OVER (PARTITION BY
?
)
AS grand_median
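However, the OVER (PARTITION BY ...) part appears to be mandatory. Why is it required when all I want is the median of the whole column, and what should I put there? Thanks!
PS: Just adding some artificial data, along with a query inspired by @PawełDyl's answer: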
IF OBJECT_ID('tempdb..#Data') IS NOT NULL
    DROP TABLE #Data;
CREATE TABLE #Data
(
    Number FLOAT
);
INSERT INTO #Data (Number) VALUES (30);
INSERT INTO #Data (Number) VALUES (20);
INSERT INTO #Data (Number) VALUES (42);
INSERT INTO #Data (Number) VALUES (42);
INSERT INTO #Data (Number) VALUES (42);
INSERT INTO #Data (Number) VALUES (43);
INSERT INTO #Data (Number) VALUES (40);
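SELECT * FROM #Data;
-- OVER () with no PARTITION BY treats the whole table as a single window.
-- The window function returns the same value on every row, so DISTINCT collapses it to one row.
SELECT DISTINCT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY Number) OVER () AS grand_median FROM #Data;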
Some R code to "test" this:
test <- c(30, 20, 42, 42, 42, 42, 40)
median(test)
The correct answer is, of course, 42.
Answer 0 (score: 1)
OVER is required; PARTITION BY is not. See MSDN and the following demo:
DECLARE @table TABLE
(
X int
)
INSERT @table VALUES (1),(2),(3),(4),(5),(10),(12)
SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY X) OVER() FROM @table
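To make it easier to check what the queries above should return, here is a minimal Python sketch of the linear-interpolation rule that PERCENTILE_CONT is documented to follow. It is only an illustration, not SQL Server's implementation: the function name percentile_cont and the variable names are made up for this example, and treating the whole list as one group is meant to mirror an empty OVER () clause.

```python
import math


def percentile_cont(values, fraction):
    """Continuous percentile by linear interpolation (hypothetical helper)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one value")
    # Zero-based fractional row position of the requested percentile.
    position = fraction * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    # Interpolate between the two neighbouring rows; when the position
    # is a whole number both rows are the same and the weight is zero.
    weight = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


# Data from the #Data table in the question and the @table demo in the answer.
question_data = [30, 20, 42, 42, 42, 43, 40]
answer_data = [1, 2, 3, 4, 5, 10, 12]

question_median = percentile_cont(question_data, 0.5)
answer_median = percentile_cont(answer_data, 0.5)
print(question_median, answer_median)
```

Both lists have an odd number of rows, so the median is simply the middle row of the sorted data: 42 for the question's table and 4 for the demo table. With an even number of rows the function returns the average of the two middle values.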