My table contains the following data:
SELECT category, value FROM test
| category | value |
+----------+-------+
| 1        | 1     |
| 1        | 3     |
| 1        | 4     |
| 1        | 8     |
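To reproduce the queries in this thread, the sample data can be loaded with something like the script below. It is only a minimal sketch: the table definition is not given here, so the `int` column types are an assumption.

```sql
-- Assumed schema: the column types are not stated in the question.
CREATE TABLE test (
    category int NOT NULL,
    value    int NOT NULL
);

-- The four sample rows shown above.
INSERT INTO test (category, value)
VALUES (1, 1), (1, 3), (1, 4), (1, 8);
```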
Right now I am using two separate queries.
To get the average:
SELECT category, avg(value) as Average
FROM test
GROUP BY category
| category | Average |
+----------+---------+
| 1        | 4       |
To get the median:
SELECT DISTINCT category,
PERCENTILE_CONT(0.5)
WITHIN GROUP (ORDER BY value)
OVER (partition BY category) AS Median
FROM test
| category | Median |
+----------+--------+
| 1        | 3.5    |
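Is there a way to combine them into a single query?
Note: I know I could also get the median with two subqueries, but I would prefer to use the PERCENTILE_CONT function for it.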
Answer 0 (score: 8)
AVG can also be used as a window function:
select
distinct
category,
avg(value) over (partition by category) as average,
PERCENTILE_CONT(0.5)
WITHIN GROUP (ORDER BY value)
OVER (partition BY category) AS Median
from test
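The DISTINCT is needed because a window function returns one value per input row, so without it every row of a category would repeat the same average and median. A possible alternative, sketched here and not part of the original answer, computes the median in a CTE and then collapses the rows with GROUP BY. The CTE name `with_median` is made up for illustration.

```sql
WITH with_median AS (
    SELECT category,
           value,
           PERCENTILE_CONT(0.5)
               WITHIN GROUP (ORDER BY value)
               OVER (PARTITION BY category) AS Median
    FROM test
)
SELECT category,
       AVG(value) AS Average,
       -- Median is constant within a category, so MAX simply picks that value.
       MAX(Median) AS Median
FROM with_median
GROUP BY category;
```

Both forms should return one row per category; which one the optimizer handles better would need to be checked against the actual data.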
Answer 1 (score: 0)
I would approach this slightly differently:
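select category, avg(value) as avg,
       -- averages the middle row (odd cnt) or the two middle rows (even cnt);
       -- 1.0 * value avoids integer truncation if value is an integer column
       avg(case when 2 * seqnum in (cnt, cnt + 1, cnt + 2) then 1.0 * value end) as median
from (select t.*, row_number() over (partition by category order by value) as seqnum,
             count(*) over (partition by category) as cnt
      from test t
     ) t
group by category;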
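To see why the `in (cnt, cnt + 1, cnt + 2)` test picks out the middle rows, it can help to run the inner query on its own with the condition exposed as a flag. This is only an illustrative sketch; the column name `is_middle` is not part of the answer.

```sql
-- Show, per row, whether the CASE expression would keep it for the median.
select t.category, t.value, t.seqnum, t.cnt,
       case when 2 * t.seqnum in (t.cnt, t.cnt + 1, t.cnt + 2)
            then 1 else 0 end as is_middle   -- illustrative name
from (select t.*, row_number() over (partition by category order by value) as seqnum,
             count(*) over (partition by category) as cnt
      from test t
     ) t
order by t.category, t.seqnum;

-- For the question's data (values 1, 3, 4, 8, so cnt = 4):
--   seqnum 1: 2 * 1 = 2, not in (4, 5, 6) -> 0
--   seqnum 2: 2 * 2 = 4, in (4, 5, 6)     -> 1 (value 3)
--   seqnum 3: 2 * 3 = 6, in (4, 5, 6)     -> 1 (value 4)
--   seqnum 4: 2 * 4 = 8, not in (4, 5, 6) -> 0
```

With an odd cnt, only `cnt + 1` can equal the even number `2 * seqnum`, so a single middle row is kept. With an even cnt, `cnt` and `cnt + 2` match the two middle rows. Note that in SQL Server, AVG over an integer column returns an integer, so the values need to be decimal for the mean of 3 and 4 to come out as 3.5.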
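Answer 2 (score: 0)
I wanted a more thorough answer to this question and, after some digging, found one in Dwain Camps's exhaustive analysis of several approaches. I am posting it in case anyone else finds it useful: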
Calculating the Median Value within a Partitioned Set Using T-SQL
I went with "his" fourth solution (he is combining and testing other people's approaches), which is easy to understand and performs well:
WITH Counts AS
(
    SELECT ID, c=COUNT(*)
    FROM #MedianValues
    GROUP BY ID
)
SELECT a.ID, Median=AVG(0.+N)
FROM Counts a
CROSS APPLY
(
    SELECT TOP(((a.c - 1) / 2) + (1 + (1 - a.c % 2)))
        N, r=ROW_NUMBER() OVER (ORDER BY N)
    FROM #MedianValues b
    WHERE a.ID = b.ID
    ORDER BY N
) p
WHERE r BETWEEN ((a.c - 1) / 2) + 1 AND (((a.c - 1) / 2) + (1 + (1 - a.c % 2)))
GROUP BY a.ID;
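The query above reads from a temp table `#MedianValues` with columns `ID` and `N`. To try it against the data from the question, one option is to copy the `test` table into that shape first. This is a sketch under that assumption, not something taken from the linked article.

```sql
-- Assumption: reuse the question's table as the source.
-- ID and N are the column names the query above expects.
SELECT category AS ID, value AS N
INTO #MedianValues
FROM test;

-- Arithmetic for ID 1, which has c = 4 rows (integer division throughout):
--   TOP argument:        ((4 - 1) / 2) + (1 + (1 - 4 % 2)) = 1 + 2 = 3
--   BETWEEN lower bound: ((4 - 1) / 2) + 1                 = 2
--   BETWEEN upper bound: same as the TOP argument          = 3
-- So rows 2 and 3 (N = 3 and N = 4) are averaged. The 0.+N makes the
-- operand decimal, giving a median of 3.5.
```

For an odd count such as c = 5, the `(1 - a.c % 2)` term is 0, so the lower and upper bounds are both 3 and only the single middle row is used.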