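I have the table below. I want to derive a column Flag so that the top 90% of rows in each partition get TypeA and the remaining 10% of rows get TypeB as the flag.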
+------+----+
| City | id |
+------+----+
| A | 1A |
| A | 2A |
| A | 3A |
| A | 4A |
| A | 5A |
| B | 1B |
| B | 2B |
| B | 3B |
| B | 4B |
| B | 5B |
| B | 6B |
| D | 1D |
| D | 2D |
| D | 3D |
| D | 4D |
| D | 5D |
| D | 6D |
| D | 7D |
| D | 8D |
+------+----+
Desired result
+------+----+-------+
| City | id | Flag |
+------+----+-------+
| A | 1A | TypeA |
| A | 2A | TypeA |
| A | 3A | TypeA |
| A | 4A | TypeA | // Approximately Top 90% of rows for City A: Flag Type A
| A | 5A | TypeB | // Approximately below 10% of rows for City A: Flag Type B
| B | 1B | TypeA |
| B | 2B | TypeA |
| B | 3B | TypeA |
| B | 4B | TypeA | // Approximately Top 90% of rows for City B: Flag Type A
| B | 5B | TypeB | // Approximately below 10% of rows for City B: Flag Type B
| B | 6B | TypeB |
| D | 1D | TypeA |
| D | 2D | TypeA |
| D | 3D | TypeA |
| D | 4D | TypeA |
| D | 5D | TypeA |
| D | 6D | TypeA |
| D | 7D | TypeA |
| D | 8D | TypeB |
+------+----+-------+
Any help would be greatly appreciated.
Answer 0 (score: 3)
One method is to do an explicit count:
select t.*,
(case when row_number() over (partition by city order by id) <=
0.9 * count(*) over (partition by city)
then 'TypeA'
else 'TypeB'
end) as flag
from t
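As a quick sanity check of the explicit-count approach, here is a minimal sketch that runs the same comparison against the sample data from the question. It uses Python's sqlite3 module with an in-memory database purely for convenience; that is an assumption on my part, since the thread targets SQL Server, and it needs SQLite 3.25 or later for window functions. The table name t follows the answer.

```python
import sqlite3

# Sample data from the question: 5 rows for A, 6 for B, 8 for D.
rows = [(city, f"{i}{city}")
        for city, n in (("A", 5), ("B", 6), ("D", 8))
        for i in range(1, n + 1)]

conn = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server here
conn.execute("CREATE TABLE t (City TEXT, id TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

flags = conn.execute("""
    SELECT City, id,
           CASE WHEN ROW_NUMBER() OVER (PARTITION BY City ORDER BY id)
                     <= 0.9 * COUNT(*) OVER (PARTITION BY City)
                THEN 'TypeA' ELSE 'TypeB' END AS Flag
    FROM t
    ORDER BY City, id
""").fetchall()

# Count how many rows per city were flagged TypeA.
type_a_counts = {}
for city, _id, flag in flags:
    if flag == "TypeA":
        type_a_counts[city] = type_a_counts.get(city, 0) + 1

print(type_a_counts)  # {'A': 4, 'B': 5, 'D': 7}
```

The thresholds work out to 0.9 * 5 = 4.5, 0.9 * 6 = 5.4 and 0.9 * 8 = 7.2, so City B ends up with five TypeA rows, one more than the four shown in the desired result (which the question itself marks as approximate).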
Answer 1 (score: 3)
Here is an option that uses COUNT as an analytic function:
SELECT
City,
id,
-- 1.0 * forces decimal division; integer / integer would truncate to 0 or 1
CASE WHEN 1.0 * COUNT(*) OVER (PARTITION BY City ORDER BY id) /
COUNT(*) OVER (PARTITION BY City) <= 0.9
THEN 'TypeA'
ELSE 'TypeB' END AS Flag
FROM yourTable
ORDER BY
City,
Id;
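The first call to COUNT counts the rows in each city partition up to and including the current row, in id order. We then normalize that by the total number of records for each city and compare the result with 0.9 to decide which flag to assign.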
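To make the normalization concrete, the following sketch prints the running-count ratio for City A only. It again assumes an in-memory SQLite database (3.25 or later) via Python's sqlite3 as a stand-in for SQL Server. The extra int_ratio column is my own addition for illustration: it shows what the division yields when both operands are integers, which is why multiplying by 1.0 matters in engines that truncate integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server here
conn.execute("CREATE TABLE yourTable (City TEXT, id TEXT)")
conn.executemany("INSERT INTO yourTable VALUES (?, ?)",
                 [("A", f"{i}A") for i in range(1, 6)])

result = conn.execute("""
    SELECT id,
           1.0 * COUNT(*) OVER (PARTITION BY City ORDER BY id)
               / COUNT(*) OVER (PARTITION BY City) AS ratio,
           COUNT(*) OVER (PARTITION BY City ORDER BY id)
               / COUNT(*) OVER (PARTITION BY City) AS int_ratio
    FROM yourTable
    ORDER BY id
""").fetchall()

# Apply the same 0.9 cutoff to the decimal ratio.
flags = []
for row_id, ratio, int_ratio in result:
    flag = "TypeA" if ratio <= 0.9 else "TypeB"
    flags.append(flag)
    print(row_id, round(ratio, 2), int_ratio, flag)
```

The decimal ratio climbs 0.2, 0.4, 0.6, 0.8, 1.0, so only the last row exceeds the cutoff. The integer version is 0 for every row except the last, which happens to give the same flags on five rows but would flag too many rows as TypeA on larger partitions.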
Answer 2 (score: 2)
SQL Server has the percent_rank() window function, which computes the needed number directly so you don't have to do it yourself:
SELECT City, id
, CASE
WHEN percent_rank() OVER (PARTITION BY City ORDER BY id) <= 0.9 THEN 'TypeA'
ELSE 'TypeB'
END AS Flag
FROM table1
ORDER BY City, id;
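One thing worth knowing: percent_rank() is defined as (rank - 1) / (rows in partition - 1), so it is not quite the same number as "row position divided by row count"; the first row of a partition always gets 0. Here is a rough sketch of what it returns for the sample data, again assuming an in-memory SQLite database (3.25 or later) through Python's sqlite3 rather than SQL Server itself:

```python
import sqlite3

# Sample data from the question: 5 rows for A, 6 for B, 8 for D.
rows = [(city, f"{i}{city}")
        for city, n in (("A", 5), ("B", 6), ("D", 8))
        for i in range(1, n + 1)]

conn = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server here
conn.execute("CREATE TABLE table1 (City TEXT, id TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", rows)

result = conn.execute("""
    SELECT City, id,
           PERCENT_RANK() OVER (PARTITION BY City ORDER BY id) AS pr
    FROM table1
    ORDER BY City, id
""").fetchall()

# percent_rank values for City A, and TypeA counts per city at the 0.9 cutoff.
pr_city_a = [pr for city, _id, pr in result if city == "A"]
type_a_counts = {}
for city, _id, pr in result:
    if pr <= 0.9:
        type_a_counts[city] = type_a_counts.get(city, 0) + 1

print(pr_city_a)      # [0.0, 0.25, 0.5, 0.75, 1.0]
print(type_a_counts)  # {'A': 4, 'B': 5, 'D': 7}
```

On this sample the flags match the explicit-count approach, but because of the different denominator the two can disagree on other partition sizes, so it is worth checking against your own data.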