I have the following dataset:
Ranking Segment Month
1 1 201501
2 1 201501
3 1 201501
4 1 201501
5 1 201501
6 1 201501
7 1 201501
… 1 201501
567 1 201501
1 2 201501
2 2 201501
3 2 201501
4 2 201501
….. 2 201501
456 2 201501
1 1 201502
2 1 201502
3 1 201502
4 1 201502
5 1 201502
6 1 201502
7 1 201502
… 1 201502
326 1 201502
1 2 201502
2 2 201502
3 2 201502
4 2 201502
… 2 201502
562 2 201502
...........
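I need to split each segment into 5% buckets. Since the number of sales in each segment differs from month to month, could you tell me how to split each segment, for each month, into 20 groups of 5% each, ordered by ranking?
Thanks!
Answer 0 (score: 0)
One approach is to use count(*) as a window function. It is not clear what kind of sample you want in each group.
The following takes the first "n" rows for each group (groups are numbered 0 to 19), assuming ranking has no gaps or duplicates: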
select t.*,
       floor(20 * (ranking - 1) / cnt) as grp
from (select t.*, count(*) over (partition by segment, month) as cnt
      from t
     ) t;
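If there are gaps or duplicates, you can use row_number() to get a correct enumeration within each segment and month.
Answer 1 (score: 0)
You have given your input data, but you have not said what output you expect. Perhaps ntile() is what you're after, e.g.: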
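To illustrate the row_number() suggestion, here is a minimal sketch that runs a variant of the query through Python's sqlite3 module (assuming the bundled SQLite is 3.25 or later, which is when window functions were added). The table contents are made up for illustration: 40 rows for one segment and month, with rankings 2, 4, ..., 80, so the rankings contain gaps. SQLite's integer division stands in for floor() here; that detail would not carry over to Oracle unchanged.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.execute("create table t (ranking integer, segment integer, month integer)")

# Hypothetical data: one segment/month with 40 rows and gapped rankings 2, 4, ..., 80.
conn.executemany(
    "insert into t values (?, 1, 201501)",
    [(2 * i,) for i in range(1, 41)],
)

query = """
select ranking,
       20 * (ranking - 1) / cnt as naive_grp,  -- uses ranking directly
       20 * (rn - 1) / cnt      as grp         -- uses the gap-free enumeration
from (select t.*,
             row_number() over (partition by segment, month order by ranking) as rn,
             count(*)     over (partition by segment, month)                  as cnt
      from t
     ) as x
order by ranking
"""
# Each row is (ranking, naive_grp, grp); integer division truncates like floor().
rows = conn.execute(query).fetchall()

naive_max = max(r[1] for r in rows)
group_sizes = Counter(r[2] for r in rows)

print("largest naive group number:", naive_max)
print("row_number-based group sizes:", dict(sorted(group_sizes.items())))
```

With this data the column computed directly from ranking runs up to 39, while the row_number()-based column stays within 0 to 19 with two rows per group.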
WITH sample_data AS (SELECT LEVEL ranking, 1 SEGMENT, 201501 mnth FROM dual CONNECT BY LEVEL <= 21 UNION ALL
                     SELECT LEVEL ranking, 2 SEGMENT, 201501 mnth FROM dual CONNECT BY LEVEL <= 40 UNION ALL
                     SELECT LEVEL ranking, 1 SEGMENT, 201502 mnth FROM dual CONNECT BY LEVEL <= 60 UNION ALL
                     SELECT LEVEL ranking, 2 SEGMENT, 201502 mnth FROM dual CONNECT BY LEVEL <= 80)
SELECT ranking,
       segment,
       mnth,
       NTILE(20) OVER (PARTITION BY mnth, SEGMENT ORDER BY ranking) grp
FROM   sample_data;
RANKING SEGMENT MNTH GRP
---------- ---------- ---------- ----------
1 1 201501 1
2 1 201501 1
3 1 201501 2
4 1 201501 3
...
19 1 201501 18
20 1 201501 19
21 1 201501 20
1 2 201501 1
2 2 201501 1
3 2 201501 2
4 2 201501 2
5 2 201501 3
...
36 2 201501 18
37 2 201501 19
38 2 201501 19
39 2 201501 20
40 2 201501 20
1 1 201502 1
2 1 201502 1
3 1 201502 1
4 1 201502 2
5 1 201502 2
6 1 201502 2
7 1 201502 3
...
54 1 201502 18
55 1 201502 19
56 1 201502 19
57 1 201502 19
58 1 201502 20
59 1 201502 20
60 1 201502 20
1 2 201502 1
2 2 201502 1
3 2 201502 1
4 2 201502 1
5 2 201502 2
6 2 201502 2
7 2 201502 2
8 2 201502 2
9 2 201502 3
...
72 2 201502 18
73 2 201502 19
74 2 201502 19
75 2 201502 19
76 2 201502 19
77 2 201502 20
78 2 201502 20
79 2 201502 20
80 2 201502 20
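N.B. ntile fills the earlier buckets first; e.g. for segment 1 and month 201501 in my example above, grp = 1 has 2 rows, whereas all the other groups have 1. If you would rather have the later buckets hold the extra rows, you can simply reverse the ordering inside the ntile and then subtract the result from the number of buckets + 1, so in your case that would be: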
21 - NTILE(20) OVER (PARTITION BY mnth, SEGMENT ORDER BY ranking DESC)
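As a quick check of the reversed-ntile idea, the sketch below compares both expressions side by side on 21 rows, mirroring segment 1 / month 201501 from the sample above. It runs through Python's sqlite3 module rather than Oracle (an assumption made purely so the snippet is self-contained; it needs SQLite 3.25 or later for window functions), and the table is filled by hand instead of with CONNECT BY.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.execute("create table sample_data (ranking integer, segment integer, mnth integer)")

# 21 rows for a single segment and month, rankings 1..21.
conn.executemany(
    "insert into sample_data values (?, 1, 201501)",
    [(i,) for i in range(1, 22)],
)

# Each row is (ranking, grp, grp_reversed).
rows = conn.execute("""
select ranking,
       ntile(20) over (partition by mnth, segment order by ranking)           as grp,
       21 - ntile(20) over (partition by mnth, segment order by ranking desc) as grp_reversed
from sample_data
order by ranking
""").fetchall()

front_loaded = Counter(r[1] for r in rows)  # bucket sizes with plain ntile
back_loaded = Counter(r[2] for r in rows)   # bucket sizes with the reversed variant

print("plain ntile, rows in bucket 1 and 20:", front_loaded[1], front_loaded[20])
print("reversed,    rows in bucket 1 and 20:", back_loaded[1], back_loaded[20])
```

Both columns still number the buckets 1 to 20 in ranking order; only the placement of the one extra row changes, from bucket 1 to bucket 20.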
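Answer 2 (score: 0)
You are looking for NTILE(20), which splits the data into 5% chunks. The query below keeps only the first tile, i.e. the first 5% by ranking within each segment and month.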
select ranking, segment, month
from
(
  select
    ranking, segment, month,
    ntile(20) over (partition by segment, month order by ranking) as tile
  from mytable
)
where tile = 1
order by month, segment, ranking;