I want to select the following segments.
Random 5500 rows including the following segments:
Subcategory (sex): - 3300 men
- 2200 women
Subcategory (age): - 2140 between 18-34 years
- 2100 between 35-54 years
- 1260 between 55-99 years
How can I solve this in a select statement?
Answer 0: (score: 2)
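The problem is that you use the word "random", yet the cohorts by age and sex are specified exactly. A single, truly random query will not produce quotas that precise. So your query is necessarily complicated: you need to partition the whole table into subsets that satisfy the constraints, and then select randomly from within each subset. Something like this...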
select * from (
select * from whatever
where sex = 'M'
and age between 18 and 34
order by dbms_random.value
)
where rownum <= 1284
union all
select * from (
select * from whatever
where sex = 'M'
and age between 35 and 54
order by dbms_random.value
)
where rownum <= 1260
union all
select * from (
select * from whatever
where sex = 'M'
and age between 55 and 99
order by dbms_random.value
)
where rownum <= 756
union all
select * from (
select * from whatever
where sex = 'F'
and age between 18 and 34
order by dbms_random.value
)
where rownum <= 856
union all
select * from (
select * from whatever
where sex = 'F'
and age between 35 and 54
order by dbms_random.value
)
where rownum <= 840
union all
select * from (
select * from whatever
where sex = 'F'
and age between 55 and 99
order by dbms_random.value
)
where rownum <= 504
This may not perform well, depending on the usual factors (table size, indexing, and so on), but it will produce exactly those cohorts.
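If six near-identical subqueries feel unwieldy, the same partition-then-sample idea can probably be written as a single pass with an analytic function. This is only a sketch, not the approach from the answer itself: the `quotas` and `ranked` names are made up for illustration, and it assumes the same `whatever` table with `sex` and `age` columns.

```sql
-- Hypothetical lookup of the six cells and their row limits.
with quotas as (
  select 'M' as sex, 18 as age_from, 34 as age_to, 1284 as quota from dual union all
  select 'M', 35, 54, 1260 from dual union all
  select 'M', 55, 99,  756 from dual union all
  select 'F', 18, 34,  856 from dual union all
  select 'F', 35, 54,  840 from dual union all
  select 'F', 55, 99,  504 from dual
),
-- Number the rows of each cell in random order.
ranked as (
  select w.*,
         q.quota,
         row_number() over (
           partition by q.sex, q.age_from
           order by dbms_random.value
         ) as rn
  from whatever w
  join quotas q
    on w.sex = q.sex
   and w.age between q.age_from and q.age_to
)
-- Keep only the first "quota" rows of each cell.
select *
from ranked
where rn <= quota
```

The result carries two extra columns (`quota` and `rn`) that you would list out of the final select if they are unwanted. Whether this is faster than the union all version depends on the data and the optimizer, so it is worth comparing both.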
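In case it isn't obvious, the rownum limits are the row count for each age band multiplied by each sex's share of the total (men to women is 3:2, i.e. 0.6 and 0.4).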
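To make that arithmetic explicit, here is a small helper query that derives the six limits from the totals in the question. It is a sketch with made-up names (`sex_share`, `age_band`). It also relies on an assumption the question does not state: since only the marginal totals are given, the age split is taken to be the same for men and women.

```sql
-- Share of each sex in the 5500-row sample.
with sex_share as (
  select 'M' as sex, 3300 / 5500 as share from dual union all
  select 'F',        2200 / 5500          from dual
),
-- Requested total per age band.
age_band as (
  select '18-34' as band, 2140 as band_total from dual union all
  select '35-54',         2100               from dual union all
  select '55-99',         1260               from dual
)
-- Quota per cell = band total * sex share.
select s.sex, a.band, round(a.band_total * s.share) as quota
from sex_share s
cross join age_band a
order by s.sex desc, a.band
```

This yields 1284, 1260 and 756 for men and 856, 840 and 504 for women, which sum to 3300 and 2200 respectively.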