I have a table like this:
id| page | text
------------------------
1 | page1 | Hello World
2 | page1 | Foo Bar
3 | page2 | Baz Baz
3 | page2 | Some Text
4 | page3 | Some Other Text
I want to select 2 random entries, but each page may appear only once in the result.
I have
SELECT * FROM mydata ORDER BY RANDOM() LIMIT 2;
but can I combine this with DISTINCT or GROUP BY?
Answer 0: (score: 2)
Something like this:
select id, page, text
from (
select id, page, text,
row_number() over (partition by page order by random()) as rn
from mydata
)
where rn <= 2
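Answer 1: (score: 1)
If you want:
...two rows total from the base table
...and every page to have the same chance of appearing in the sample, no matter how many entries it has in the table: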
SELECT *
FROM (
SELECT DISTINCT ON (page) *
FROM mydata
ORDER BY page, random() -- pick one random entry per page
) x
ORDER BY random() -- pick two random pages
LIMIT 2;
Or, using a window function:
WITH x AS (
SELECT *, row_number() OVER (PARTITION BY page ORDER BY random()) AS rn
FROM mydata
)
SELECT id, page, text
FROM x
WHERE rn = 1
ORDER BY random()
LIMIT 2;
You will have to test which one is faster. If you are dealing with a big table and need top performance, you can do better. Here is one way how.
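To try the window-function variant without a PostgreSQL server, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite 3.25 or later as a stand-in for PostgreSQL (both support row_number() OVER (...) and random()); the helper name sample_two_pages is made up for illustration, and the rows are copied from the question's table.

```python
import sqlite3

# Assumption: SQLite (3.25+) stands in for PostgreSQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mydata (id INTEGER, page TEXT, text TEXT)")
conn.executemany(
    "INSERT INTO mydata (id, page, text) VALUES (?, ?, ?)",
    [
        (1, "page1", "Hello World"),
        (2, "page1", "Foo Bar"),
        (3, "page2", "Baz Baz"),
        (3, "page2", "Some Text"),
        (4, "page3", "Some Other Text"),
    ],
)

QUERY = """
WITH x AS (
    SELECT *, row_number() OVER (PARTITION BY page ORDER BY random()) AS rn
    FROM mydata
)
SELECT id, page, text
FROM x
WHERE rn = 1          -- one random entry per page
ORDER BY random()     -- then two random pages
LIMIT 2
"""


def sample_two_pages():
    """Return two random rows whose pages differ (hypothetical helper)."""
    return conn.execute(QUERY).fetchall()


rows = sample_two_pages()
print(rows)
```

Every call returns two rows, and their page values never coincide. The DISTINCT ON variant is PostgreSQL-specific and will not run in SQLite.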
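On the other hand, if you want:
...two rows total from table mydata
...and every entry to have an (almost) equal chance of appearing in the sample, which effectively gives pages with more entries in the table a better chance.
The chances are still not truly equal: by definition, your restriction increases the chance for entries from rare pages.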
WITH x AS (
SELECT *
FROM mydata
ORDER BY random()
LIMIT 1
)
SELECT * FROM x
UNION ALL
(
SELECT m.*
FROM mydata m
, x
WHERE m.page <> x.page -- assuming page IS NOT NULL
ORDER BY random()
LIMIT 1
);
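The parentheses around the second SELECT of the UNION are required to allow an individual ORDER BY.
Tested with PostgreSQL 9.1. Window functions require version 8.4 or later.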
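The remark that the chances are not truly equal can be checked with a little exact arithmetic. The sketch below is a model, not a measurement: it assumes the first row is drawn uniformly from all rows and the second uniformly from the rows on a different page, as in the UNION ALL query. The function name inclusion_probabilities is made up for illustration.

```python
from collections import Counter
from fractions import Fraction

# Pages of the five rows in the question's sample table.
pages = ["page1", "page1", "page2", "page2", "page3"]


def inclusion_probabilities(pages):
    """Exact chance that each row ends up in the two-row sample.

    Model: first row uniform over all rows; second row uniform over the
    rows whose page differs from the first row's page.
    """
    n = len(pages)
    per_page = Counter(pages)
    probs = [Fraction(0)] * n
    for first in range(n):
        p_first = Fraction(1, n)
        probs[first] += p_first
        others = n - per_page[pages[first]]  # rows on a different page
        for second in range(n):
            if pages[second] != pages[first]:
                probs[second] += p_first * Fraction(1, others)
    return probs


probs = inclusion_probabilities(pages)
for page, p in zip(pages, probs):
    print(page, p)
```

For the sample table, each row of page1 and page2 comes out at 23/60, while the single row of page3 comes out at 7/15 (that is, 28/60), so the entry from the rare page is indeed favored.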
Answer 2: (score: 1)
Same as Erwin's answer, just a bit more structured: http://www.sqlfiddle.com/#!1/d3e83/6
with first_random as
(
select * from tbl order by random() limit 1
)
, second_random as
(
select *
from tbl
where page <> (select page from first_random)
order by random() limit 1
)
select * from first_random
union
select * from second_random;
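The same two-step idea can also be driven from application code, at the cost of two round trips instead of one statement. This is only a sketch under assumptions: SQLite via Python's sqlite3 module stands in for PostgreSQL, the table is named tbl as in the query above, and two_rows_from_distinct_pages is a hypothetical helper.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; rows copied from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER, page TEXT, text TEXT)")
conn.executemany(
    "INSERT INTO tbl (id, page, text) VALUES (?, ?, ?)",
    [
        (1, "page1", "Hello World"),
        (2, "page1", "Foo Bar"),
        (3, "page2", "Baz Baz"),
        (3, "page2", "Some Text"),
        (4, "page3", "Some Other Text"),
    ],
)


def two_rows_from_distinct_pages(conn):
    """Pick one random row, then a second random row from another page."""
    first = conn.execute(
        "SELECT id, page, text FROM tbl ORDER BY random() LIMIT 1"
    ).fetchone()
    if first is None:  # empty table
        return []
    second = conn.execute(
        "SELECT id, page, text FROM tbl "
        "WHERE page <> ? ORDER BY random() LIMIT 1",
        (first[1],),  # exclude the page of the first pick
    ).fetchone()
    return [first] if second is None else [first, second]


print(two_rows_from_distinct_pages(conn))
```

If the table holds only one page, the helper returns a single row, which mirrors what the UNION query would produce in that case.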
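Same as a_horse_with_no_name's answer, but this one is correct (it keeps a single row per page with rn = 1, then picks two of those at random): http://www.sqlfiddle.com/#!1/d3e83/12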
select id, page, text, rn
from (
select id, page, text,
row_number() over (partition by page order by random()) as rn
from tbl
) x
where rn = 1
order by random()
limit 2;
Prefer the latter; it has a simpler execution plan.
Answer 3: (score: 0)
This might work:
SELECT * FROM
(SELECT * FROM mydata GROUP BY page) t
ORDER BY RANDOM() LIMIT 2