I have a table with a set of values. A sample of the table:
ID | Customer_name | workorder
1 | abc | dispatch
2 | xyz | not_dispatch
3 | jdk | dispatch
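This goes on for about 1M rows in total. I now want to sample this dataset down to 5,000 rows, with 3,400 rows where workorder is "not_dispatch" and 1,600 rows where it is "dispatch". How can this be done in PostgreSQL?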
Answer 0: (score: 1)
Far from efficient, but it works:
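SELECT *
FROM (
    SELECT * FROM my_table
    WHERE workorder = 'dispatch' -- other filters
    ORDER BY random() LIMIT 1600
) sub1
-- UNION ALL keeps every sampled row; plain UNION would also work here,
-- but it adds a needless de-duplication step over the combined result.
UNION ALL
SELECT *
FROM (
    SELECT * FROM my_table
    WHERE workorder = 'not_dispatch' -- other filters
    ORDER BY random() LIMIT 3400
) sub2;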
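
As an alternative sketch (not part of the original answer), the same stratified sample can probably be drawn in a single pass over the table with a window function: number the rows randomly within each workorder value, then keep the first N per group. This assumes the table is named my_table as above and that the columns are the unquoted identifiers id, customer_name and workorder shown in the sample. It still sorts every matching row, so it is not necessarily faster on 1M rows.

```sql
-- Rank rows in random order within each workorder group,
-- then keep 1600 'dispatch' rows and 3400 'not_dispatch' rows.
-- Table and column names are assumptions taken from the question.
SELECT id, customer_name, workorder
FROM (
    SELECT t.*,
           row_number() OVER (PARTITION BY workorder
                              ORDER BY random()) AS rn
    FROM my_table AS t
    WHERE workorder IN ('dispatch', 'not_dispatch') -- other filters
) AS ranked
WHERE rn <= CASE workorder
                WHEN 'dispatch' THEN 1600
                ELSE 3400
            END;
```

If a group has fewer rows than its quota, this simply returns all of them, so it may be worth checking the per-group counts with GROUP BY workorder afterwards.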