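First, apologies for the question title. I don't know the statistical term for this, or what this kind of join is called.
I have a query* that basically generates three things: random_sex, random_first, and random_last. I'm trying to join them using this method.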
random_sex | random_first | random_last
------------+------------------+------------------
male | 47.7101715711225 | 24.3833348881337
male | 72.8463141907472 | 28.3560050522089
female | 72.8617294209544 | 33.3203859277759
male | 39.3406164890062 | 26.3352867371729
female | 28.6855500966031 | 65.8870893270099
female | 35.5960198949557 | 83.1188118207422
male | 11.5711074977927 | 10.544433838184
male | 15.6900786811765 | 18.7324617852545
male | 24.9860797089245 | 8.98265511383023
female | 80.4563122882508 | 35.594445341751
(10 rows)
The census data basically looks like this:
name | freq | cumfreq | rank | name_type
------------+-------+---------+------+-----------
SMITH | 1.006 | 1.006 | 1 | LAST
JOHNSON | 0.81 | 1.816 | 2 | LAST
WILLIAMS | 0.699 | 2.515 | 3 | LAST
JONES | 0.621 | 3.136 | 4 | LAST
BROWN | 0.621 | 3.757 | 5 | LAST
DAVIS | 0.48 | 4.237 | 6 | LAST
MILLER | 0.424 | 4.66 | 7 | LAST
WILSON | 0.339 | 5 | 8 | LAST
MOORE | 0.312 | 5.312 | 9 | LAST
TAYLOR | 0.311 | 5.623 | 10 | LAST
ANDERSON | 0.311 | 5.934 | 11 | LAST
THOMAS | 0.311 | 6.245 | 12 | LAST
JACKSON | 0.31 | 6.554 | 13 | LAST
WHITE | 0.279 | 6.834 | 14 | LAST
HARRIS | 0.275 | 7.109 | 15 | LAST
MARTIN | 0.273 | 7.382 | 16 | LAST
THOMPSON | 0.269 | 7.651 | 17 | LAST
GARCIA | 0.254 | 7.905 | 18 | LAST
MARTINEZ | 0.234 | 8.14 | 19 | LAST
And in this case:
random_sex | random_first | random_last
------------+------------------+------------------
male | 47.7101715711225 | 24.3833348881337
I want it to join (procedurally) like this:
=# select * from census.names where cumfreq > 47.7101715711225 AND name_type = 'MALE_FIRST' order by cumfreq asc limit 1;
name | freq | cumfreq | rank | name_type
--------+-------+---------+------+------------
SILVER | 0.009 | 47.717 | 1424 | MALE_FIRST
=# select * from census.names where cumfreq > 24.3833348881337 AND name_type = 'LAST' order by cumfreq asc limit 1;
name | freq | cumfreq | rank | name_type
--------+-------+---------+------+-----------
HARPER | 0.054 | 24.408 | 185 | LAST
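So this man's name would be Silver Harper. I've never met one in my life, but they do exist.
I want the query above to return "Silver" and "Harper" instead of the random numbers. How can I make it work like that?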
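For illustration, the lookup rule can be tried on a handful of the last-name rows shown above, inlined as a VALUES list so it runs without the census table. The threshold 3.5 is an arbitrary stand-in for one random value, and the CTE name sample_names is made up for this sketch.

```sql
-- Five rows copied from the census sample above (name, cumfreq only).
WITH sample_names(name, cumfreq) AS (
    VALUES ('SMITH',    1.006),
           ('JOHNSON',  1.816),
           ('WILLIAMS', 2.515),
           ('JONES',    3.136),
           ('BROWN',    3.757)
)
-- Pick the first name whose cumulative frequency exceeds the threshold.
SELECT name
FROM sample_names
WHERE cumfreq > 3.5      -- arbitrary stand-in for RANDOM() * max(cumfreq)
ORDER BY cumfreq ASC
LIMIT 1;
```

With these five rows the query picks BROWN, since 3.757 is the smallest cumfreq above 3.5. Each name owns the interval between the previous cumfreq and its own, so a uniform random value lands on a name with probability proportional to its freq.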
FOOTNOTE
*: Just to keep it simple:
SELECT
CASE WHEN RANDOM() > 0.5 THEN 'male' ELSE 'female' END AS random_sex
, RANDOM() * 90.020 AS random_first -- dataset is 90% of most popular
, RANDOM() * 90.483 AS random_last
FROM generate_series(1,10,1);
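Answer 0 (score: 2)
I don't really know statistics either, but I think this is what you want.
Let's call the query that returns the random columns RANDOMS: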
WITH RANDOMS AS
(
SELECT
CASE WHEN RANDOM() > 0.5 THEN 'male' ELSE 'female' END AS random_sex
, RANDOM() * 90.020 AS random_first
, RANDOM() * 90.483 AS random_last
FROM generate_series(1,10,1)
)
SELECT (
    SELECT A.name
    FROM census.names A
    WHERE A.cumfreq > R.random_first
    -- match the generated sex: 'male' -> MALE_FIRST, 'female' -> FEMALE_FIRST
    AND A.name_type = upper(R.random_sex) || '_FIRST'
    ORDER BY A.cumfreq ASC LIMIT 1
) AS first_name,
(
    SELECT A.name
    FROM census.names A
    WHERE A.cumfreq > R.random_last
    AND A.name_type = 'LAST'
    ORDER BY A.cumfreq ASC LIMIT 1
) AS last_name
FROM RANDOMS R;
Answer 1 (score: 0)
Correlated subqueries?
SELECT
    *
FROM
    yourRandomTable
INNER JOIN
    census.names AS first_name
        -- the column is name_type, and PostgreSQL concatenates with ||, not +
        ON first_name.cumfreq = (SELECT MIN(cumfreq)
                                 FROM census.names
                                 WHERE cumfreq > yourRandomTable.random_first
                                 AND name_type = upper(yourRandomTable.random_sex) || '_FIRST')
        AND first_name.name_type = upper(yourRandomTable.random_sex) || '_FIRST'
INNER JOIN
    census.names AS last_name
        ON last_name.cumfreq = (SELECT MIN(cumfreq)
                                FROM census.names
                                WHERE cumfreq > yourRandomTable.random_last
                                AND name_type = 'LAST')
        AND last_name.name_type = 'LAST'
You can vary this pattern. Which variant to choose depends on how your indexes are set up.
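As one hedged illustration of the indexing point: every lookup in this thread filters on name_type by equality and then takes the smallest cumfreq above a threshold, so a composite index on (name_type, cumfreq) is a plausible candidate. The index name below is made up, and whether the planner actually uses it depends on your data and PostgreSQL version, so check the plan rather than assume.

```sql
-- Hypothetical index name; the columns are the ones shown in census.names.
-- Equality column first, range/sort column second.
CREATE INDEX names_type_cumfreq_idx
    ON census.names (name_type, cumfreq);

-- Inspect the plan for a single lookup to see whether the index is used.
EXPLAIN ANALYZE
SELECT name
FROM census.names
WHERE name_type = 'LAST'
  AND cumfreq > 24.3833348881337
ORDER BY cumfreq ASC
LIMIT 1;
```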
Answer 2 (score: 0)
EXPLAIN ANALYZE SELECT
r.sex
, COALESCE(
(SELECT name FROM census.names AS mf WHERE r.sex = 'male' AND mf.name_type = 'MALE_FIRST' AND mf.cumfreq > r.first ORDER BY cumfreq LIMIT 1)
, (SELECT name FROM census.names AS ff WHERE r.sex = 'female' AND ff.name_type = 'FEMALE_FIRST' AND ff.cumfreq > r.first ORDER BY cumfreq LIMIT 1)
) AS first
, (SELECT name FROM census.names AS l WHERE l.name_type = 'LAST' AND l.cumfreq > r.last ORDER BY cumfreq LIMIT 1) AS last
FROM (
SELECT
RANDOM() * 90.020 AS first
, RANDOM() * 90.483 AS last
, CASE WHEN RANDOM() > 0.5 THEN 'male' ELSE 'female' END AS sex
FROM generate_series(1,10,1)
) AS r;
This is what I actually ended up going with.
Answer 3 (score: -1)
Cheating: a Cartesian product
Select q1.Name as Forename, q2.Name as Surname
From
(select Name from census.names where cumfreq > 47.7101715711225
AND name_type = 'MALE_FIRST' order by cumfreq asc limit 1) q1,
(select Name from census.names where cumfreq > 24.3833348881337
AND name_type = 'LAST' order by cumfreq asc limit 1) q2
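The query above hardcodes one pair of random values. A possible generalization, assuming PostgreSQL 9.3 or later for LATERAL, runs the same two lookups once per generated row. The aliases r, f, and l are made up for this sketch, and FEMALE_FIRST is assumed to exist as a name_type, as in Answer 2.

```sql
SELECT r.random_sex
     , f.name AS first_name
     , l.name AS last_name
FROM (
    -- same random generator as in the question's footnote
    SELECT CASE WHEN RANDOM() > 0.5 THEN 'male' ELSE 'female' END AS random_sex
         , RANDOM() * 90.020 AS random_first
         , RANDOM() * 90.483 AS random_last
    FROM generate_series(1, 10, 1)
) AS r
-- LEFT JOIN ... ON true keeps the row (with a NULL name) if no cumfreq
-- exceeds the random value for that name_type
LEFT JOIN LATERAL (
    SELECT n.name
    FROM census.names AS n
    WHERE n.name_type = upper(r.random_sex) || '_FIRST'
      AND n.cumfreq > r.random_first
    ORDER BY n.cumfreq ASC
    LIMIT 1
) AS f ON true
LEFT JOIN LATERAL (
    SELECT n.name
    FROM census.names AS n
    WHERE n.name_type = 'LAST'
      AND n.cumfreq > r.random_last
    ORDER BY n.cumfreq ASC
    LIMIT 1
) AS l ON true;
```

Compared with scalar subqueries in the SELECT list, the LATERAL form lets each lookup return more than one column (for example freq or rank) without repeating the subquery.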