PostgreSQL select not in list with join

Time: 2015-09-01 18:34:10

Tags: sql postgresql join

The project is using Postgres 9.3.

I have the following tables (simplified):

t_person (30 million records)
- id
- first_name
- last_name
- gender

t_city (70,000 records)
- id
- name
- country_id

t_country (20 records)
- id
- name

t_last_city_visited (over 200 million records)
- person_id
- city_id
- country_id
  - There is a unique constraint on person_id, country_id to
    ensure that each person only has one last city per country
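
The table outline above can be written out as DDL. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for Postgres so that it runs anywhere, and the column types, NOT NULL markers, and foreign keys are assumptions that the question does not state. Only the unique constraint on (person_id, country_id) comes from the question. The helper names `try_insert` and `demo` are hypothetical.

```python
import sqlite3

# Assumed DDL: types and REFERENCES clauses are guesses, not from the question.
SCHEMA = """
CREATE TABLE t_country (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE t_city (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    country_id INTEGER NOT NULL REFERENCES t_country (id)
);
CREATE TABLE t_person (
    id INTEGER PRIMARY KEY,
    first_name TEXT,
    last_name TEXT,
    gender TEXT
);
CREATE TABLE t_last_city_visited (
    person_id INTEGER NOT NULL REFERENCES t_person (id),
    city_id INTEGER NOT NULL REFERENCES t_city (id),
    country_id INTEGER NOT NULL REFERENCES t_country (id),
    UNIQUE (person_id, country_id)  -- one last city per person per country
);
"""


def try_insert(db, person_id, city_id, country_id):
    """Insert one visit row; return False if the unique constraint rejects it."""
    try:
        db.execute(
            "INSERT INTO t_last_city_visited VALUES (?, ?, ?)",
            (person_id, city_id, country_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


def demo():
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return [
        try_insert(db, 1, 10, 1),  # first city for country 1: accepted
        try_insert(db, 1, 20, 2),  # same person, another country: accepted
        try_insert(db, 1, 11, 1),  # second city for country 1: rejected
    ]


if __name__ == "__main__":
    print(demo())
```

Because of that constraint, a person has at most one row per country, so "has visited the UK" is a simple existence check on a single row.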

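What I need to do is a variation of the following:

Get the ids of females who have visited the country 'UK' but have never visited the country 'USA'.

I tried the following, but it is too slow.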

select t_person.id from t_person
join t_last_city_visited
  on (
          t_last_city_visited.person_id = t_person.id
          and country_id = (select id from t_country where name = 'UK')
     )
where gender = 'female'
except
(
    select t_person.id from t_person
    join t_last_city_visited
      on (
             t_last_city_visited.person_id = t_person.id
             and country_id = (select id from t_country where name = 'USA')
         )
)

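I would really appreciate any help.

3 Answers:

Answer 0: (score: 3)

Hint: what you want to do here is find the females for whom a visit to the UK EXISTS but a visit to the USA does NOT EXIST.

Something like: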

select ...
from   t_person
where  ...
   and exists (select null
                 from t_last_city_visited join
                      t_country on (...)
                where t_country.name = 'UK')
   and not exists (select null
                 from t_last_city_visited join
                      t_country on (...)
                where t_country.name = 'USA')
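
One way the elided parts of this hint could be filled in is sketched below. The join and correlation conditions are an assumption about what the answer intended, not the answerer's own text. The SQL is wrapped in Python's sqlite3 module only so that it can be run on a tiny, made-up data set; `make_db` and the sample rows are hypothetical.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory data set (made-up rows, for illustration only)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE t_person (id INTEGER PRIMARY KEY, gender TEXT);
        CREATE TABLE t_country (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE t_last_city_visited (
            person_id INTEGER, city_id INTEGER, country_id INTEGER,
            UNIQUE (person_id, country_id));
        INSERT INTO t_person VALUES
            (1, 'female'), (2, 'female'), (3, 'male'), (4, 'female');
        INSERT INTO t_country VALUES (1, 'UK'), (2, 'USA'), (3, 'France');
        INSERT INTO t_last_city_visited VALUES
            (1, 10, 1),              -- person 1: UK only
            (2, 10, 1), (2, 20, 2),  -- person 2: UK and USA
            (3, 10, 1),              -- person 3: male, UK only
            (4, 30, 3);              -- person 4: France only
    """)
    return db


# Both subqueries are correlated to the outer row through v.person_id = p.id.
QUERY = """
SELECT p.id
FROM   t_person p
WHERE  p.gender = 'female'
  AND EXISTS (SELECT NULL
              FROM   t_last_city_visited v
              JOIN   t_country c ON c.id = v.country_id
              WHERE  v.person_id = p.id AND c.name = 'UK')
  AND NOT EXISTS (SELECT NULL
                  FROM   t_last_city_visited v
                  JOIN   t_country c ON c.id = v.country_id
                  WHERE  v.person_id = p.id AND c.name = 'USA')
ORDER BY p.id
"""


def females_uk_not_usa(db):
    return [row[0] for row in db.execute(QUERY)]


if __name__ == "__main__":
    print(females_uk_not_usa(make_db()))  # prints [1]
```

The query text itself is plain SQL and should carry over to Postgres unchanged, although how it performs on 200 million rows has to be measured there.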

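Another approach is to find the people who have visited the UK but not the USA, which you can then join to the person table to filter by gender: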

select   person_id
  from   t_last_city_visited join
         t_country on t_last_city_visited.country_id = t_country.id
 where   t_country.name in ('USA','UK')
group by person_id
-- 'UK' sorts before 'USA', so max(name) = 'UK' means the person has no 'USA' row
having   max(t_country.name) = 'UK'
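
The grouped query above returns person ids regardless of gender. A sketch of the follow-up join to t_person might look like the following. The derived-table alias `uk_only`, the `make_db` helper, and the sample rows are hypothetical, and SQLite is used here only so that the example is runnable.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory data set (made-up rows, for illustration only)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE t_person (id INTEGER PRIMARY KEY, gender TEXT);
        CREATE TABLE t_country (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE t_last_city_visited (
            person_id INTEGER, city_id INTEGER, country_id INTEGER,
            UNIQUE (person_id, country_id));
        INSERT INTO t_person VALUES
            (1, 'female'), (2, 'female'), (3, 'male'), (4, 'female');
        INSERT INTO t_country VALUES (1, 'UK'), (2, 'USA'), (3, 'France');
        INSERT INTO t_last_city_visited VALUES
            (1, 10, 1),              -- person 1: UK only
            (2, 10, 1), (2, 20, 2),  -- person 2: UK and USA
            (3, 10, 1),              -- person 3: male, UK only
            (4, 30, 3);              -- person 4: France only
    """)
    return db


# Inner query: people with a UK row and no USA row (MAX picks 'USA' if present).
# Outer query: keep only the females among them.
QUERY = """
SELECT p.id
FROM   t_person p
JOIN  (SELECT v.person_id
       FROM   t_last_city_visited v
       JOIN   t_country c ON c.id = v.country_id
       WHERE  c.name IN ('UK', 'USA')
       GROUP  BY v.person_id
       HAVING MAX(c.name) = 'UK') uk_only
  ON   uk_only.person_id = p.id
WHERE  p.gender = 'female'
ORDER BY p.id
"""


def females_uk_only(db):
    return [row[0] for row in db.execute(QUERY)]


if __name__ == "__main__":
    print(females_uk_only(make_db()))  # prints [1]
```

The MAX trick depends on the two names comparing in the expected order, so it does not generalize to arbitrary pairs of countries without adjusting the condition.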

Answer 1: (score: 0)

Could you run analyze and then try this query?

-- females who visited UK
with uk_person as (
  select distinct person_id
  from t_last_city_visited t
  inner join t_person p on t.person_id = p.id and p.gender = 'female'
  where country_id  = (select id from t_country where name = 'UK')
),
-- females who visited USA
us_person as (
  select distinct person_id
  from t_last_city_visited t
  inner join t_person p on t.person_id = p.id and p.gender = 'female'
  where country_id  = (select id from t_country where name = 'USA')
)
-- females who visited UK but not USA
select uk.person_id
from uk_person uk
left join us_person us on uk.person_id = us.person_id
where us.person_id is null

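This is one of many ways this query can be written. You may have to run them to find out which is the most efficient, and some index tuning may be needed to make them run faster.

Answer 2: (score: 0)

This is how I would approach it; you can later replace the inner queries with aliases, as @zedfoxus says.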
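
Before timing the alternatives, it helps to confirm that they return the same rows. The harness below is a rough sketch of that idea: it runs three formulations against a tiny, made-up SQLite data set, checks their results, and records elapsed time. The names `make_db`, `VARIANTS`, and `compare` are hypothetical, and the timings from an in-memory toy database say nothing about Postgres with 200 million rows.

```python
import sqlite3
import time


def make_db():
    """Build a tiny in-memory data set (made-up rows, for illustration only)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE t_person (id INTEGER PRIMARY KEY, gender TEXT);
        CREATE TABLE t_country (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE t_last_city_visited (
            person_id INTEGER, city_id INTEGER, country_id INTEGER,
            UNIQUE (person_id, country_id));
        INSERT INTO t_person VALUES
            (1, 'female'), (2, 'female'), (3, 'male'), (4, 'female');
        INSERT INTO t_country VALUES (1, 'UK'), (2, 'USA'), (3, 'France');
        INSERT INTO t_last_city_visited VALUES
            (1, 10, 1), (2, 10, 1), (2, 20, 2), (3, 10, 1), (4, 30, 3);
    """)
    return db


UK = "(SELECT id FROM t_country WHERE name = 'UK')"
USA = "(SELECT id FROM t_country WHERE name = 'USA')"

VARIANTS = {
    # Set difference, as in the question (SQLite rejects the extra parentheses).
    "except": f"""
        SELECT p.id FROM t_person p
        JOIN t_last_city_visited v ON v.person_id = p.id
        WHERE p.gender = 'female' AND v.country_id = {UK}
        EXCEPT
        SELECT v.person_id FROM t_last_city_visited v
        WHERE v.country_id = {USA}""",
    # Correlated EXISTS / NOT EXISTS.
    "not_exists": f"""
        SELECT p.id FROM t_person p
        WHERE p.gender = 'female'
          AND EXISTS (SELECT NULL FROM t_last_city_visited v
                      WHERE v.person_id = p.id AND v.country_id = {UK})
          AND NOT EXISTS (SELECT NULL FROM t_last_city_visited v
                          WHERE v.person_id = p.id AND v.country_id = {USA})""",
    # Anti-join: left join to the USA rows and keep the unmatched ones.
    "left_join": f"""
        SELECT uk.person_id FROM t_last_city_visited uk
        JOIN t_person p ON p.id = uk.person_id AND p.gender = 'female'
        LEFT JOIN t_last_city_visited us
               ON us.person_id = uk.person_id AND us.country_id = {USA}
        WHERE uk.country_id = {UK} AND us.person_id IS NULL""",
}


def compare(db, variants=VARIANTS, repeats=100):
    """Return {name: (sorted ids, seconds taken for `repeats` runs)}."""
    report = {}
    for name, sql in variants.items():
        start = time.perf_counter()
        for _ in range(repeats):
            rows = db.execute(sql).fetchall()
        elapsed = time.perf_counter() - start
        report[name] = (sorted(r[0] for r in rows), elapsed)
    return report


if __name__ == "__main__":
    for name, (ids, seconds) in compare(make_db()).items():
        print(f"{name:12s} ids={ids} {seconds:.4f}s")
```

On the real database, EXPLAIN ANALYZE on each variant is the more reliable comparison. An extra index on t_last_city_visited (country_id, person_id) is one candidate worth trying, since the unique constraint's index leads with person_id; whether it helps needs to be confirmed against the actual plans.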

select
    v.id
from
    -- females who have a UK row
    (SELECT
        p.id AS id
    FROM
        t_person p JOIN t_last_city_visited lcv
            ON(lcv.person_id = p.id)
        JOIN t_country c
            ON(lcv.country_id = c.id and c.name = 'UK')
    WHERE
        p.gender = 'female') v JOIN

    -- females who have no USA row (anti-join: the left join finds no match)
    (SELECT
        p2.id AS id
    FROM
        t_person p2 LEFT JOIN t_last_city_visited lcv2
            ON(lcv2.person_id = p2.id
               and lcv2.country_id = (select id from t_country where name = 'USA'))
    WHERE
        p2.gender = 'female'
        and lcv2.person_id is null) nv

        ON(v.id = nv.id)