Select all rows on a column that contains values

Time: 2016-05-05 14:56:53

Tags: postgresql

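I have a table called skills with two columns: developer_id and language_id. I want to get all developers who have a given set of language_ids (a developer must have all of them to be returned).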

I tried two approaches:

SELECT developer_id FROM skills WHERE language_id = 256
INTERSECT
SELECT developer_id FROM skills WHERE language_id = 85

HashSetOp Intersect  (cost=24192.94..422840.17 rows=114424 width=4)
  ->  Append  (cost=24192.94..413497.17 rows=3737200 width=4)
        ->  Subquery Scan on "*SELECT* 1"  (cost=24192.94..183000.11 rows=1292452 width=4)
              ->  Bitmap Heap Scan on skills  (cost=24192.94..170075.59 rows=1292452 width=4)
                    Recheck Cond: (language_id = 256)
                    ->  Bitmap Index Scan on skill_dev_lang_idx  (cost=0.00..23869.83 rows=1292452 width=0)
                          Index Cond: (language_id = 256)
        ->  Subquery Scan on "*SELECT* 2"  (cost=45763.23..230497.06 rows=2444748 width=4)
              ->  Bitmap Heap Scan on skills skills_1  (cost=45763.23..206049.58 rows=2444748 width=4)
                    Recheck Cond: (language_id = 85)
                    ->  Bitmap Index Scan on skill_dev_lang_idx  (cost=0.00..45152.05 rows=2444748 width=0)
                          Index Cond: (language_id = 85)

SELECT developer_id FROM skills
WHERE language_id IN (256,85)
group by developer_id
having count(*) = 2

HashAggregate  (cost=262124.17..266259.96 rows=330863 width=4)
  Group Key: developer_id
  Filter: (count(*) = 2)
  ->  Bitmap Heap Scan on skills  (cost=66996.18..243438.17 rows=3737200 width=4)
        Recheck Cond: (language_id = ANY ('{256,85}'::integer[]))
        ->  Bitmap Index Scan on skill_dev_lang_idx  (cost=0.00..66061.88 rows=3737200 width=0)
              Index Cond: (language_id = ANY ('{256,85}'::integer[]))

But both of these are slow (3-4 seconds).

I have an index on both dev_id and language_id. The table has about 30 million rows.

1 answer:

Answer 0: (score: 0)

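Your query computes the grouping and count over all of the roughly 3M records that the index scan filters out of your 30M+ total, and only then filters by count(*) = 2. Perhaps the following would work better: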
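One possible shape for a query that avoids grouping, offered as a sketch rather than as the answerer's exact statement: start from the less common language and probe for the other one with EXISTS. The table and column names come from the question. The rest is assumption: that language_id = 256 is the rarer value (the plans estimate about 1.29M rows for it versus about 2.44M for 85), and that an index usable for a lookup by developer_id and language_id exists.

```sql
-- Sketch only: names are taken from the question; index layout is assumed.
-- Drive the scan from the rarer language (256), then check per developer
-- whether a row for the other language (85) exists.
SELECT DISTINCT s.developer_id          -- DISTINCT guards against duplicate (developer_id, language_id) rows
FROM skills AS s
WHERE s.language_id = 256
  AND EXISTS (
      SELECT 1
      FROM skills AS s2
      WHERE s2.developer_id = s.developer_id
        AND s2.language_id = 85
  );
```

Whether this is actually faster depends on the planner and on the index column order, so it is worth comparing with EXPLAIN ANALYZE against the two original queries. Note also that the count(*) = 2 variant in the question is only correct if each (developer_id, language_id) pair appears at most once; the EXISTS form does not rely on that.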
