Selecting data in a large table where some IDs don't exist in the same column. Speed up query

Time: 2017-11-10 21:08:40

Tags: postgresql indexing query-optimization

I am trying to select data from a table for companies and dates, across different types/IDs of data.

In other words, I want the company_id, dates_id, daily_val rows with wh_calc_id = 344 whenever the same company_id/dates_id combination does not exist with wh_calc_id = 368.

I loosely followed this example: Select rows which are not present in other table

Here are my two attempts:

Attempt 1:

SELECT DISTINCT ON (company_id, dates_id) company_id, dates_id, daily_val
FROM   daily_data d1
WHERE  NOT EXISTS (
           SELECT 1
           FROM   daily_data d2
           WHERE  d1.company_id = d2.company_id
           AND    d1.dates_id   = d2.dates_id
           AND    d1.wh_calc_id = 368
           AND    d2.wh_calc_id = 368
       )
AND    d1.wh_calc_id = 344

Problem: it is super slow: 27 minutes

Attempt 2: [deleted]

All-in-one (giant) table: company_id int (indexed), dates_id int (indexed), wh_calc_id int (indexed), daily_val numeric

I am willing to add an index that would help speed this up, but which index should it be?

Postgres 10

PS - I had to kill both queries before they finished, so I don't really know whether they are written correctly. Hopefully my description helps.

2 answers:

Answer 0: (score: 0)

I would do this with a left join:

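-- Anti-join: keep wh_calc_id = 344 rows that have no matching wh_calc_id = 368 row.
-- Only d2 is restricted to 368 in ON; d1 is restricted to 344 in WHERE.
SELECT DISTINCT ON (d1.company_id, d1.dates_id)
       d1.company_id, d1.dates_id, d1.daily_val
FROM   daily_data d1
LEFT   JOIN daily_data d2
       ON  d1.company_id = d2.company_id
       AND d1.dates_id   = d2.dates_id
       AND d2.wh_calc_id = 368
WHERE  d1.wh_calc_id = 344
AND    d2.company_id IS NULL;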

And create an index on the columns being used:

CREATE INDEX ON daily_data (company_id, dates_id, wh_calc_id);
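
As a possible refinement of the index above, a partial index covering only the wh_calc_id = 368 rows may keep the lookup structure small, since the anti-join only ever probes those rows. This is a sketch, not something benchmarked against this table: the index name daily_data_calc368_idx is made up, and whether the planner actually picks the index depends on the data distribution, so it is worth checking the plan with EXPLAIN first.

```sql
-- Hypothetical index name; the partial predicate matches the value probed by the anti-join.
CREATE INDEX daily_data_calc368_idx
    ON daily_data (company_id, dates_id)
    WHERE wh_calc_id = 368;

-- Refresh planner statistics so the estimates reflect the current data.
ANALYZE daily_data;

-- Show the chosen plan without executing the query.
EXPLAIN
SELECT d1.company_id, d1.dates_id, d1.daily_val
FROM   daily_data d1
WHERE  d1.wh_calc_id = 344
AND    NOT EXISTS (
           SELECT 1
           FROM   daily_data d2
           WHERE  d2.company_id = d1.company_id
           AND    d2.dates_id   = d1.dates_id
           AND    d2.wh_calc_id = 368
       );
```

If the plan still shows a sequential scan on the inner side, the composite index on (company_id, dates_id, wh_calc_id) may be the better fit for this data.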

Answer 1: (score: 0)

This is what I had in mind:

SELECT d1.*
FROM   daily_data d1
LEFT   JOIN daily_data d2
       ON  d1.company_id = d2.company_id
       AND d1.dates_id   = d2.dates_id
       AND d2.wh_calc_id = 368
WHERE  d1.wh_calc_id = 344       -- rows of interest
AND    d2.wh_calc_id IS NULL;    -- no matching 368 row exists
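
To sanity-check the logic on a small data set before running it against the full table, the anti-join can be tried on a throwaway table. The table name daily_data_demo and the sample rows below are invented for illustration; only the column names come from the question.

```sql
-- Hypothetical demo table mirroring the columns described in the question.
CREATE TEMP TABLE daily_data_demo (
    company_id int,
    dates_id   int,
    wh_calc_id int,
    daily_val  numeric
);

INSERT INTO daily_data_demo VALUES
    (1, 10, 344, 1.5),   -- has a 368 counterpart, so it is filtered out
    (1, 10, 368, 2.5),
    (2, 10, 344, 3.5),   -- no 368 counterpart, so it is kept
    (2, 11, 344, 4.5),   -- has a 368 counterpart, so it is filtered out
    (2, 11, 368, 5.5);

-- Same anti-join shape as above, pointed at the demo table.
SELECT d1.company_id, d1.dates_id, d1.daily_val
FROM   daily_data_demo d1
LEFT   JOIN daily_data_demo d2
       ON  d1.company_id = d2.company_id
       AND d1.dates_id   = d2.dates_id
       AND d2.wh_calc_id = 368
WHERE  d1.wh_calc_id = 344
AND    d2.wh_calc_id IS NULL;
-- Returns the single row (2, 10, 3.5).
```

If a restriction such as d1.wh_calc_id = 368 is added to the join condition, every 344 row comes back, because that condition can never hold together with d1.wh_calc_id = 344.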