PostgreSQL large query optimization

Asked: 2017-06-22 14:13:54

Tags: sql postgresql optimization query-optimization

I have a problem with a query in PostgreSQL. The query takes a long time to execute (about 30 seconds when nothing is in the buffers). Here is my query:

    SELECT
        d.name,
        COUNT (*) AS cnt,
        'first' AS TYPE
    FROM
        tableA a
    INNER JOIN tableD d ON d.NAME = 'FOO'
    AND a.key = d.key
    WHERE
        a.DATE > '2017-06-01'
    AND a.DATE < '2017-07-01'
    group by d.name
    UNION ALL
    SELECT
        d.name,
        COUNT (*) AS cnt,
        'second' AS TYPE
    FROM
        tableB b
    INNER JOIN tableD d ON d.NAME = 'FOO'
    AND b.key = d.key
    WHERE
        b.DATE > '2017-06-01'
    AND b.DATE < '2017-07-01'
    group by d.name
    UNION ALL
    SELECT
        d.name,
        COUNT (*) AS cnt,
        'Third' AS TYPE
    FROM
        tableC c
    INNER JOIN tableD d ON d.NAME = 'FOO'
    AND c.key = d.key
    WHERE
        c.date > '2017-06-01'
    AND c.date < '2017-07-01'
    group by d.name

I created indexes on tableC.key (B-tree) and tableC.name (hash). The other tables also have indexes on date and key (B-tree).

So my query can join using indexes and can filter using indexes.

My tableD has a few thousand rows; the other tables have billions of rows, some of them tens of billions.

In the execution plan I see that the executor uses nested loops for all my joins (except one: the B-D join uses a hash join).

Maybe I have found the culprit:

    "Node Type": "Bitmap Heap Scan",
    "Parent Relationship": "Inner",
    "Relation Name": "tableA",
    "Alias": "a",
    "Startup Cost": 2469.84,
    "Total Cost": 137625.61,
    "Plan Rows": 53748,
    "Plan Width": 37,
    "Recheck Cond": "(((key)::text = (d.key)::text) AND (date > '2017-06-01 00:00:00'::timestamp without time zone) AND (date < '2017-07-01 00:00:00'::timestamp without time zone))",
    "Plans": [{
        "Node Type": "Bitmap Index Scan",
        "Parent Relationship": "Outer",
        "Index Name": "\"date + key\"",
        "Startup Cost": 0.00,
        "Total Cost": 2456.40,
        "Plan Rows": 53748,
        "Plan Width": 0,
        "Index Cond": "(((key)::text = (d.key)::text) AND (date > '2017-06-01 00:00:00'::timestamp without time zone) AND (date < '2017-07-01 00:00:00'::timestamp without time zone))"
    }]
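
A plan fragment like this one only contains the planner's estimates. As a rough sketch, actual row counts, timings, and buffer usage for one branch of the query could be collected with `EXPLAIN` and its `ANALYZE` and `BUFFERS` options; the table and column names are taken from the query as written, and note that `ANALYZE` really executes the statement, so it takes as long as the query itself.

```sql
-- Sketch only: runs the first branch of the query and reports the real plan.
-- ANALYZE executes the statement; BUFFERS adds shared-buffer hit/read counts;
-- FORMAT JSON produces output in the same shape as the fragment shown here.
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT d.name, COUNT(*) AS cnt, 'first' AS type
FROM tableA a
INNER JOIN tableD d ON d.name = 'FOO' AND a.key = d.key
WHERE a.date > '2017-06-01'
  AND a.date < '2017-07-01'
GROUP BY d.name;
```

Comparing "Plan Rows" with the "Actual Rows" field in that output shows whether the estimate of 53748 rows is close to reality.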

TableD:

    CREATE TABLE "sch"."tableD" (
    "id" int4 NOT NULL,
    "key" varchar(36) COLLATE "default",
    "name" varchar(255) COLLATE "default",


    CREATE INDEX "license_key" ON "sch"."tableD" USING btree ("key");
    CREATE INDEX "name" ON "sch"."tableD" USING btree ("name");

TableA:

    CREATE TABLE "sch"."tableA" (
    "id" int4 DEFAULT nextval('"sch".table'::regclass) NOT NULL,
    "key" varchar(255) COLLATE "default",
    "date" timestamp(6),

    CREATE INDEX "date" ON "sch"."tableA" USING btree ("date");
    CREATE INDEX "date + key" ON "sch"."tableA" USING btree ("key", "date");
    CREATE INDEX "keyIndex" ON "sch"."tableA" USING btree ("key");

Tables B and C are similar to A.

I don't know where the time is being lost. Can you help me solve this problem? This query should not take 30 seconds to run. Thanks.

1 answer:

Answer 0: (score: 0)

Provide these B-tree indexes (not hash):

b: (DATE, key)
b: (key, DATE)
d: (NAME, key)
d: (key, NAME)
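
As a sketch, the list above could be turned into DDL along these lines. The index names are hypothetical, the `sch` schema and quoted identifiers follow the question's DDL, and tableB is assumed to have the same `key` and `date` columns as tableA. PostgreSQL hash indexes cannot cover more than one column, so composite indexes like these have to be B-tree.

```sql
-- Hypothetical index names; schema and column names follow the question's DDL.
-- B-tree is the default access method, so USING btree is spelled out only for clarity.
CREATE INDEX "tableB_date_key" ON "sch"."tableB" USING btree ("date", "key");
CREATE INDEX "tableB_key_date" ON "sch"."tableB" USING btree ("key", "date");
CREATE INDEX "tableD_name_key" ON "sch"."tableD" USING btree ("name", "key");
CREATE INDEX "tableD_key_name" ON "sch"."tableD" USING btree ("key", "name");
```

Presumably the same pair of (date, key) and (key, date) indexes would apply to tableA and tableC as well. Once the plans show which index of each pair the planner actually uses, the unused one can be dropped.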

It looks like you want a one-month time span, but you are excluding the very start of the month. Change > to >=.
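
Applied to the first branch of the question's query, that change might look like the sketch below (names as written in the question). The half-open range `>= start AND < end` covers all of June 2017, including rows stamped exactly 2017-06-01 00:00:00, and excludes midnight on July 1.

```sql
-- Sketch: first branch only, with an inclusive lower bound.
SELECT d.name, COUNT(*) AS cnt, 'first' AS type
FROM tableA a
INNER JOIN tableD d ON d.name = 'FOO' AND a.key = d.key
WHERE a.date >= '2017-06-01'   -- was: a.date > '2017-06-01'
  AND a.date <  '2017-07-01'
GROUP BY d.name;
```

The same edit would apply to the tableB and tableC branches.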