PostgreSQL - How is the cost of a Sort node in a query plan calculated?

Date: 2017-05-11 13:10:07

Tags: postgresql

I have the following query plan in PostgreSQL:

Unique  (cost=487467.14..556160.88 rows=361546 width=1093)
  ->  Sort  (cost=487467.14..488371.00 rows=361546 width=1093)
        Sort Key: (..)
        ->  Append  (cost=0.42..108072.53 rows=361546 width=1093)
              ->  Index Scan using (..)  (cost=0.42..27448.06 rows=41395 width=1093)
                    Index Cond: (..)
                    Filter: (..)
              ->  Seq Scan on (..)  (cost=0.00..77009.02 rows=320151 width=1093)
                    Filter: (..)

I would just like to know how exactly the two cost values of the Sort node are calculated. I understand how it works for the scans and for Append, but I cannot find any information on how sort costs are calculated.

For a Seq Scan it is something like:

(disk pages read * seq_page_cost) + (rows scanned * cpu_tuple_cost)
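
As a rough cross-check, you can recompute that estimate yourself from the planner statistics in pg_class and the current cost settings (a minimal sketch; it ignores the per-clause cpu_operator_cost that the Filter line adds, so it comes out slightly below the plan's figure):

-- Seq Scan cost ≈ (disk pages read * seq_page_cost) + (rows scanned * cpu_tuple_cost)
SELECT relpages  * current_setting('seq_page_cost')::float8
     + reltuples * current_setting('cpu_tuple_cost')::float8 AS est_seq_scan_cost
FROM pg_class
WHERE relname = 'auftragsposition';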

The query being planned is basically this (not exactly, because it contains a view, but you get the idea):

SELECT * FROM (
  SELECT *, true AS storniert
    FROM auftragsposition
    WHERE mengestorniert > 0::numeric AND auftragbestaetigt = true
  UNION
  SELECT *, false AS storniert
    FROM auftragsposition
    WHERE mengestorniert < menge AND auftragbestaetigt = true
) as bla
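
(As an aside: the Sort + Unique pair on top of the Append exists because plain UNION must deduplicate its result. Here the two branches can never produce identical rows anyway, since they differ in the storniert constant, so UNION ALL would avoid the Sort entirely; verify against your data before rewriting:)

SELECT *, true AS storniert
  FROM auftragsposition
  WHERE mengestorniert > 0::numeric AND auftragbestaetigt = true
UNION ALL  -- no deduplication, hence no Sort/Unique node in the plan
SELECT *, false AS storniert
  FROM auftragsposition
  WHERE mengestorniert < menge AND auftragbestaetigt = true;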

1 Answer:

Answer 0 (score: 2)

It is implemented (and documented, since the source code is often the only documentation) in src/backend/optimizer/path/costsize.c, in the function cost_sort(). The basic cost is roughly N * log(N) compare operations for an in-memory sort (a disk-based sort may be slower, and its cost is estimated accordingly).

This N * log(N) is expected: https://en.wikipedia.org/wiki/Sorting_algorithm#Efficient_sorts ("practical general sorting algorithms are almost always based on an algorithm with average time complexity ... O(n log n)"):

https://github.com/postgres/postgres/blob/REL9_6_STABLE/src/backend/optimizer/path/costsize.c#L1409

/*
 * cost_sort
 *    Determines and returns the cost of sorting a relation, including
 *    the cost of reading the input data.
 *
 * If the total volume of data to sort is less than sort_mem, we will do
 * an in-memory sort, which requires no I/O and about t*log2(t) tuple
 * comparisons for t tuples.
 *
 * If the total volume exceeds sort_mem, we switch to a tape-style merge
 * algorithm.  There will still be about t*log2(t) tuple comparisons in
 * total, but we will also need to write and read each tuple once per
 * merge pass.  We expect about ceil(logM(r)) merge passes where r is the
 * number of initial runs formed and M is the merge order used by tuplesort.c.
 * Since the average initial run should be about sort_mem, we have
 *      disk traffic = 2 * relsize * ceil(logM(p / sort_mem))
 *      cpu = comparison_cost * t * log2(t)
 *
 * If the sort is bounded (i.e., only the first k result tuples are needed)
 * and k tuples can fit into sort_mem, we use a heap method that keeps only
 * k tuples in the heap; this will require about t*log2(k) tuple comparisons.
 *
 * The disk traffic is assumed to be 3/4ths sequential and 1/4th random
 * accesses (XXX can't we refine that guess?)
 *
 * By default, we charge two operator evals per tuple comparison, which should
 * be in the right ballpark in most cases.  The caller can tweak this by
 * specifying nonzero comparison_cost; typically that's used for any extra
 * work that has to be done to prepare the inputs to the comparison operators.
 *
 * 'pathkeys' is a list of sort keys
 * 'input_cost' is the total cost for reading the input data
 * 'tuples' is the number of tuples in the relation
 * 'width' is the average tuple width in bytes
 * 'comparison_cost' is the extra cost per comparison, if any
 * 'sort_mem' is the number of kilobytes of work memory allowed for the sort
 * 'limit_tuples' is the bound on the number of output tuples; -1 if no bound
 *
 * NOTE: some callers currently pass NIL for pathkeys because they
 * can't conveniently supply the sort keys.  Since this routine doesn't
 * currently do anything with pathkeys anyway, that doesn't matter...
 * but if it ever does, it should react gracefully to lack of key data.
 * (Actually, the thing we'd most likely be interested in is just the number
 * of sort keys, which all callers *could* supply.)
 */
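
Plugging the plan's row estimate into that t * log2(t) term, with the default cpu_operator_cost of 0.0025 assumed (check yours with SHOW cpu_operator_cost), gives the CPU part of the Sort's startup cost:

-- comparison_cost defaults to 2 * cpu_operator_cost = 0.005,
-- and the Sort node above expects 361546 input tuples:
SELECT 2 * 0.0025 * 361546 * log(2.0, 361546.0) AS cpu_comparison_cost;
-- ≈ 33378.  The rest of the 487467.14 startup cost is the input cost
-- (108072.53 for the Append) plus the disk-traffic term below, whose
-- size depends on work_mem.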

Parts of the actual calculation follow: the disk-based, bounded heap-sort, and plain quicksort branches. Note there is no cost estimation for parallel sorts yet (https://wiki.postgresql.org/wiki/Parallel_Internal_Sort, https://wiki.postgresql.org/wiki/Parallel_External_Sort).

...
    path->rows = tuples;

    /*
     * We want to be sure the cost of a sort is never estimated as zero, even
     * if passed-in tuple count is zero.  Besides, mustn't do log(0)...
     */
    if (tuples < 2.0)
        tuples = 2.0;

    /* Include the default cost-per-comparison */
    comparison_cost += 2.0 * cpu_operator_cost;

.. 
    if (output_bytes > sort_mem_bytes)
    {
...
        /*
         * We'll have to use a disk-based sort of all the tuples
         */
        /*
         * CPU costs
         *
         * Assume about N log2 N comparisons
         */
        startup_cost += comparison_cost * tuples * LOG2(tuples);


        /* Disk costs */

        /* Compute logM(r) as log(r) / log(M) */
        if (nruns > mergeorder)
            log_runs = ceil(log(nruns) / log(mergeorder));
        else
            log_runs = 1.0;
        npageaccesses = 2.0 * npages * log_runs;
        /* Assume 3/4ths of accesses are sequential, 1/4th are not */
        startup_cost += npageaccesses *
            (seq_page_cost * 0.75 + random_page_cost * 0.25);
    }
    else if (tuples > 2 * output_tuples || input_bytes > sort_mem_bytes)
    {
        /*
         * We'll use a bounded heap-sort keeping just K tuples in memory, for
         * a total number of tuple comparisons of N log2 K; but the constant
         * factor is a bit higher than for quicksort.  Tweak it so that the
         * cost curve is continuous at the crossover point.
         */
        startup_cost += comparison_cost * tuples * LOG2(2.0 * output_tuples);
    }
    else
    {
        /* We'll use plain quicksort on all the input tuples */
        startup_cost += comparison_cost * tuples * LOG2(tuples);
    }

    /*
     * Also charge a small amount (arbitrarily set equal to operator cost) per
     * extracted tuple.  We don't charge cpu_tuple_cost because a Sort node
     * doesn't do qual-checking or projection, so it has less overhead than
     * most plan nodes.  Note it's correct to use tuples not output_tuples
     * here --- the upper LIMIT will pro-rate the run cost so we'd be double
     * counting the LIMIT otherwise.
     */
    run_cost += cpu_operator_cost * tuples;
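
That last per-tuple charge is visible directly in the plan above: with the default cpu_operator_cost of 0.0025 (assumed here), it accounts exactly for the gap between the Sort's total and startup costs:

-- run_cost = cpu_operator_cost * tuples
SELECT 0.0025 * 361546 AS sort_run_cost;
-- = 903.865, matching the plan: 488371.00 - 487467.14 = 903.86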