How can I optimize a LIMIT query to access data from a huge table faster?

Time: 2018-11-12 15:04:16

Tags: php mysql sql performance datatable

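I am trying to fetch data from a table that is over 9 GB in size and holds millions of records. I use that data to populate a DataTable. I fetch the records from the table in chunks, i.e. 10 records per page, through Ajax and a SQL LIMIT query.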

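[Image: pagination]

In the image above you can see that we have 223,740 pages, so when I try to open the last page, the query takes a long time to load the data. The first page loads quickly, but jumping directly to a page with a high offset takes a very long time.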

PHP function (query):

 public static function getAllEvaluationsWithNameForDataTable($start){
        $queryBuilder = new Builder();

        return  $queryBuilder
            ->from(array('e' =>  static::class))
            ->leftJoin('Cx\Framework\Models\Common\User\CxUser',  'e.cx_hc_user_id = u.id', 'u')
            ->columns('e.id, e.first_name, u.initials as assigned_coach, e.gender, e.email, e.phone, e.age, e.version, e.evaluation_status, e.ip_address, e.date_created, e.date_updated')
            ->orderBy('e.id asc')
            ->limit(10, $start)
            ->getQuery()
            ->execute()
            ->toArray();
}

PHP controller:

public function getEvaluationsAction() {
        // Enable Json response
        $this->setJsonResponse();
        // This action can be called only via ajax
        $this->requireAjax();

        // Forward to access denied if current user is not allowed to view evaluation details
        if (!$this->CxAuth->currentUserIsAllowedTo('VIEW', CxEbEvaluation::getClassResourceName()))
            return $this->forwardToAccessDeniedError();

        if(isset($_GET['start'])){
            $start = $this->request->get('start');
        }else{
            $start = 10;
        }

        $recordsTotal = count(CxEbEvaluation::getAllForDataTable(array('id')));

        //Get Evaluations from DB
        $evaluation_quizzes = CxEbEvaluation::getAllEvaluationsWithNameForDataTable(intval($start));

        //for getting base URL
        $url = new Url();

        $data = array();

        foreach ($evaluation_quizzes as $key => $quiz) {
            $data[ $key ][ 'id' ] = $quiz[ 'id' ];
            $data[ $key ][ 'first_name' ] = $quiz[ 'first_name' ];
            if($quiz[ 'assigned_coach' ]){
                $data[ $key ][ 'assigned_coach' ] = $quiz['assigned_coach'];
            }else{
                $data[ $key ][ 'assigned_coach' ] = "Not assigned";
            }

            $data[ $key ][ 'gender' ] = $quiz[ 'gender' ];
            $data[ $key ][ 'email' ] = $quiz[ 'email' ];
            $data[ $key ][ 'phone' ] = $quiz[ 'phone' ];
            $data[ $key ][ 'age' ] = $quiz[ 'age' ];
            $data[ $key ][ 'version' ] = $quiz[ 'version' ];
            $data[ $key ][ 'quiz' ] =  $url->get('/admin/get-evaluation-quiz-by-id');
            $data[ $key ][ 'manage-notes-messages-and-calls' ] =  $url->get('/admin/manage-notes-messages-and-calls');
            $data[ $key ][ 'date_created' ] = date("m/d/Y H:i:s", $quiz[ 'date_created' ]);
            $data[ $key ][ 'evaluation_status' ] = $quiz[ 'evaluation_status' ];
        }
        // Return data array
        return array(
            "recordsTotal"    => $recordsTotal,
            "recordsFiltered" => $recordsTotal ,
            "data"            => $data //How To Retrieve This Data
        );
        // Return data
    }

I hope the question is clear. If you need anything else, please let me know.

JavaScript (from the comments):

cx.common.data.cxAdminDataTables.EbEvaluation = $CxRecordsTable.cxAdminDataTable({
        ajaxUrl: '<?php echo $this->CxHelper->Route('eb-admin-get-evaluations')?>' + eqQuizIdQueryString,
        serverSide: true,
        processing: true,
        recordsFiltered :true,
        columns: [
            cx.common.admin.tableEditColumn('id',{ delete: true }),
            { data: 'first_name' },
            { data: 'assigned_coach' },
            { data: 'gender' },
            { data: 'email' },
            { data: 'phone' },
            { data: 'age' },
            cx.common.admin.tableLinkColumn('quiz', quizLinkOptions),
            cx.common.admin.tableEditColumn('id', healthCoachLinkOptions),
            cx.common.admin.tableLinkColumn('manage-notes-messages-and-calls', manageNotesMessagesAndCalls),
            { data: 'date_created' },
            cx.common.admin.tableSwitchableColumn('evaluation_status', {
                editable: true,
                createdCell: function (td, cellData, rowData, row, col){
                    $(td).data('evaluation-status-id', rowData.id);
                },
                onText: 'Complete',
                offText: 'In progress'
            })
        ],
        toolbarOptions:{
            enabled: false
        },          success: function (data) {
                            cx.common.data.cxAdminDataTables.EbEvaluation.cxAdminDataTable("reloadAjax");
                        }
                    });
                }
                else {
                    $row.removeClass('alert');
                }
            });
        }
    });

3 Answers:

Answer 0 (score: 4)

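The question Why does MYSQL higher LIMIT offset slow the query down?, linked by Masivuye Cokile, and its answers, together with the https://explainextended.com/2009/10/23/mysql-order-by-limit-performance-late-row-lookups/ link given there, contain a good summary of why large-offset queries are slow. Basically, for LIMIT 150000, 10, MySQL still scans all 150,000 rows, even though it throws them away afterwards. To speed things up, you can: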

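  • Use sequential pagination, i.e. "show 10 entries after ID #N". This runs very fast and is a good option, but it gives up real page numbers; your users get next/previous links and/or approximate page numbers that you can compute with a count query.
  • Or create an index on id and then force MySQL to do an index-only seek.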
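
The first option (sequential, or "keyset", pagination) could look roughly like the sketch below. It is an illustration only, not the asker's code: it assumes a PDO connection, a hypothetical table named evaluations with an indexed integer id, and a hypothetical helper name fetchPageAfter.

```php
<?php
declare(strict_types=1);

/**
 * Fetch the next page of rows whose id is greater than $lastSeenId.
 * Hypothetical helper: the table and column names are assumptions.
 */
function fetchPageAfter(PDO $pdo, int $lastSeenId, int $pageSize = 10): array
{
    // The WHERE clause lets the database seek to the start of the page
    // through the index on id instead of reading and discarding rows.
    $stmt = $pdo->prepare(
        'SELECT id, first_name, email
           FROM evaluations
          WHERE id > :last_id
          ORDER BY id ASC
          LIMIT :page_size'
    );
    // Bind as integers so the LIMIT value is not sent as a quoted string.
    $stmt->bindValue(':last_id', $lastSeenId, PDO::PARAM_INT);
    $stmt->bindValue(':page_size', $pageSize, PDO::PARAM_INT);
    $stmt->execute();

    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

The client would send the last id it displayed (0 for the first page), and the id of the final row in the returned array becomes the cursor for the next request.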

For the second approach, you have to rewrite your query from

SELECT ... 
  FROM table t 
WHERE ...
ORDER BY t.id ASC
LIMIT 150000, 10

to

SELECT  ...
  FROM (
        SELECT  id
        FROM    table
        ORDER BY
                id ASC
        LIMIT 150000, 10
        ) o
JOIN table t
  ON t.id = o.id
WHERE ...
ORDER BY t.id ASC
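
As a rough PHP sketch of this rewritten query, under the same kind of assumptions (a PDO connection, a hypothetical evaluations table with an indexed id, and a hypothetical helper name fetchPageByOffset):

```php
<?php
declare(strict_types=1);

/**
 * Fetch one page by offset using a late row lookup: the subquery pages
 * through ids only, and the join reads full rows for just that page.
 * Hypothetical helper: the table and column names are assumptions.
 */
function fetchPageByOffset(PDO $pdo, int $offset, int $pageSize = 10): array
{
    $stmt = $pdo->prepare(
        'SELECT t.id, t.first_name, t.email
           FROM (
                 SELECT id
                   FROM evaluations
                  ORDER BY id ASC
                  LIMIT :offset, :page_size
                ) o
           JOIN evaluations t ON t.id = o.id
          ORDER BY t.id ASC'
    );
    // Bind as integers so LIMIT receives numbers, not quoted strings.
    $stmt->bindValue(':offset', $offset, PDO::PARAM_INT);
    $stmt->bindValue(':page_size', $pageSize, PDO::PARAM_INT);
    $stmt->execute();

    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

The subquery still has to walk past the skipped index entries, so this lowers the cost per skipped row rather than removing it; real page numbers keep working, which is the reason to prefer it over sequential pagination.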

Alternatively, since you are not limited to a single query, you can first retrieve the ID of the first item on the page:

SELECT id 
  FROM table 
 ORDER BY id ASC 
 LIMIT 150000, 1

Then use that ID to retrieve the actual data:

SELECT ...
  FROM table
 WHERE id >= $id
   AND ...
 ORDER BY id ASC
 LIMIT 0, 10
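
The two queries above might be combined in PHP along these lines. This is a sketch under assumptions: a PDO connection, a hypothetical evaluations table with an indexed id, and a hypothetical helper name fetchPageTwoStep. It also handles the case where the offset lies beyond the last row, which the SQL alone does not cover.

```php
<?php
declare(strict_types=1);

/**
 * Two-step paging: find the id that starts the page, then read the page.
 * Hypothetical helper: the table and column names are assumptions.
 */
function fetchPageTwoStep(PDO $pdo, int $offset, int $pageSize = 10): array
{
    // Step 1: id-only query to locate the first row of the page.
    $first = $pdo->prepare(
        'SELECT id FROM evaluations ORDER BY id ASC LIMIT :offset, 1'
    );
    $first->bindValue(':offset', $offset, PDO::PARAM_INT);
    $first->execute();
    $firstId = $first->fetchColumn();
    if ($firstId === false) {
        return []; // the offset is past the last row
    }

    // Step 2: read the full rows starting at that id.
    $page = $pdo->prepare(
        'SELECT id, first_name, email
           FROM evaluations
          WHERE id >= :first_id
          ORDER BY id ASC
          LIMIT :page_size'
    );
    $page->bindValue(':first_id', (int) $firstId, PDO::PARAM_INT);
    $page->bindValue(':page_size', $pageSize, PDO::PARAM_INT);
    $page->execute();

    return $page->fetchAll(PDO::FETCH_ASSOC);
}
```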

Answer 1 (score: 2)

The pattern SELECT whatever FROM vast_table ORDER BY something LIMIT large_number, 10 is a notorious performance antipattern. Why? Because it has to examine a great many rows to return just a few.

If your id values are the primary key (or any indexed column), you can paginate with

SELECT whatever FROM vast_table WHERE id BETWEEN large_value AND large_value+9 ORDER BY id;

Or you can try

SELECT whatever FROM vast_table WHERE id >= large_value ORDER BY id LIMIT 10;

This won't paginate perfectly if there are gaps in your id values. But it will perform acceptably.
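
To make the effect of gaps concrete, here is a small PHP sketch of both variants. It assumes a PDO connection and the vast_table name used above; the helper names fetchPageBetween and fetchPageFrom are hypothetical.

```php
<?php
declare(strict_types=1);

/** Range-based page: returns fewer rows when ids in the range are missing. */
function fetchPageBetween(PDO $pdo, int $firstId, int $pageSize = 10): array
{
    $stmt = $pdo->prepare(
        'SELECT id FROM vast_table WHERE id BETWEEN :lo AND :hi ORDER BY id'
    );
    $stmt->bindValue(':lo', $firstId, PDO::PARAM_INT);
    $stmt->bindValue(':hi', $firstId + $pageSize - 1, PDO::PARAM_INT);
    $stmt->execute();

    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

/** Seek-based page: fills the page, but pages no longer map to fixed id ranges. */
function fetchPageFrom(PDO $pdo, int $firstId, int $pageSize = 10): array
{
    $stmt = $pdo->prepare(
        'SELECT id FROM vast_table WHERE id >= :lo ORDER BY id LIMIT :page_size'
    );
    $stmt->bindValue(':lo', $firstId, PDO::PARAM_INT);
    $stmt->bindValue(':page_size', $pageSize, PDO::PARAM_INT);
    $stmt->execute();

    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}
```

For instance, in a table holding ids 1 to 30 with 13 and 17 deleted, fetchPageBetween($pdo, 11) would return 8 ids (11 through 20 minus the two gaps), while fetchPageFrom($pdo, 11) would return 10 ids ending at 22.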

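Answer 2 (score: 1)

The problem was related to the data type of the date columns in my table. I was using the int data type for the date fields, and once I changed the date columns to datetime, the search results came back in seconds.

I found the source of the solution at http://dbscience.blogspot.com/2008/08/can-timestamp-be-slower-than-datetime.html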