Presto query is too slow; how can I speed it up?

Time: 2021-02-26 09:38:36

Tags: presto

My code:

SELECT * FROM (SELECT ROW_NUMBER() over(ORDER BY sn desc,sn ) as Row, 
sn as "sn",rsl_name as "rslName",mec_name as "mecName",app_name as "appName",app_ver as "appVer" 
FROM 
hive.bps.parameter_info) T 
where T.Row between 1 and 20;

The statement takes more than 10 seconds, and I want to get the data within 2 seconds. Here is the analysis:

presto> explain analyze  SELECT * FROM (SELECT ROW_NUMBER() over(ORDER BY sn desc,sn ) as Row, sn as "sn",rsl_name as "rslName",mec_name as "mecName",app_name as "appName",app_ver as "appVer" FROM hive.bps.parameter_info) T where T.Row between 1 and 20;
                                                                                                                                                         Query Plan
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Fragment 1 [SINGLE]
     CPU: 92.20ms, Scheduled: 211.99ms, Input: 21980 rows (2.16MB); per task: avg.: 21980.00 std.dev.: 0.00, Output: 20 rows (2.07kB)
     Output layout: [rsl_name, mec_name, app_name, app_ver, sn, row_number]
     Output partitioning: SINGLE []
     Stage Execution Strategy: UNGROUPED_EXECUTION
     - Filter[filterPredicate = row_number BETWEEN (BIGINT 1) AND (BIGINT 20)] => [rsl_name:varchar, mec_name:varchar, app_name:varchar, app_ver:varchar, sn:varchar, row_number:bigint]
             CPU: 0.00ns (0.00%), Scheduled: 1.00ms (0.00%), Output: 20 rows (2.07kB)
             Input avg.: 1.25 rows, Input std.dev.: 387.30%
         - LocalExchange[ROUND_ROBIN] () => [rsl_name:varchar, mec_name:varchar, app_name:varchar, app_ver:varchar, sn:varchar, row_number:bigint]
                 CPU: 0.00ns (0.00%), Scheduled: 0.00ns (0.00%), Output: 20 rows (2.07kB)
                 Input avg.: 20.00 rows, Input std.dev.: 0.00%
             - TopNRowNumber[partition by (), order by (sn DESC_NULLS_LAST) limit 20] => [rsl_name:varchar, mec_name:varchar, app_name:varchar, app_ver:varchar, sn:varchar, row_number:bigint]
                     CPU: 14.00ms (0.05%), Scheduled: 49.00ms (0.06%), Output: 20 rows (2.07kB)
                     Input avg.: 21980.00 rows, Input std.dev.: 0.00%
                     row_number := row_number()
                 - LocalExchange[SINGLE] () => [rsl_name:varchar, mec_name:varchar, app_name:varchar, app_ver:varchar, sn:varchar]
                         CPU: 17.00ms (0.06%), Scheduled: 71.00ms (0.09%), Output: 21980 rows (2.16MB)
                         Input avg.: 1373.75 rows, Input std.dev.: 329.25%
                     - RemoteSource[2] => [rsl_name:varchar, mec_name:varchar, app_name:varchar, app_ver:varchar, sn:varchar]
                             CPU: 22.00ms (0.08%), Scheduled: 29.00ms (0.04%), Output: 21980 rows (2.16MB)
                             Input avg.: 1373.75 rows, Input std.dev.: 329.25%

 
 
 
 Fragment 2 [SOURCE]
     CPU: 27.16s, Scheduled: 1.19m, Input: 829722 rows (54.99MB); per task: avg.: 414861.00 std.dev.: 867.00, Output: 21980 rows (2.16MB)
     Output layout: [rsl_name, mec_name, app_name, app_ver, sn]
     Output partitioning: SINGLE []
     Stage Execution Strategy: UNGROUPED_EXECUTION
     - TopNRowNumber[partition by (), order by (sn DESC_NULLS_LAST) limit 20] => [rsl_name:varchar, mec_name:varchar, app_name:varchar, app_ver:varchar, sn:varchar]
             CPU: 681.00ms (2.51%), Scheduled: 1.25s (1.51%), Output: 21980 rows (2.16MB)
             Input avg.: 754.98 rows, Input std.dev.: 3.78%
             row_number := row_number()
         - TableScan[TableHandle {connectorId='hive', connectorHandle='HiveTableHandle{schemaName=bps, tableName=parameter_info, analyzePartitionValues=Optional.empty}', layout='Optional[bps.parameter_info{}]'}, gro
                 CPU: 26.45s (97.30%), Scheduled: 1.35m (98.30%), Output: 829722 rows (54.99MB)
                 Input avg.: 754.98 rows, Input std.dev.: 3.78%
                 LAYOUT: bps.parameter_info{}
                 rsl_name := rsl_name:string:2:REGULAR
                 mec_name := mec_name:string:3:REGULAR
                 sn := sn:string:8:REGULAR
                 app_name := app_name:string:4:REGULAR
                 app_ver := app_ver:string:5:REGULAR
                 Input: 829722 rows (54.99MB), Filtered: 0.00%
                 
                 

Other details: the table has 829722 rows and 6000 columns, and it is stored as ORC.

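When I extract a few important columns into a new table and query that table with Presto, it is faster. Isn't ORC's columnar storage supposed to make this unnecessary?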
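
A minimal sketch of the narrow-table workaround described above, assuming the Presto Hive connector. The table name `parameter_info_narrow` is hypothetical and used only for illustration; the column list is taken from the query in this question.

```sql
-- Hypothetical narrow copy of the wide table: only the five columns
-- the paging query needs, written as ORC through the Hive connector.
CREATE TABLE hive.bps.parameter_info_narrow
WITH (format = 'ORC')
AS
SELECT sn, rsl_name, mec_name, app_name, app_ver
FROM hive.bps.parameter_info;

-- The same paging query, pointed at the narrow copy.
SELECT *
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY sn DESC) AS Row,
           sn       AS "sn",
           rsl_name AS "rslName",
           mec_name AS "mecName",
           app_name AS "appName",
           app_ver  AS "appVer"
    FROM hive.bps.parameter_info_narrow
) T
WHERE T.Row BETWEEN 1 AND 20;
```

The copy has to be rebuilt or refreshed whenever the source table changes, so this only illustrates the comparison and is not a fix for the wide table itself.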

What can I do to speed this up?

0 answers:

No answers yet