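For one of the tables in svv_table_info, the stats_off column has a value of 99%. What does this mean, and how do I resolve it?
I have tried ANALYZE on this table and looked at its vacuum history. Do ANALYZE and VACUUM have any effect on this column's value?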
Answer 0 (score: 6)
The VACUUM command reviews the table and rearranges the data on disk as needed, which affects the unsorted and empty columns. The closer to 0, the better.
The ANALYZE command reviews the table and recalculates the relevant statistics, which affects the stats_off column. The closer to 0, the better.
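Even after you run ANALYZE, the value may not change much. To get the lowest possible value, run VACUUM first. A table's statistics include old records that have been deleted; in Redshift these are simply skipped, but they still affect overall query performance. By running VACUUM on the table first, you give ANALYZE the best view of the usable data.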
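As a minimal sketch of that order of operations, assuming a hypothetical table named public.my_table (substitute your own schema and table), the sequence might look like this. Note that VACUUM cannot run inside a transaction block, so run each statement on its own.

```sql
-- public.my_table is a placeholder name, not taken from the question.

-- 1. Reclaim space from deleted rows and re-sort the data.
VACUUM FULL public.my_table;

-- 2. Recalculate the planner statistics on the cleaned-up table.
ANALYZE public.my_table;

-- 3. Check whether stats_off, unsorted, and empty moved closer to 0.
SELECT "schema", "table", stats_off, unsorted, empty
FROM svv_table_info
WHERE "schema" = 'public'
  AND "table" = 'my_table';
```

If stats_off is still high after this, the planner alerts described next are a better guide to whether it matters in practice.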
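Just because a table's statistics are stale does not necessarily mean they are causing a problem. What you need to look for are alerts from the query planner that complain about the table's statistics. You typically see these complaints when performing table joins. The following query checks whether any such complaints were logged in the last day and produces a list of commands to run if needed...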
SELECT DISTINCT 'ANALYZE ' + feedback_tbl.schema_name + '.' + feedback_tbl.table_name + ';' AS command
FROM ((SELECT
         TRIM(n.nspname) schema_name,
         c.relname table_name
       FROM (SELECT
               TRIM(SPLIT_PART(SPLIT_PART(a.plannode, ':', 2), ' ', 2)) AS Table_Name,
               COUNT(a.query),
               DENSE_RANK()
               OVER (
                 ORDER BY COUNT(a.query) DESC) AS qry_rnk
             FROM stl_explain a,
               stl_query b
             WHERE a.query = b.query
                   AND CAST(b.starttime AS DATE) >= dateadd(DAY, -1, CURRENT_DATE)
                   AND a.userid > 1
                   AND a.plannode LIKE '%%missing statistics%%'
                   AND a.plannode NOT LIKE '%%_bkp_%%'
             GROUP BY Table_Name) miss_tbl
         LEFT JOIN pg_class c ON c.relname = TRIM(miss_tbl.table_name)
         LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
       WHERE miss_tbl.qry_rnk <= 25)
      -- Get the top N rank tables based on the stl_alert_event_log alerts
      UNION
      SELECT
        schema_name,
        table_name
      FROM (SELECT
              TRIM(n.nspname) schema_name,
              c.relname table_name,
              DENSE_RANK()
              OVER (
                ORDER BY COUNT(*) DESC) AS qry_rnk,
              COUNT(*)
            FROM stl_alert_event_log AS l
              JOIN (SELECT
                      query,
                      tbl,
                      perm_table_name
                    FROM stl_scan
                    WHERE perm_table_name <> 'Internal Worktable'
                    GROUP BY query,
                      tbl,
                      perm_table_name) AS s ON s.query = l.query
              JOIN pg_class c ON c.oid = s.tbl
              JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
            WHERE l.userid > 1
                  AND l.event_time >= dateadd(DAY, -1, CURRENT_DATE)
                  AND l.Solution LIKE '%%ANALYZE command%%'
            GROUP BY TRIM(n.nspname),
              c.relname) anlyz_tbl
      WHERE anlyz_tbl.qry_rnk < 25) feedback_tbl
  JOIN svv_table_info info_tbl
    ON info_tbl.schema = feedback_tbl.schema_name
       AND info_tbl.table = feedback_tbl.table_name
WHERE info_tbl.stats_off :: DECIMAL(32, 4) > 10 :: DECIMAL(32, 4)
      AND TRIM(info_tbl.schema) = 'public'
ORDER BY info_tbl.size ASC;
While we're at it, this query finds the tables that need a VACUUM...
SELECT 'VACUUM FULL ' + "schema" + '.' + "table" + ';' AS command
FROM svv_table_info
WHERE (unsorted > 5 OR empty > 5)
AND size < 716800;
These queries use the recommended thresholds defined by Amazon and come from their public Python scripts for managing Redshift clusters, located here.