我有一个表,该表具有一个称为ID的Identity列和另一个引用另一个表的名为DateID的列。
日期列用于联接,但ID列具有更多基数。
“ ID”列的唯一计数:657167 DateID列的唯一计数:350
任何人都可以提供有关哪个列将是分发密钥更好选择的见解吗?
*还涉及另一个问题: 我在选择表中的sort和dist键时遇到了一个难题。 排序键 选择排序键时应该考虑基数吗?
另一个与这个旧问题相关的问题合并了。
P.S。我读了一些有关选择dist键的文章,他们说我应该使用与其他表结合使用并且具有更大基数的列。
SELECT SP.*,
CP.*,
TV.*
FROM
(
SELECT * --> there are about 20 aggregation statements in the select statement
FROM FactCustomer f -- contains about 600K records
JOIN DimDate d -- contains about 700 records
ON f.DateID = d.DateID
JOIN DimTime t -- contains 24 records
ON f.TimeID = t.HourID
JOIN DimSalesBranch s -- contains about 64K records
ON f.BranchID = s.BranchID
WHERE s.BranchID IN ( 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 )
AND d.DateTimeInfo >= (CASE
WHEN s.OpeningDate > '2018-01-01' THEN
s.OpeningDate
ELSE
'2018-01-01'
END
)
AND d.DateTimeInfo <= '2018-12-31'
AND StartHour >= 9
AND starthour > 0
AND (EndHour <= 22)
) SP
LEFT JOIN
(
SELECT * --> there are about 20 aggregation statements in the select statement
FROM FactCustomer f
JOIN DimDate d
ON f.DateID = d.DateID
JOIN DimTime t
ON f.TimeID = t.HourID
JOIN DimSalesBranch s
ON f.BranchID = s.BranchID
WHERE s.BranchID IN ( 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 )
AND d.DateTimeInfo >= (CASE
WHEN s.OpeningDate > '2018-01-01' THEN
s.OpeningDate
ELSE
'2018-01-01'
END
)
AND d.DateTimeInfo <= '2018-09-16'
AND StartHour >= 9
AND (EndHour <= 22)
) CP
ON SP.StartDate = CP.StartDate_CP
AND SP.EndDate = CP.EndDate_CP
LEFT JOIN
(
SELECT * --> there are about 6 aggregation statements in the select statement
FROM FactSalesTargetBranch f
JOIN DimDate d
ON f.DateID = d.DateID
JOIN DimSalesBranch s
ON f.BranchID = s.BranchID
WHERE s.BranchID IN ( 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 )
AND d.DateTimeInfo >= (CASE
WHEN s.OpeningDate > '2018-01-01' THEN
s.OpeningDate
ELSE
'2018-01-01'
END
)
AND d.DateTimeInfo <= '2018-09-16'
) TV
ON SP.StartDate = TV.StartDate_TV
AND SP.EndDate = TV.EndDate_TV;
任何见解都值得赞赏。
致谢。
答案 0 :(得分:0)
在这种情况下
通常,“偶数”分布是一个不错的选择,除非您需要将大型表连接在一起,否则它会给您带来最佳效果。
请参阅https://docs.aws.amazon.com/redshift/latest/dg/c_choosing_dist_sort.html