Question

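Let's say I have three tables: t1 (a fact table with about 1 billion rows), t2 (an empty table, 0 rows), and t0 (a dimension table). All of them have properly collected statistics. In addition, there is a view v0: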

REPLACE VIEW v0
AS SELECT * from t1
   union
   SELECT * from t2;

Let's look at these three queries:

1) Select * from t1 inner join t0 on t1.id = t0.id; -- Optimizer correctly estimates 1 bln rows

2) Select * from t2 inner join t0 on t2.id = t0.id; -- Optimizer correctly estimates 0 rows

3) Select * from v0 inner join t0 on v0.id = t0.id;  -- Optimizer locks t1 and t2 for read, then correctly estimates that it will get 1 bln rows from t1, but for no clear reason estimates the same 1 bln rows from table t2.

What is going on here? Is it a bug or a feature?

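PS. The original query, which is too big to show here, did not finish in 35 minutes. After leaving only t1, it finished successfully in 15 minutes.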

TD release: 15.10.03.07
TD version: 15.10.03.09

Answer 1

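That is not the estimate for the second SELECT on its own. It is the total number of rows accumulated after the second SELECT, i.e. 1 billion plus 0.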

Your query runs slowly because you used UNION, which defaults to DISTINCT, and running that on a billion rows is really expensive.

Better switch to UNION ALL.
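
For what it's worth, here is a minimal sketch of that change, reusing the view and table names from the question. It assumes nothing downstream relies on the view removing duplicate rows, since UNION ALL keeps them; that is an assumption, not something the question confirms. The EXPLAIN at the end is only a way to re-check the optimizer's estimates after the change.

```sql
-- Same view as in the question, but with UNION ALL instead of UNION,
-- so no DISTINCT step is run over the combined rows.
-- Assumption: duplicate rows across t1 and t2 are acceptable (or cannot occur).
REPLACE VIEW v0
AS SELECT * FROM t1
   UNION ALL
   SELECT * FROM t2;

-- Re-check the plan and row estimates for query 3 from the question.
EXPLAIN
SELECT * FROM v0 INNER JOIN t0 ON v0.id = t0.id;
```

If the view really does need to return distinct rows, this rewrite changes its result, so compare the output on a small sample first.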

Teradata optimizer incorrectly estimates the number of rows when using UNION
